Lustre / LU-15819

Executables run from Lustre may receive spurious SIGBUS signals


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • Upstream
    • None
    • 3

      We received several reports of applications (IOR and other unrelated user-provided programs) launched from Lustre receiving SIGBUS signals.

      We were able to reproduce the issue with IOR on a client running a RHEL7 kernel.

      It appears to be caused by LU-14541. The mechanism is as follows:

      1) A major page fault occurs in the IOR code.

      2) The fault is handled through ll_fault() > ... > filemap_fault().

      3) filemap_fault() calls ll_readpage().

      4) filemap_fault() then calls wait_on_page_locked().

      5) The uptodate check in filemap_fault() fails because a blocking AST handler called ClearPageUptodate() in parallel.

      6) filemap_fault() returns VM_FAULT_SIGBUS.
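
      The sequence above can be illustrated with a small user-space model. This is only a sketch of the ordering problem, not kernel or Lustre code: the Page class, simulate_fault() and the ast_races flag are hypothetical names introduced for illustration, and the read is assumed to complete and unlock the page before the wait returns.

```python
# Simplified user-space model of the race described above.
# NOT kernel code: every name here is a hypothetical stand-in.
from dataclasses import dataclass

VM_FAULT_SIGBUS = "VM_FAULT_SIGBUS"
FAULT_OK = "OK"


@dataclass
class Page:
    uptodate: bool = False
    locked: bool = True


def readpage(page: Page) -> None:
    """Stand-in for ll_readpage(): the read completes, the page is
    marked up to date and then unlocked."""
    page.uptodate = True
    page.locked = False


def blocking_ast(page: Page) -> None:
    """Stand-in for the blocking AST handler, which clears the
    uptodate flag (ClearPageUptodate() in the report)."""
    page.uptodate = False


def simulate_fault(ast_races: bool) -> str:
    """Model steps 3-6: read, wait for unlock, check uptodate."""
    page = Page()
    readpage(page)                # step 3
    # Step 4: wait_on_page_locked() would return here, because the
    # page is already unlocked. Nothing protects the flag afterwards.
    if ast_races:
        blocking_ast(page)        # step 5: AST runs inside the window
    if not page.uptodate:
        return VM_FAULT_SIGBUS    # step 6
    return FAULT_OK
```

      Calling simulate_fault(True) models the failing interleaving, while simulate_fault(False) models a fault with no concurrent lock cancellation. The point of the sketch is that the page is no longer locked when the uptodate flag is checked, so a concurrent clear of the flag is indistinguishable from a failed read.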

            panda Andrew Perepechko