Lustre / LU-15819

Executables run from Lustre may receive spurious SIGBUS signals

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Upstream
    • None
    • 3

    Description

      We received several reports of applications (IOR and other, unrelated user-provided programs) that were started from Lustre and received SIGBUS signals.

      We were able to reproduce the issue with IOR on a client running a RHEL7 kernel.

      The issue appears to be caused by LU-14541. The mechanism is as follows:

      1) a major page fault occurs in the IOR code

      2) the fault is handled through ll_fault() > ... > filemap_fault()

      3) filemap_fault() issues ll_readpage()

      4) filemap_fault() then calls wait_on_page_locked()

      5) the uptodate check in filemap_fault() fails because a blocking AST handler has concurrently called ClearPageUptodate()

      6) VM_FAULT_SIGBUS is returned, and the process receives SIGBUS
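
      The interleaving described in the steps above can be illustrated with a small userspace model. This is only a sketch, not kernel or Lustre code: struct toy_page, toy_readpage(), toy_blocking_ast(), toy_filemap_fault() and the FAULT_* values are hypothetical names invented for the illustration, and the race is replayed deterministically through the ast_races parameter instead of using real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified fault results; the values are arbitrary, not the kernel's. */
enum fault_result { FAULT_OK = 0, FAULT_SIGBUS = 1 };

/* Toy stand-in for struct page: only the two flags relevant to the race. */
struct toy_page {
    bool locked;
    bool uptodate;
};

/* Models step 3: the read completes, the page becomes uptodate and is unlocked. */
static void toy_readpage(struct toy_page *pg)
{
    pg->locked = true;
    pg->uptodate = true;
    pg->locked = false;
}

/* Models the blocking AST handler calling ClearPageUptodate(). */
static void toy_blocking_ast(struct toy_page *pg)
{
    pg->uptodate = false;
}

/* Models steps 3-6 as seen from the fault handler. */
static enum fault_result toy_filemap_fault(struct toy_page *pg, bool ast_races)
{
    toy_readpage(pg);

    /* Step 4: waiting on the page lock ends once the page is unlocked. */
    assert(!pg->locked);

    /* Race window: the AST handler runs after the wait, before the check. */
    if (ast_races)
        toy_blocking_ast(pg);

    /* Step 5: the uptodate check; step 6: failure is reported as SIGBUS. */
    if (!pg->uptodate)
        return FAULT_SIGBUS;
    return FAULT_OK;
}
```

      In this model, toy_filemap_fault(&pg, false) returns FAULT_OK, while toy_filemap_fault(&pg, true) returns FAULT_SIGBUS even though the read itself succeeded. That is the spurious part of the signal: the data was read correctly and was only invalidated afterwards.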

            People

              panda Andrew Perepechko