Details
Bug
Resolution: Duplicate
Blocker
None
Upstream
None
3
Description
We received several reports of applications (IOR and other, unrelated user-provided programs) started from Lustre receiving SIGBUS signals.
We were able to reproduce the issue with IOR on a client running a RHEL7 kernel.
It appears to be caused by LU-14541, and the sequence of events is as follows:
1) A major page fault occurs in the IOR code.
2) ll_fault() > ... > filemap_fault() is called.
3) filemap_fault() issues ll_readpage().
4) filemap_fault() then calls wait_on_page_locked().
5) The uptodate check in filemap_fault() fails because a blocking AST handler calls ClearPageUptodate() in parallel.
6) VM_FAULT_SIGBUS is returned, which delivers SIGBUS to the application.
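
The sequence above can be illustrated with a small user-space model. This is only a sketch of the ordering, not kernel or Lustre code: struct model_page, model_readpage(), model_blocking_ast() and model_filemap_fault() are hypothetical names chosen to mirror the functions in the list, and the "race" is forced deterministically by a flag rather than by a real concurrent thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the steps above; not kernel code. */
enum fault_result { FAULT_OK, FAULT_SIGBUS };

struct model_page {
    bool locked;    /* stands in for PG_locked */
    bool uptodate;  /* stands in for PG_uptodate */
};

/* Models ll_readpage(): the read completes, the page becomes
 * uptodate and is unlocked. */
static void model_readpage(struct model_page *pg)
{
    pg->locked = true;
    pg->uptodate = true;   /* data arrived */
    pg->locked = false;    /* I/O completion unlocks the page */
}

/* Models the blocking AST handler calling ClearPageUptodate(). */
static void model_blocking_ast(struct model_page *pg)
{
    pg->uptodate = false;
}

/* Models the tail of filemap_fault() for a major fault.
 * ast_races selects whether the blocking AST fires between the
 * page being unlocked and the uptodate check. */
enum fault_result model_filemap_fault(struct model_page *pg, bool ast_races)
{
    assert(pg != NULL);

    model_readpage(pg);            /* step 3 */

    /* step 4: wait_on_page_locked() returns once the page is unlocked */
    assert(!pg->locked);

    if (ast_races)
        model_blocking_ast(pg);    /* step 5: parallel ClearPageUptodate() */

    if (!pg->uptodate)
        return FAULT_SIGBUS;       /* step 6 */

    return FAULT_OK;
}
```

Calling model_filemap_fault() with ast_races set to false yields FAULT_OK, while setting it to true yields FAULT_SIGBUS even though the read itself succeeded. That is the point of the model: the failure comes from the page losing its uptodate flag after the I/O completed, not from an I/O error.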