Details
- Bug
- Resolution: Fixed
- Minor
- None
- Lustre 2.5.3
- 3
Description
On all of our filesystems, the following error message is extremely common:
LustreError: 8746:0:(ost_handler.c:1776:ost_blocking_ast()) Error -2 syncing data on lock cancel
There is nothing else in the logs that gives any hint as to why this message is appearing.
Our filesystems all use osd-zfs, and we are currently running Lustre 2.5.3-5chaos (see github.com/chaos/lustre).
If this is a symptom of a bug, then please fix it. If this is not a symptom of a bug, then please stop scaring our system administrators with this message.
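For reference, the -2 in the message is a negated Linux errno, and errno 2 is ENOENT ("No such file or directory"). Below is a minimal sketch for pulling the source location and return code out of such a console line. The helper name `decode` and the regular expression are illustrative assumptions, not part of Lustre or any Lustre tool.

```python
import errno
import re

# Hypothetical helper (not part of Lustre): match the
# "(file.c:line:func()) Error -N " portion of a LustreError console line.
LINE_RE = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) Error (?P<rc>-?\d+) "
)

def decode(line):
    """Return the source location and errno name for a matching line, else None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "func": m.group("func"),
        "rc": rc,
        # Kernel code returns negated errno values, so look up abs(rc).
        "errno_name": errno.errorcode.get(abs(rc), "UNKNOWN"),
    }

sample = ("LustreError: 8746:0:(ost_handler.c:1776:ost_blocking_ast()) "
          "Error -2 syncing data on lock cancel")
info = decode(sample)
print(info)
```

Grouping the decoded results by `func` and `errno_name` would be one way to tell how much of the log volume comes from this single message.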
Issue Links
- is duplicated by
  - LU-7007 (ost_handler.c:1779:ost_blocking_ast()) Error -2 syncing data on lock cancel (Resolved)
- is related to
  - LU-7308 LustreError: 16956:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel (Resolved)
  - LU-5805 tgt_recov blocked and "waking for gap in transno" (Resolved)
Just as I was reviewing this patch (the comments are in the patch), I remembered that LLNL had suspected doubly referenced objects in the past (LU-5648), where the same object was potentially referenced twice (that was never confirmed, though). Having objects owned by two files would most likely lead to this message too.
I wonder if lfsck in 2.5.4 is already in good enough shape to detect that.