Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.4.1, Lustre 2.5.0
-
Lustre [2.4.0-RC1_3chaos|http://github.com/chaos/lustre/tree/2.4.0-RC1_3chaos], PPC64 client
-
3
-
9393
Description
We are hitting the LBUG in osc_ldlm_blocking_ast() that says "This should never happen". The backtrace looks like:
PID: 29660  TASK: c00000098d26bab0  CPU: 42  COMMAND: "ldlm_bl_44"
 #0 [c000000dc9f3f7c0] .crash_kexec at c0000000000e5aa4
 #1 [c000000dc9f3f9c0] .panic at c0000000005c4f40
 #2 [c000000dc9f3fa50] .lbug_with_loc at d00000000acd14e0 [libcfs]
 #3 [c000000dc9f3fae0] .osc_ldlm_blocking_ast at d00000000afbe1d8 [osc]
 #4 [c000000dc9f3fbc0] .ldlm_cancel_callback at d00000000bf3148c [ptlrpc]
 #5 [c000000dc9f3fc60] .ldlm_cli_cancel_local at d00000000bf4db78 [ptlrpc]
 #6 [c000000dc9f3fd20] .ldlm_cli_cancel_list_local at d00000000bf4f80c [ptlrpc]
 #7 [c000000dc9f3fe40] .ldlm_bl_thread_main at d00000000bf5c07c [ptlrpc]
 #8 [c000000dc9f3ff90] .kernel_thread at c000000000032fd4
Both times we have seen this (that I am aware of), the LBUG was preceded shortly by a kernel page allocation error, so that may be part of the recipe for triggering this bug.
This is a PPC64 client running Lustre 2.4.0-RC1_3chaos.