Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.4.0
-
None
-
3
-
6625
Description
Hit this while running racer, but I suspect it will also hit in other workloads whenever hpreqs (high-priority requests) are being processed.
[ 3577.720282] LustreError: 12694:0:(service.c:1512:ptlrpc_server_hpreq_init()) ASSERTION( rc == 0 || rc == 1 ) failed:
[ 3577.720820] LustreError: 12694:0:(service.c:1512:ptlrpc_server_hpreq_init()) LBUG
[ 3577.721172] Pid: 12694, comm: ll_ost_io00_007
[ 3577.721320]
[ 3577.721320] Call Trace:
[ 3577.721675] [<ffffffffa0449915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 3577.721976] [<ffffffffa0449f17>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 3577.722287] [<ffffffffa079dc9a>] ptlrpc_main+0x15ca/0x17f0 [ptlrpc]
[ 3577.722591] [<ffffffffa079c6d0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
[ 3577.722882] [<ffffffff8100c14a>] child_rip+0xa/0x20
[ 3577.723169] [<ffffffffa079c6d0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
[ 3577.723473] [<ffffffffa079c6d0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
[ 3577.723758] [<ffffffff8100c140>] ? child_rip+0x0/0x20
[ 3577.724014]
[ 3577.762250] Kernel panic - not syncing: LBUG
The rc value is 2.
The culprit seems to be ost_rw_hpreq_check returning the number of locks matched instead of 1 to indicate that a match happened, whereas the NRS changes require a return value of either 0 or 1.
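To illustrate the mismatch, here is a minimal standalone sketch, not the actual Lustre code or the landed patch. All function names below (count_matching_locks, hpreq_check_count, hpreq_check_bool, hpreq_init_sketch) are hypothetical stand-ins, and passing negative error codes through unchanged is an assumption about the caller's contract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock-matching loop: returns how many
 * entries matched, or a negative errno-style value on bad input. */
int count_matching_locks(const int *lock_matches, int nlocks)
{
	int i, matched = 0;

	if (lock_matches == NULL || nlocks < 0)
		return -22; /* -EINVAL, spelled out to avoid extra headers */
	for (i = 0; i < nlocks; i++)
		if (lock_matches[i])
			matched++;
	return matched;
}

/* Buggy shape: hands the raw match count to the caller (may be >= 2). */
int hpreq_check_count(const int *lock_matches, int nlocks)
{
	return count_matching_locks(lock_matches, nlocks);
}

/* Fixed shape: collapse any positive count to 1; keep 0 and errors. */
int hpreq_check_bool(const int *lock_matches, int nlocks)
{
	int rc = count_matching_locks(lock_matches, nlocks);

	return rc > 0 ? 1 : rc;
}

/* Caller-side contract mirroring the failed assertion: after errors
 * are filtered out (an assumption), rc must be exactly 0 or 1. */
int hpreq_init_sketch(int rc)
{
	if (rc < 0)
		return rc;
	assert(rc == 0 || rc == 1);
	return rc;
}
```

With two matching locks, hpreq_check_count returns 2, which would trip the assertion in hpreq_init_sketch just as rc == 2 tripped the LASSERT above, while hpreq_check_bool returns 1 and passes.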
Patch landed.