Details
Bug
Resolution: Fixed
Minor
Lustre 2.7.0
3
15132
Description
If a secondary MDT tries to use a FID with a bogus sequence, the handler loops forever in fld_client_rpc():
[ 9667.196083] LustreError: 28865:0:(fld_handler.c:261:fld_server_lookup()) srv-lustre-MDT0000: Cannot find sequence 0x4000002c0000401: rc = -2
[ 9667.198478] LustreError: 28865:0:(fld_handler.c:261:fld_server_lookup()) Skipped 167399 previous similar messages
[ 9699.057201] LNet: 4534:0:(watchdog.c:200:lcw_dump_stack()) Service thread pid 17054 was inactive for 62.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 9699.061160] Pid: 17054, comm: mdt01_011
17054 mdt01_011
[<ffffffffa06821fa>] ptlrpc_set_wait+0x2ea/0x830 [ptlrpc]
[<ffffffffa06827c7>] ptlrpc_queue_wait+0x87/0x220 [ptlrpc]
[<ffffffffa087455b>] fld_client_rpc+0x15b/0x4b0 [fld]
[<ffffffffa0879c81>] fld_server_lookup+0x151/0x340 [fld]
[<ffffffffa0d6f567>] lod_fld_lookup+0x1e7/0x350 [lod]
[<ffffffffa0d81b63>] lod_object_init+0x103/0x3c0 [lod]
[<ffffffffa0455b98>] lu_object_alloc+0xd8/0x320 [obdclass]
[<ffffffffa045718f>] lu_object_find_at+0x2bf/0x410 [obdclass]
[<ffffffffa04572f6>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa0c95f56>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa0ccbe71>] mdt_reint_open+0x2e1/0x2180 [mdt]
[<ffffffffa0cb2811>] mdt_reint_rec+0x41/0xe0 [mdt]
[<ffffffffa0c9cdb3>] mdt_reint_internal+0x4d3/0x7b0 [mdt]
[<ffffffffa0c9d286>] mdt_intent_reint+0x1f6/0x520 [mdt]
[<ffffffffa0c9b929>] mdt_intent_policy+0x499/0xcf0 [mdt]
[<ffffffffa0644342>] ldlm_lock_enqueue+0x302/0x880 [ptlrpc]
[<ffffffffa066c343>] ldlm_handle_enqueue0+0x373/0x1130 [ptlrpc]
[<ffffffffa06eb592>] tgt_enqueue+0x62/0x1d0 [ptlrpc]
[<ffffffffa06eacbe>] tgt_request_handle+0x71e/0xb10 [ptlrpc]
[<ffffffffa069d847>] ptlrpc_main+0xd47/0x1860 [ptlrpc]
[<ffffffff8109eab6>] kthread+0x96/0xa0
[<ffffffff8100c30a>] child_rip+0xa/0x20
[<ffffffffffffffff>] 0xffffffffffffffff
This was found through RPC corruption.
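
For illustration only, here is a minimal sketch of the general failure mode described in this ticket: a lookup retry loop that hangs unless it treats "not found" (rc = -2, i.e. -ENOENT, as in the log) as a permanent error instead of a retryable one. This is not the Lustre code and not the actual fix. Every name in it (toy_remote_lookup, toy_lookup_bounded, max_tries) is hypothetical, and the choice of -EAGAIN as the transient error is an assumption made for the example.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a remote sequence lookup (NOT Lustre code).
 * The bogus sequence from the log is never found; any other sequence
 * fails transiently twice and then succeeds.
 */
static int toy_remote_lookup(uint64_t seq, int *attempts)
{
	(*attempts)++;
	if (seq == 0x4000002c0000401ULL)
		return -ENOENT;		/* permanent: sequence does not exist */
	if (*attempts < 3)
		return -EAGAIN;		/* transient: worth retrying */
	return 0;
}

/*
 * Retry only on the transient error and give up after max_tries.
 * A loop that also retried on -ENOENT, with no bound, would spin
 * forever on the bogus sequence.
 */
static int toy_lookup_bounded(uint64_t seq, int max_tries, int *attempts)
{
	int rc = -EAGAIN;

	*attempts = 0;
	while (*attempts < max_tries) {
		rc = toy_remote_lookup(seq, attempts);
		if (rc != -EAGAIN)
			break;		/* success or permanent failure */
	}
	return rc;
}
```

In this sketch, calling toy_lookup_bounded() with the bogus sequence returns -ENOENT after a single attempt, while a sequence that fails only transiently is retried until it succeeds or max_tries is exhausted.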