Details
Bug
Resolution: Fixed
Major
Lustre 2.8.0
Clients running Lustre 2.7.57 plus patches against a Lustre 2.5.4 server back end.
3
Description
While running a simulated user workload, our MDS crashed with the following LBUG:
<3>[ 5913.638287] LustreError: 15340:0:(ldlm_lock.c:371:ldlm_lock_destroy_internal()) ### lock still on resource ns: mdt-atlas1-MDT0000_UUID lock: ffff883fd37b3700/0xb9db2cef5d1e41 39 lrc: 3/0,0 mode: CR/CR res: [0x200252cc7:0x3:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x50000000000000 nid: 9310@gni100 remote: 0xd3f59509a44985ef expref: 60 pid: 15340 timeout: 0 lvb_type: 3
<0>[ 5913.674996] LustreError: 15340:0:(ldlm_lock.c:372:ldlm_lock_destroy_internal()) LBUG
<4>[ 5913.683915] Pid: 15340, comm: mdt03_052
<4>[ 5913.688327]
<4>[ 5913.688327] Call Trace:
<4>[ 5913.692965] [<ffffffffa0430895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4>[ 5913.700886] [<ffffffffa0430e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4>[ 5913.707968] [<ffffffffa0707871>] ldlm_lock_destroy_internal+0x251/0x2c0 [ptlrpc]
<4>[ 5913.716573] [<ffffffffa07092b5>] ldlm_lock_destroy+0x35/0x130 [ptlrpc]
<4>[ 5913.724118] [<ffffffffa070a311>] ldlm_lock_enqueue+0x161/0x980 [ptlrpc]
<4>[ 5913.731760] [<ffffffffa0733e9b>] ldlm_handle_enqueue0+0x51b/0x10c0 [ptlrpc]
<4>[ 5913.739797] [<ffffffffa0d6f1a6>] mdt_enqueue+0x46/0xe0 [mdt]
<4>[ 5913.746362] [<ffffffffa0d7401a>] mdt_handle_common+0x52a/0x1470 [mdt]
<4>[ 5913.753804] [<ffffffffa0db0615>] mds_regular_handle+0x15/0x20 [mdt]
<4>[ 5913.761064] [<ffffffffa0762f55>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
<4>[ 5913.769866] [<ffffffffa0442785>] ? lc_watchdog_touch+0x65/0x170 [libcfs]
<4>[ 5913.777632] [<ffffffffa075b929>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
<4>[ 5913.785377] [<ffffffffa07656dd>] ptlrpc_main+0xaed/0x1930 [ptlrpc]
<4>[ 5913.792557] [<ffffffffa0764bf0>] ? ptlrpc_main+0x0/0x1930 [ptlrpc]
<4>[ 5913.799689] [<ffffffff8109e78e>] kthread+0x9e/0xc0
<4>[ 5913.805270] [<ffffffff8100c28a>] child_rip+0xa/0x20
<4>[ 5913.810950] [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
<4>[ 5913.816622] [<ffffffff8100c280>] ? child_rip+0x0/0x20
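For anyone triaging similar console captures, the following is a minimal Python sketch for splitting a flattened log like the one above into individual kernel records and pulling out the call-trace frames. It is not part of Lustre or any Lustre tooling; the names `split_records` and `stack_frames` are hypothetical helpers, and the regular expressions assume the `<level>[ seconds.micros]` prefix and `[<address>] symbol+0xoff/0xlen [module]` frame layout seen in this report.

```python
import re

# Assumption: every kernel record starts with "<level>[ seconds.micros]".
RECORD = re.compile(r"<(\d)>\[\s*(\d+\.\d+)\]\s?")
# Assumption: a frame is "[<addr>] symbol+0xoff/0xlen [module]";
# a leading "? " marks a frame the unwinder considers unreliable.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def split_records(text):
    """Return (level, timestamp, message) tuples from a flattened console log."""
    # With two capture groups, re.split yields:
    # [leading text, level, timestamp, message, level, timestamp, message, ...]
    fields = RECORD.split(text)[1:]
    return [
        (int(fields[i]), float(fields[i + 1]), fields[i + 2].strip())
        for i in range(0, len(fields), 3)
    ]


def stack_frames(records):
    """Return (symbol, module, reliable) for every record that is a stack frame."""
    frames = []
    for _level, _timestamp, message in records:
        match = FRAME.match(message)
        if match:
            frames.append((match.group(2), match.group(3), match.group(1) is None))
    return frames


# A shortened excerpt of the log in this report, flattened onto one line.
sample = (
    "<0>[ 5913.674996] LustreError: 15340:0:"
    "(ldlm_lock.c:372:ldlm_lock_destroy_internal()) LBUG "
    "<4>[ 5913.700886] [<ffffffffa0430e97>] lbug_with_loc+0x47/0xb0 [libcfs] "
    "<4>[ 5913.769866] [<ffffffffa0442785>] ? lc_watchdog_touch+0x65/0x170 [libcfs] "
    "<4>[ 5913.799689] [<ffffffff8109e78e>] kthread+0x9e/0xc0"
)

records = split_records(sample)
for symbol, module, reliable in stack_frames(records):
    marker = " " if reliable else "?"
    print(f"{marker} {symbol} [{module or 'kernel'}]")
```

Filtering on the reliable flag leaves only the frames without a `?` marker, which for the full trace above would be the path from `ptlrpc_main` through `mdt_enqueue` and `ldlm_lock_enqueue` down to `ldlm_lock_destroy_internal`.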