Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
Lustre 2.12.5
-
RHEL7 server nodes running 2.12.5 LTS.
-
3
Description
After setting max_rpcs_in_flight to 256, the MDS crashed with the following backtrace.
[3072807.665012] LustreError: 106301:0:(ldlm_lockd.c:1543:ldlm_handle_convert0()) Skipped 6 previous similar messages
[3072920.767949] LustreError: 107784:0:(ldlm_lockd.c:1543:ldlm_handle_convert0()) ### convert on canceled lock! ns: mdt-storm-MDT0000_UUID lock: ffff8fbfd69a2400/0x8f43eb98e65eb06e lrc: 3/0,0 mode: PR/PR res: [0x20000560c:0x9f09:0x0].0x0 bits 0x58/0x0 rrc: 4 type: IBT flags: 0x54a01400010020 nid: 10.134.129.9@tcp55 remote: 0xc1b65128fa6df589 expref: 31059 pid: 154261 timeout: 3080537 lvb_type: 0
[3072920.805945] LustreError: 107784:0:(ldlm_lockd.c:1543:ldlm_handle_convert0()) Skipped 4 previous similar messages
[3072929.398817] LustreError: 106301:0:(ldlm_lock.c:1106:ldlm_grant_lock_with_skiplist()) ASSERTION( ldlm_is_granted(lock) ) failed:
[3072929.412226] LustreError: 106301:0:(ldlm_lock.c:1106:ldlm_grant_lock_with_skiplist()) LBUG
[3072929.421404] Pid: 106301, comm: ldlm_cn00_002 3.10.0-1127.13.1.el7.x86_64 #1 SMP Fri Jun 12 14:34:17 EDT 2020
[3072929.432225] Call Trace:
[3072929.435691] [<ffffffffc282a7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[3072929.443252] [<ffffffffc282a87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[3072929.450458] [<ffffffffc164fa87>] ldlm_grant_lock_with_skiplist+0x607/0x750 [ptlrpc]
[3072929.459259] [<ffffffffc1682d0a>] ldlm_inodebits_drop+0xaa/0x170 [ptlrpc]
[3072929.467092] [<ffffffffc167b3fb>] ldlm_handle_convert0+0x2db/0x460 [ptlrpc]
[3072929.475080] [<ffffffffc167bacb>] ldlm_cancel_handler+0x29b/0x590 [ptlrpc]
[3072929.482957] [<ffffffffc16ae48b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[3072929.491613] [<ffffffffc16b1df4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[3072929.498873] [<ffffffff930c6691>] kthread+0xd1/0xe0
[3072929.504710] [<ffffffff93792d1d>] ret_from_fork_nospec_begin+0x7/0x21
[3072929.512100] [<ffffffffffffffff>] 0xffffffffffffffff
[3072929.518025] Kernel panic - not syncing: LBUG
[3072929.523194] CPU: 1 PID: 106301 Comm: ldlm_cn00_002 Kdump: loaded Tainted: P OE ------------ T 3.10.0-1127.13.1.el7.x86_64 #1
[3072929.536964] Hardware name: Dell Inc. PowerEdge R640/0RGP26, BIOS 2.3.10 08/15/2019
[3072929.545412] Call Trace:
[3072929.548751] [<ffffffff9377ffa5>] dump_stack+0x19/0x1b
[3072929.554758] [<ffffffff93779541>] panic+0xe8/0x21f
[3072929.560410] [<ffffffffc282a8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[3072929.567463] [<ffffffffc164fa87>] ldlm_grant_lock_with_skiplist+0x607/0x750 [ptlrpc]
[3072929.576066] [<ffffffffc1682d0a>] ldlm_inodebits_drop+0xaa/0x170 [ptlrpc]
[3072929.583705] [<ffffffffc167b3fb>] ldlm_handle_convert0+0x2db/0x460 [ptlrpc]
[3072929.591502] [<ffffffffc167bacb>] ldlm_cancel_handler+0x29b/0x590 [ptlrpc]
[3072929.599199] [<ffffffffc16ae48b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[3072929.607671] [<ffffffffc16ab2a5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[3072929.615245] [<ffffffff930d3dc3>] ? __wake_up+0x13/0x20
[3072929.621272] [<ffffffffc16b1df4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[3072929.628307] [<ffffffff93785942>] ? __schedule+0x402/0x840
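Since this ticket was resolved as a duplicate, it can help to reduce the trace above to its call chain before comparing it with other reports. The following is a minimal sketch, not part of the original report: the helper name `extract_frames` and the regular expression are illustrative assumptions, and the pattern only targets the `[<address>] function+0xoff/0xsize [module]` frame format seen in this log.

```python
import re

# Matches frames such as:
# [3072929.450458] [<ffffffffc164fa87>] ldlm_grant_lock_with_skiplist+0x607/0x750 [ptlrpc]
# Group 1: optional "? " marker, group 2: function name, group 3: optional module.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def extract_frames(log_text, include_unreliable=False):
    """Return (function, module) pairs in the order they appear in the log.

    Hypothetical helper for illustration. Frames without a module suffix
    are reported with the module name "kernel".
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack frame (message line, header, etc.)
        unreliable, func, module = match.groups()
        if unreliable and not include_unreliable:
            continue  # "?" marks frames the kernel unwinder could not verify
        frames.append((func, module or "kernel"))
    return frames


# A few frames copied from the trace in this ticket.
sample = """\
[3072929.450458] [<ffffffffc164fa87>] ldlm_grant_lock_with_skiplist+0x607/0x750 [ptlrpc]
[3072929.459259] [<ffffffffc1682d0a>] ldlm_inodebits_drop+0xaa/0x170 [ptlrpc]
[3072929.467092] [<ffffffffc167b3fb>] ldlm_handle_convert0+0x2db/0x460 [ptlrpc]
[3072929.607671] [<ffffffffc16ab2a5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[3072929.498873] [<ffffffff930c6691>] kthread+0xd1/0xe0
"""

for func, module in extract_frames(sample):
    print(f"{module}:{func}")
```

By default the sketch skips frames marked with `?`, so the result keeps only the reliable chain from `ldlm_grant_lock_with_skiplist` down through `ldlm_inodebits_drop` and `ldlm_handle_convert0`.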
Issue Links
- is related to LU-11276 racer: mdc_dev.c:1346:mdc_req_attr_set()) uncovered page (Resolved)