Description
Communication between the MDS and OSS servers failed, so recovery started (IR is disabled). Recovery on the OSS server then failed with a soft lockup:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ll_ost_io01_074:30838]
With stack trace:
[95181.855433] CPU: 3 PID: 30807 Comm: ll_ost_io01_070 Kdump: loaded Tainted: P OE ------------ T 3.10.0-1160.49.1.el7.x86_64 #1
[95181.855433] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 1.6.13 12/17/2018
[95181.855434] task: ffff93abd41c6300 ti: ffff93abdc84c000 task.ti: ffff93abdc84c000
[95181.855435] RIP: 0010:[<ffffffff85f17aa2>]
[95181.855441] [<ffffffff85f17aa2>] native_queued_spin_lock_slowpath+0x122/0x200
[95181.855441] RSP: 0018:ffff93abdc84fcb8 EFLAGS: 00000246
[95181.855442] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000190000
[95181.855443] RDX: ffff93c4dd69b8c0 RSI: 0000000000290000 RDI: ffff93c367157830
[95181.855443] RBP: ffff93abdc84fcb8 R08: ffff93c4dd65b8c0 R09: 0000000000000000
[95181.855444] R10: ffff93c4dd65f160 R11: fffff40dfe6a9200 R12: ffff93abdc84fc58
[95181.855444] R13: ffff93c3b5c66000 R14: ffff93b8aaa89850 R15: ffffffffc17a7b96
[95181.855445] FS: 0000000000000000(0000) GS:ffff93c4dd640000(0000) knlGS:0000000000000000
[95181.855446] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[95181.855447] CR2: 000000c002f51000 CR3: 0000002fd1d42000 CR4: 00000000007607e0
[95181.855448] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[95181.855448] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[95181.855449] PKRU: 00000000
[95181.855449] Call Trace:
[95181.855454] [<ffffffff8657dcf3>] queued_spin_lock_slowpath+0xb/0xf
[95181.855459] [<ffffffff8658baa0>] _raw_spin_lock+0x20/0x30
[95181.855518] [<ffffffffc1463232>] ptlrpc_server_drop_request+0x1c2/0x6d0 [ptlrpc]
[95181.855545] [<ffffffffc14637d2>] ptlrpc_server_finish_active_request+0x92/0x140 [ptlrpc]
[95181.855572] [<ffffffffc1465a41>] ptlrpc_server_handle_request+0x401/0xab0 [ptlrpc]
[95181.855597] [<ffffffffc14626a5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[95181.855600] [<ffffffff85ed3233>] ? __wake_up+0x13/0x20
[95181.855625] [<ffffffffc14691f4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[95181.855650] [<ffffffffc14686c0>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[95181.855653] [<ffffffff85ec5e61>] kthread+0xd1/0xe0
[95181.855655] [<ffffffff85ec5d90>] ? insert_kthread_work+0x40/0x40
[95181.855657] [<ffffffff86595ddd>] ret_from_fork_nospec_begin+0x7/0x21
[95181.855659] [<ffffffff85ec5d90>] ? insert_kthread_work+0x40/0x40
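
For triage, the call chain can be pulled out of a trace like the one above with a short script. This is only a sketch: the regular expression is an assumption based on the frame format shown in this report, and `parse_frames` is a hypothetical helper, not part of Lustre or any crash tool. Frames prefixed with `?` are the ones the kernel marks as unreliable, so they are flagged and skipped when printing.

```python
import re

# Matches one frame of the call trace format seen in this report, e.g.
# "[95181.855459] [<ffffffff8658baa0>] _raw_spin_lock+0x20/0x30"
# (assumed format; other kernel versions print frames differently).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return (symbol, module, reliable) for every frame line in trace_text."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            reliable = m.group("unreliable") is None
            frames.append((m.group("symbol"), m.group("module"), reliable))
    return frames


# A few frames copied from the trace in this report.
sample = """\
[95181.855454] [<ffffffff8657dcf3>] queued_spin_lock_slowpath+0xb/0xf
[95181.855459] [<ffffffff8658baa0>] _raw_spin_lock+0x20/0x30
[95181.855518] [<ffffffffc1463232>] ptlrpc_server_drop_request+0x1c2/0x6d0 [ptlrpc]
[95181.855597] [<ffffffffc14626a5>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
"""

for symbol, module, reliable in parse_frames(sample):
    if reliable:
        # Frames without a module suffix belong to the core kernel.
        print(f"{symbol} ({module or 'kernel'})")
```

Run against the full trace, the reliable frames show the `ll_ost_io` thread spinning in `_raw_spin_lock`, called from `ptlrpc_server_drop_request`.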