Details
-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
Lustre 2.11.0
-
soak performance cluster
-
3
-
9223372036854775807
Description
MDS is rebooted (single MDS, no DNE)
MDS goes into recovery, but the console message reports a bogus value for the recovery timer:
soak-8 login: [ 1393.056450] Lustre: soaked-MDT0000: Denying connection for new client 7af6eae0-3527-5481-d01d-161d271e4510(at 192.168.1.142@o2ib), waiting for 29 known clients (6 recovered, 21 in progress, and 2 evicted) to recover in 71565:2
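To make the bogus timer value concrete, the console message above can be picked apart with a short script. This is only a sketch: the regular expression is fitted to the single line quoted in this report, reading the trailing `71565:2` as minutes:seconds is an assumption based on its shape, and `MAX_PLAUSIBLE_SECONDS` is a hypothetical threshold, not a Lustre setting.

```python
import re

# Pattern fitted to the "Denying connection" line quoted in this report.
# Reading the trailing timer as MIN:SEC is an assumption, not confirmed here.
RECOVERY_RE = re.compile(
    r"waiting for (?P<known>\d+) known clients "
    r"\((?P<recovered>\d+) recovered, (?P<in_progress>\d+) in progress, "
    r"and (?P<evicted>\d+) evicted\) to recover in (?P<min>\d+):(?P<sec>\d+)"
)

# Hypothetical sanity limit: anything above one day is treated as suspicious.
MAX_PLAUSIBLE_SECONDS = 24 * 60 * 60


def parse_recovery_status(line):
    """Return client counts and remaining time from one console line, or None."""
    m = RECOVERY_RE.search(line)
    if m is None:
        return None
    fields = {k: int(v) for k, v in m.groupdict().items()}
    remaining = fields.pop("min") * 60 + fields.pop("sec")
    fields["remaining_seconds"] = remaining
    fields["timer_plausible"] = remaining <= MAX_PLAUSIBLE_SECONDS
    return fields


line = ("Lustre: soaked-MDT0000: Denying connection for new client "
        "7af6eae0-3527-5481-d01d-161d271e4510(at 192.168.1.142@o2ib), "
        "waiting for 29 known clients (6 recovered, 21 in progress, "
        "and 2 evicted) to recover in 71565:2")
status = parse_recovery_status(line)
print(status)
```

Under the minutes:seconds reading, the timer works out to 4293902 seconds, roughly 49.7 days, which is far beyond any sensible recovery window. The client counts themselves are consistent: 6 + 21 + 2 equals the 29 known clients.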
MDS never exits recovery, and clients get -EBUSY.
Attempting abort_recovery causes timeouts (lctl blocks in target_stop_recovery_thread), and the system remains wedged.
[ 1681.193209] INFO: task lctl:2555 blocked for more than 120 seconds.
[ 1681.271617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1681.368730] lctl D ffff8803f1ecd400 0 2555 2526 0x00000084
[ 1681.456456] ffff880413a5bc10 0000000000000082 ffff8803f1826eb0 ffff880413a5bfd8
[ 1681.548186] ffff880413a5bfd8 ffff880413a5bfd8 ffff8803f1826eb0 ffff8808195014d0
[ 1681.639847] 7fffffffffffffff ffff8808195014c8 ffff8803f1826eb0 ffff8803f1ecd400
[ 1681.731520] Call Trace:
[ 1681.763370] [<ffffffff816a9589>] schedule+0x29/0x70
[ 1681.826052] [<ffffffff816a7099>] schedule_timeout+0x239/0x2c0
[ 1681.899089] [<ffffffff816a993d>] wait_for_completion+0xfd/0x140
[ 1681.974192] [<ffffffff810c4820>] ? wake_up_state+0x20/0x20
[ 1682.044159] [<ffffffffc10f5a5d>] target_stop_recovery_thread.part.16+0x3d/0xd0 [ptlrpc]
[ 1682.144235] [<ffffffffc10f5b08>] target_stop_recovery_thread+0x18/0x20 [ptlrpc]
[ 1682.235915] [<ffffffffc15935d0>] mdt_iocontrol+0x550/0xaf0 [mdt]
[ 1682.312024] [<ffffffffc0ef3bd9>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[ 1682.400553] [<ffffffffc0edebb3>] class_handle_ioctl+0x1913/0x1da0 [obdclass]
[ 1682.488997] [<ffffffff812b1a98>] ? security_capable+0x18/0x20
[ 1682.561806] [<ffffffffc0ec4602>] obd_class_ioctl+0xd2/0x170 [obdclass]
[ 1682.643909] [<ffffffff812151bd>] do_vfs_ioctl+0x33d/0x540
[ 1682.712431] [<ffffffff816b0091>] ? __do_page_fault+0x171/0x450
[ 1682.786103] [<ffffffff81215461>] SyS_ioctl+0xa1/0xc0
[ 1682.849308] [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b
The Lustre log and stack traces are attached; we are currently forcing a kernel dump.