[LU-1311] Debugging deadlock in lnet Created: 12/Apr/12 Updated: 31/May/12 Resolved: 31/May/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0, Lustre 2.3.0 |
| Fix Version/s: | Lustre 2.3.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oleg Drokin | Assignee: | Oleg Drokin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4631 |
| Description |
|
ORNL reported this lockup:

[ 446.217881] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 2
[ 446.217881] Pid: 9709, comm: kiblnd_sd_05 Not tainted 2.6.32-220.el6.wc.x86_64 #1
[ 446.217881] Call Trace:
[ 446.217881] <NMI> [<ffffffff814ec681>] ? panic+0x78/0x143
[ 446.217881] [<ffffffff810d8fad>] ? watchdog_overflow_callback+0xcd/0xd0
[ 446.217881] [<ffffffff8110a89d>] ? __perf_event_overflow+0x9d/0x230
[ 446.217881] [<ffffffff8110ae54>] ? perf_event_overflow+0x14/0x20
[ 446.217881] [<ffffffff8101e096>] ? intel_pmu_handle_irq+0x336/0x550
[ 446.217881] [<ffffffff814f2256>] ? kprobe_exceptions_notify+0x16/0x430
[ 446.217881] [<ffffffff814f0d39>] ? perf_event_nmi_handler+0x39/0xb0
[ 446.217881] [<ffffffff814f2885>] ? notifier_call_chain+0x55/0x80
[ 446.217881] [<ffffffff814f28ea>] ? atomic_notifier_call_chain+0x1a/0x20
[ 446.217881] [<ffffffff81096bce>] ? notify_die+0x2e/0x30
[ 446.217881] [<ffffffff814f0503>] ? do_nmi+0x173/0x2b0
[ 446.217881] [<ffffffff814efe10>] ? nmi+0x20/0x30
[ 446.217881] [<ffffffff814ef67c>] ? _spin_lock+0x1c/0x30
[ 446.217881] <<EOE>> [<ffffffffa07473bf>] ? cfs_trace_lock_tcd+0x2f/0x90 [libcfs]
[ 446.217881] [<ffffffffa0750b0a>] ? libcfs_debug_vmsg2+0xda/0xb60 [libcfs]
[ 446.217881] [<ffffffffa07515d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[ 446.217881] [<ffffffffa07515d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[ 446.217881] [<ffffffffa0b5768d>] ? kiblnd_scheduler+0x16d/0x620 [ko2iblnd]
[ 446.217881] [<ffffffff8105fa50>] ? default_wake_function+0x0/0x20
[ 446.217881] [<ffffffffa0b57520>] ? kiblnd_scheduler+0x0/0x620 [ko2iblnd]
[ 446.217881] [<ffffffff8100c14a>] ? child_rip+0xa/0x20
[ 446.217881] [<ffffffffa0b57520>] ? kiblnd_scheduler+0x0/0x620 [ko2iblnd]
[ 446.217881] [<ffffffffa0b57520>] ? kiblnd_scheduler+0x0/0x620 [ko2iblnd]
[ 446.217881] [<ffffffff8100c140>] ? child_rip+0x0/0x20

The other thread is in irq context:

PID: 9855   TASK: ffff880431898b40   CPU: 0   COMMAND: "ktracefiled"
 #0 [ffff880028207e90] crash_nmi_callback at ffffffff81029716
 #1 [ffff880028207ea0] notifier_call_chain at ffffffff814f2885
 #2 [ffff880028207ee0] atomic_notifier_call_chain at ffffffff814f28ea
 #3 [ffff880028207ef0] notify_die at ffffffff81096bce
 #4 [ffff880028207f20] do_nmi at ffffffff814f0503
 #5 [ffff880028207f50] nmi at ffffffff814efe10
    [exception RIP: _spin_lock_irqsave+47]
    RIP: ffffffff814ef56f  RSP: ffff880028203de8  RFLAGS: 00000097
    RAX: 00000000000055c2  RBX: ffff88040df2b800  RCX: 00000000000055c1
    RDX: 0000000000000046  RSI: ffff88040df2b800  RDI: ffffffffa0b71c30
    RBP: ffff880028203de8   R8: ffff8803fbb04000   R9: ffff8803fbb05de0
    R10: ffff880028403a40  R11: 0000000000000000  R12: 0000000000000087
    R13: 0000000000000000  R14: ffff8804318da000  R15: ffff8803fbb05f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #6 [ffff880028203de8] _spin_lock_irqsave at ffffffff814ef56f
 #7 [ffff880028203df0] kiblnd_cq_completion at ffffffffa0b51548 [ko2iblnd]
    <== this is line 3277 (cfs_spin_lock_irqsave(&kiblnd_data.kib_sched_lock, flags);) in o2iblnd_cb.c
 #8 [ffff880028203e10] mthca_cq_completion at ffffffffa019ca4a [ib_mthca]
 #9 [ffff880028203e30] mthca_eq_int at ffffffffa019c422 [ib_mthca]
#10 [ffff880028203eb0] mthca_arbel_msi_x_interrupt at ffffffffa019c694 [ib_mthca]
#11 [ffff880028203ed0] handle_IRQ_event at ffffffff810d94f0
#12 [ffff880028203f20] handle_edge_irq at ffffffff810dbc2e
#13 [ffff880028203f60] handle_irq at ffffffff8100df09
#14 [ffff880028203f80] do_IRQ at ffffffff814f504c
--- <IRQ stack> ---
#15 [ffff8803fbb05d38] ret_from_intr at ffffffff8100ba53
    [exception RIP: put_pages_on_tcd_daemon_list+255]
    RIP: ffffffffa075072f  RSP: ffff8803fbb05de0  RFLAGS: 00000246
    RAX: ffffea000c9a82a8  RBX: ffff8803fbb05e20  RCX: ffff880390407b88
    RDX: ffff88040d080128  RSI: ffff880390407b48  RDI: ffff880390407b68
    RBP: ffffffff8100ba4e   R8: ffff88040d080128   R9: 0000000000000000
    R10: ffff880028403a40  R11: 0000000000000000  R12: ffff8803fbb05d70
    R13: ffff8803fbb05f00  R14: ffff8803fbb05ef0  R15: 0000000000000286
    ORIG_RAX: ffffffffffffff46  CS: 0010  SS: 0018
#16 [ffff8803fbb05e28] put_pages_on_daemon_list at ffffffffa075094b [libcfs]
    <== this is line 575 (cfs_tcd_for_each_type_lock) in tracefile.c
#17 [ffff8803fbb05e78] tracefiled at ffffffffa0751d4f [libcfs]
#18 [ffff8803fbb05f48] kernel_thread at ffffffff8100c14a |
| Comments |
| Comment by Oleg Drokin [ 12/Apr/12 ] |
|
tracefiled thread: locks some random tcd (non-irq type) without disabling interrupts. An interrupt then lands on the same CPU and kiblnd_cq_completion() spins on kib_sched_lock, while the kiblnd_sd thread on another CPU, already holding kib_sched_lock, calls CDEBUG and spins on that same tcd lock in cfs_trace_lock_tcd(). Neither side can make progress.
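The cycle can be reproduced in userspace as a plain ABBA deadlock: two spinlocks taken in opposite order by two threads. This is only an illustrative sketch; the lock and thread names are stand-ins for the kernel locks, not the real code:

/* build with: gcc -pthread abba.c; running it hangs with both
 * threads spinning at 100% CPU, the userspace equivalent of the
 * hard LOCKUP above */
#include <pthread.h>
#include <unistd.h>

static pthread_spinlock_t tcd_lock;   /* stands in for the tcd lock   */
static pthread_spinlock_t sched_lock; /* stands in for kib_sched_lock */

/* models ktracefiled: holds the tcd lock when the interrupt on its
 * CPU (kiblnd_cq_completion) goes after kib_sched_lock */
static void *tracefiled_path(void *arg)
{
	pthread_spin_lock(&tcd_lock);
	usleep(1000);                   /* widen the race window */
	pthread_spin_lock(&sched_lock); /* spins forever */
	return NULL;
}

/* models kiblnd_sd: holds kib_sched_lock when CDEBUG ->
 * cfs_trace_lock_tcd goes after the tcd lock */
static void *scheduler_path(void *arg)
{
	pthread_spin_lock(&sched_lock);
	usleep(1000);
	pthread_spin_lock(&tcd_lock);   /* spins forever */
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_spin_init(&tcd_lock, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&sched_lock, PTHREAD_PROCESS_PRIVATE);
	pthread_create(&a, NULL, tracefiled_path, NULL);
	pthread_create(&b, NULL, scheduler_path, NULL);
	pthread_join(a, NULL);          /* never returns */
	return 0;
} |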
| Comment by Liang Zhen (Inactive) [ 13/Apr/12 ] |
|
Oleg, based on your analysis, any call to CDEBUG between spin_lock/unlock_irqsave, or spin_lock/unlock_bh, could trigger this deadlock. I think a fix could look like this:
#define cfs_trace_lock_tcd_safe(tcd, flags)			\
({								\
	int __ret;						\
	spin_lock_irqsave(&cfs_trace_glock, flags);		\
	__ret = cfs_trace_lock_tcd(tcd);			\
	__ret;							\
})
#define cfs_trace_unlock_tcd_safe(tcd, flags)			\
({								\
	cfs_trace_unlock_tcd(tcd);				\
	spin_unlock_irqrestore(&cfs_trace_glock, flags);	\
})
#define cfs_tcd_for_each_type_lock(..., flags)			\
	......							\
	cfs_trace_lock_tcd_safe(tcd, flags); cfs_trace_unlock_tcd_safe(tcd, flags), i++)
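Filled in, the loop might read something like this (a sketch only; it assumes the cfs_trace_data layout from tracefile.h, and the _safe macro name is hypothetical):

/* sketch: mirrors the existing cfs_tcd_for_each_type_lock loop but
 * routes each lock/unlock through the _safe variants above, so
 * cfs_trace_glock is held with interrupts off around the tcd lock */
#define cfs_tcd_for_each_type_lock_safe(tcd, i, cpu, flags)	\
	for (i = 0; cfs_trace_data[i] &&			\
	     (tcd = &(*cfs_trace_data[i])[cpu].tcd) &&		\
	     cfs_trace_lock_tcd_safe(tcd, flags);		\
	     cfs_trace_unlock_tcd_safe(tcd, flags), i++)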
What do you think? Liang |
| Comment by Peter Jones [ 16/Apr/12 ] |
|
Oleg will construct a patch |
| Comment by Oleg Drokin [ 27/Apr/12 ] |
|
Patch at http://review.whamcloud.com/2605 |
| Comment by James A Simmons [ 30/Apr/12 ] |
|
I now get this error when testing with the patch:

Apr 30 08:19:54 barry-oss4 kernel: [247732.876333] Lustre: debug daemon will attempt to start writing to /chexport/lustre/logs/lustre-log.barry-oss4 (2097152kB m

The load on all OSSes goes to 300+. |
| Comment by Peter Jones [ 31/May/12 ] |
|
Landed for 2.3 |