[LU-2653] hang in recovery-small test 51 Created: 19/Jan/13 Updated: 26/Apr/17 Resolved: 26/Apr/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 6195 |
| Description |
|
I seem to have a somewhat frequent hang on lustre cleanup that looks like this:

[46801.719874] INFO: task umount:16892 blocked for more than 120 seconds.
[46801.720068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[46801.720819] umount D 0000000000000000 2608 16892 16891 0x00000000
[46801.721019] ffff880041defa18 0000000000000086 ffff880041def9e0 ffff880041def9dc
[46801.721325] ffff880041dee000 ffff8800bcc24100 ffff8800062d67c0 0000000000000001
[46801.721650] ffff88002bffa6f8 ffff880041deffd8 000000000000fba8 ffff88002bffa6f8
[46801.721976] Call Trace:
[46801.722121] [<ffffffffa0db2f7d>] osp_sync_fini+0x8d/0x170 [osp]
[46801.722304] [<ffffffff8108fd60>] ? autoremove_wake_function+0x0/0x40
[46801.722492] [<ffffffffa0da90fe>] ? osp_disconnect+0x11e/0x170 [osp]
[46801.722686] [<ffffffffa0dad05e>] osp_process_config+0x4ae/0x6f0 [osp]
[46801.722882] [<ffffffffa0d5e717>] lod_process_config+0x2f7/0xa40 [lod]
[46801.723079] [<ffffffffa0a5cc1b>] mdd_process_config+0x20b/0x7f0 [mdd]
[46801.723282] [<ffffffffa0c9d5b1>] ? lustre_cfg_new+0x391/0x7e0 [mdt]
[46801.723478] [<ffffffffa0c9db71>] mdt_stack_fini+0x171/0xbc0 [mdt]
[46801.723665] [<ffffffffa0a59e90>] ? mdd_init_capa_ctxt+0x120/0x130 [mdd]
[46801.723865] [<ffffffffa0c9e9ea>] mdt_device_fini+0x42a/0x8e0 [mdt]
[46801.724082] [<ffffffffa0546107>] class_cleanup+0x577/0xda0 [obdclass]
[46801.724283] [<ffffffffa051c59c>] ? class_name2dev+0x7c/0xe0 [obdclass]
[46801.724493] [<ffffffffa05479d5>] class_process_config+0x10a5/0x1c60 [obdclass]
[46801.724821] [<ffffffffa0aacec8>] ? libcfs_log_return+0x28/0x40 [libcfs]
[46801.725024] [<ffffffffa0541421>] ? lustre_cfg_new+0x391/0x7e0 [obdclass]
[46801.725213] [<ffffffffa0548709>] class_manual_cleanup+0x179/0x6e0 [obdclass]
[46801.725420] [<ffffffffa051c59c>] ? class_name2dev+0x7c/0xe0 [obdclass]
[46801.725626] [<ffffffffa055515c>] server_put_super+0x58c/0x10a0 [obdclass]
[46801.725823] [<ffffffff8117d6ab>] generic_shutdown_super+0x5b/0xe0
[46801.726014] [<ffffffff8117d796>] kill_anon_super+0x16/0x60
[46801.726205] [<ffffffffa054a506>] lustre_kill_super+0x36/0x60 [obdclass]
[46801.726410] [<ffffffff8117e825>] deactivate_super+0x85/0xa0
[46801.726597] [<ffffffff8119a89f>] mntput_no_expire+0xbf/0x110
[46801.726779] [<ffffffff8119b34b>] sys_umount+0x7b/0x3a0
[46801.726959] [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

The hanging line is

cfs_wait_event(thread->t_ctl_waitq, thread->t_flags & SVC_STOPPED);

Crashdump is in /exports/crashdumps/t/ospsyn.dmp and modules are in /exports/crashdumps/192.168.10.210-2013-01-18-21:37:33/modules |
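For readers unfamiliar with the pattern behind the hanging line, here is a minimal user-space sketch of a "request stop, then wait for the stopped flag" handshake. It is not the Lustre code: it uses POSIX threads in place of the kernel wait queue, and every name in it (`demo_thread`, `DEMO_STOPPING`, `DEMO_STOPPED`, `demo_worker`, `demo_stop_worker`) is hypothetical. It only illustrates that the waiter blocks until the worker sets the stopped flag and wakes the queue, so a worker that never does so would leave the caller blocked, which is the symptom in the trace above. The sketch makes no claim about why the worker failed to stop in this ticket.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical flag values; the real SVC_* constants live in Lustre headers. */
enum { DEMO_STOPPING = 1 << 0, DEMO_STOPPED = 1 << 1 };

struct demo_thread {
    pthread_mutex_t lock;
    pthread_cond_t  ctl_waitq; /* stands in for thread->t_ctl_waitq */
    unsigned        flags;     /* stands in for thread->t_flags */
};

static void *demo_worker(void *arg)
{
    struct demo_thread *t = arg;

    pthread_mutex_lock(&t->lock);
    /* Sleep until the controller asks us to stop. */
    while (!(t->flags & DEMO_STOPPING))
        pthread_cond_wait(&t->ctl_waitq, &t->lock);
    /* Acknowledge the request. If this step were skipped,
     * the controller below would wait forever. */
    t->flags |= DEMO_STOPPED;
    pthread_cond_broadcast(&t->ctl_waitq);
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Starts a worker, asks it to stop, and waits for the acknowledgement.
 * Returns true once the worker has reported that it stopped. */
static bool demo_stop_worker(void)
{
    struct demo_thread t = { .flags = 0 };
    pthread_t tid;
    bool stopped;

    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.ctl_waitq, NULL);
    if (pthread_create(&tid, NULL, demo_worker, &t) != 0)
        return false;

    pthread_mutex_lock(&t.lock);
    t.flags |= DEMO_STOPPING;
    pthread_cond_broadcast(&t.ctl_waitq);
    /* User-space analog of waiting on the queue for the stopped flag. */
    while (!(t.flags & DEMO_STOPPED))
        pthread_cond_wait(&t.ctl_waitq, &t.lock);
    stopped = (t.flags & DEMO_STOPPED) != 0;
    assert(stopped);
    pthread_mutex_unlock(&t.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&t.ctl_waitq);
    pthread_mutex_destroy(&t.lock);
    return stopped;
}
```

Calling `demo_stop_worker()` from a program linked with the pthread library should return true. When debugging a hang of this shape, a bounded wait such as `pthread_cond_timedwait` can be substituted so the caller reports a timeout instead of blocking forever.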