Details
Bug
Resolution: Cannot Reproduce
Critical
None
Lustre 2.4.0
None
3
6195
Description
I am seeing a fairly frequent hang during Lustre cleanup (umount) that looks like this:
[46801.719874] INFO: task umount:16892 blocked for more than 120 seconds.
[46801.720068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[46801.720819] umount D 0000000000000000 2608 16892 16891 0x00000000
[46801.721019] ffff880041defa18 0000000000000086 ffff880041def9e0 ffff880041def9dc
[46801.721325] ffff880041dee000 ffff8800bcc24100 ffff8800062d67c0 0000000000000001
[46801.721650] ffff88002bffa6f8 ffff880041deffd8 000000000000fba8 ffff88002bffa6f8
[46801.721976] Call Trace:
[46801.722121] [<ffffffffa0db2f7d>] osp_sync_fini+0x8d/0x170 [osp]
[46801.722304] [<ffffffff8108fd60>] ? autoremove_wake_function+0x0/0x40
[46801.722492] [<ffffffffa0da90fe>] ? osp_disconnect+0x11e/0x170 [osp]
[46801.722686] [<ffffffffa0dad05e>] osp_process_config+0x4ae/0x6f0 [osp]
[46801.722882] [<ffffffffa0d5e717>] lod_process_config+0x2f7/0xa40 [lod]
[46801.723079] [<ffffffffa0a5cc1b>] mdd_process_config+0x20b/0x7f0 [mdd]
[46801.723282] [<ffffffffa0c9d5b1>] ? lustre_cfg_new+0x391/0x7e0 [mdt]
[46801.723478] [<ffffffffa0c9db71>] mdt_stack_fini+0x171/0xbc0 [mdt]
[46801.723665] [<ffffffffa0a59e90>] ? mdd_init_capa_ctxt+0x120/0x130 [mdd]
[46801.723865] [<ffffffffa0c9e9ea>] mdt_device_fini+0x42a/0x8e0 [mdt]
[46801.724082] [<ffffffffa0546107>] class_cleanup+0x577/0xda0 [obdclass]
[46801.724283] [<ffffffffa051c59c>] ? class_name2dev+0x7c/0xe0 [obdclass]
[46801.724493] [<ffffffffa05479d5>] class_process_config+0x10a5/0x1c60 [obdclass]
[46801.724821] [<ffffffffa0aacec8>] ? libcfs_log_return+0x28/0x40 [libcfs]
[46801.725024] [<ffffffffa0541421>] ? lustre_cfg_new+0x391/0x7e0 [obdclass]
[46801.725213] [<ffffffffa0548709>] class_manual_cleanup+0x179/0x6e0 [obdclass]
[46801.725420] [<ffffffffa051c59c>] ? class_name2dev+0x7c/0xe0 [obdclass]
[46801.725626] [<ffffffffa055515c>] server_put_super+0x58c/0x10a0 [obdclass]
[46801.725823] [<ffffffff8117d6ab>] generic_shutdown_super+0x5b/0xe0
[46801.726014] [<ffffffff8117d796>] kill_anon_super+0x16/0x60
[46801.726205] [<ffffffffa054a506>] lustre_kill_super+0x36/0x60 [obdclass]
[46801.726410] [<ffffffff8117e825>] deactivate_super+0x85/0xa0
[46801.726597] [<ffffffff8119a89f>] mntput_no_expire+0xbf/0x110
[46801.726779] [<ffffffff8119b34b>] sys_umount+0x7b/0x3a0
[46801.726959] [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
The hanging line, in osp_sync_fini() at the top of the trace, is:
cfs_wait_event(thread->t_ctl_waitq, thread->t_flags & SVC_STOPPED);
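For readers unfamiliar with this wait pattern, here is a minimal userspace sketch of the kind of stop handshake that line takes part in. It is not the Lustre code: pthreads stand in for the kernel wait queue, the flag values are made up, and the names sketch_thread, worker_main and stop_worker_demo are hypothetical. The assumption illustrated is that the stopper sets a "stopping" flag, wakes the worker, and then sleeps until the worker sets SVC_STOPPED; if the worker never reaches that step, the stopper sleeps indefinitely, which would match the blocked umount above.

```c
/* Userspace sketch only; pthreads stand in for the kernel wait queue. */
#include <pthread.h>

/* Hypothetical flag values for illustration, not Lustre's definitions. */
#define SVC_STOPPING 0x1
#define SVC_STOPPED  0x2
#define SVC_RUNNING  0x4

struct sketch_thread {
    pthread_mutex_t lock;
    pthread_cond_t  ctl_waitq;  /* plays the role of t_ctl_waitq */
    unsigned int    flags;      /* plays the role of t_flags */
};

static void *worker_main(void *arg)
{
    struct sketch_thread *t = arg;

    pthread_mutex_lock(&t->lock);
    t->flags |= SVC_RUNNING;
    pthread_cond_broadcast(&t->ctl_waitq);

    /* Work loop placeholder: sleep until a stop is requested. */
    while (!(t->flags & SVC_STOPPING))
        pthread_cond_wait(&t->ctl_waitq, &t->lock);

    /* If the worker never gets here, the stopper below waits forever. */
    t->flags = SVC_STOPPED;
    pthread_cond_broadcast(&t->ctl_waitq);
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Returns 0 once the worker has acknowledged the stop request. */
int stop_worker_demo(void)
{
    struct sketch_thread t = { .flags = 0 };
    pthread_t tid;

    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.ctl_waitq, NULL);
    if (pthread_create(&tid, NULL, worker_main, &t) != 0)
        return -1;

    pthread_mutex_lock(&t.lock);
    t.flags |= SVC_STOPPING;                 /* request the stop */
    pthread_cond_broadcast(&t.ctl_waitq);    /* wake the worker */

    /* Analogue of waiting on t_ctl_waitq for t_flags & SVC_STOPPED. */
    while (!(t.flags & SVC_STOPPED))
        pthread_cond_wait(&t.ctl_waitq, &t.lock);
    pthread_mutex_unlock(&t.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&t.ctl_waitq);
    pthread_mutex_destroy(&t.lock);
    return 0;
}
```

Calling stop_worker_demo() from a main() (built with -pthread) returns 0 when the handshake completes. Commenting out the line that sets SVC_STOPPED in worker_main makes the caller block in the same way, which is one plausible shape for this hang; the crashdump would be needed to confirm what the sync thread was actually doing.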
Crashdump is in /exports/crashdumps/t/ospsyn.dmp and modules are in /exports/crashdumps/192.168.10.210-2013-01-18-21:37:33/modules