Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
Lustre 2.5.3, Lustre 2.5.4
-
Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/84/
Distro/Arch: RHEL6.5/x86_64
FSTYPE=ldiskfs
-
3
-
15416
Description
While testing patch http://review.whamcloud.com/11539 based on Lustre b2_5 build #84, unmounting the MGS in sanity-lfsck test 0 hung:
20:00:58:Lustre: DEBUG MARKER: umount -d -f /mnt/mds1
20:00:58:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.1.4.57@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
20:00:58:LustreError: Skipped 7 previous similar messages
20:00:58:LustreError: 166-1: MGC10.1.4.66@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
20:00:58:Lustre: MGS is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?
20:00:58:Lustre: MGS is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck?
20:00:58:Lustre: 20326:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1408672403/real 1408672403] req@ffff88007b36dc00 x1477087144042844/t0(0) o250->MGC10.1.4.66@tcp@0@lo:26/25 lens 400/544 e 0 to 1 dl 1408672419 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
20:00:58:Lustre: 20326:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
20:00:58:Lustre: MGS is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 5. Is it stuck?
20:00:58:Lustre: MGS is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 5. Is it stuck?
20:00:58:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.1.4.57@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
20:00:58:LustreError: Skipped 213 previous similar messages
20:00:58:INFO: task umount:16206 blocked for more than 120 seconds.
20:00:58: Not tainted 2.6.32-431.23.3.el6_lustre.g6035153.x86_64 #1
20:00:58:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
20:00:58:umount D 0000000000000001 0 16206 16205 0x00000080
20:00:58: ffff880059079aa8 0000000000000082 0000000000000000 ffff88007b874400
20:00:58: ffffffffa0c34294 0000000000000000 ffff88006c2120c4 ffffffffa0c34294
20:00:58: ffff880060233af8 ffff880059079fd8 000000000000fbc8 ffff880060233af8
20:00:58:Call Trace:
20:00:58: [<ffffffff81529e92>] schedule_timeout+0x192/0x2e0
20:00:58: [<ffffffff81083f30>] ? process_timeout+0x0/0x10
20:00:58: [<ffffffffa0bb5e9b>] obd_exports_barrier+0xab/0x180 [obdclass]
20:00:58: [<ffffffffa16e152e>] mgs_device_fini+0xfe/0x580 [mgs]
20:00:58: [<ffffffffa0be19f3>] class_cleanup+0x573/0xd30 [obdclass]
20:00:58: [<ffffffffa0bb8036>] ? class_name2dev+0x56/0xe0 [obdclass]
20:00:58: [<ffffffffa0be371a>] class_process_config+0x156a/0x1ad0 [obdclass]
20:00:58: [<ffffffffa0bdc873>] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
20:00:58: [<ffffffffa0be3df9>] class_manual_cleanup+0x179/0x6f0 [obdclass]
20:00:58: [<ffffffffa0bb8036>] ? class_name2dev+0x56/0xe0 [obdclass]
20:00:58: [<ffffffffa0c1f2dd>] server_put_super+0x45d/0xf60 [obdclass]
20:00:58: [<ffffffff8118b23b>] generic_shutdown_super+0x5b/0xe0
20:00:58: [<ffffffff8118b326>] kill_anon_super+0x16/0x60
20:00:58: [<ffffffffa0be5ca6>] lustre_kill_super+0x36/0x60 [obdclass]
20:00:58: [<ffffffff8118bac7>] deactivate_super+0x57/0x80
20:00:58: [<ffffffff811ab4cf>] mntput_no_expire+0xbf/0x110
20:00:58: [<ffffffff811ac01b>] sys_umount+0x7b/0x3a0
20:00:58: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
20:00:58:Lustre: MGS is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 5. Is it stuck?
Maloo report: https://testing.hpdd.intel.com/test_sets/37948628-29b7-11e4-8657-5254006e85c2
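The console log above shows the MGS wait doubling (8, 16, 32, 64, 128 seconds) while the obd refcount stays at 5. As a rough aid for triaging similar hangs, the sketch below pulls those messages out of a saved console log. It is not part of Lustre or its test framework: the function names `barrier_waits` and `looks_stuck` and the 120-second threshold are assumptions made for illustration, and the regular expression is only based on the message wording seen in this ticket.

```python
import re

# Message wording copied from the console log in this ticket; other Lustre
# versions may word it differently (assumption).
BARRIER_RE = re.compile(
    r"(?P<obd>\S+) is waiting for obd_unlinked_exports more than "
    r"(?P<secs>\d+) seconds\. The obd refcount = (?P<ref>\d+)\."
)


def barrier_waits(log_text):
    """Return (obd_name, seconds_waited, refcount) for each barrier message."""
    return [
        (m.group("obd"), int(m.group("secs")), int(m.group("ref")))
        for m in BARRIER_RE.finditer(log_text)
    ]


def looks_stuck(waits, threshold_secs=120):
    """Hypothetical heuristic: a long wait during which the refcount never changed."""
    if not waits:
        return False
    longest = max(secs for _, secs, _ in waits)
    refcounts = {ref for _, _, ref in waits}
    return longest >= threshold_secs and len(refcounts) == 1
```

Calling `barrier_waits()` on the text of a console log and passing the result to `looks_stuck()` gives a quick yes/no on whether the unmount was still blocked past the threshold with an unchanged refcount, as it was here.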