Details
-
Bug
-
Resolution: Duplicate
-
Minor
-
None
-
Lustre 2.6.0
-
3
-
12906
Description
This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>
This issue relates to the following test suite run:
http://maloo.whamcloud.com/test_sets/95df0edc-a132-11e3-ab55-52540035b04c.
suite_log:
CMD: wtm-15vm4 umount -d -f /mnt/ost1 CMD: wtm-15vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: wtm-15vm4 ! zpool list -H lustre-ost1 >/dev/null 2>&1 || grep -q ^lustre-ost1/ /proc/mounts || zpool export lustre-ost1 CMD: wtm-15vm4 grep -c /mnt/ost2' ' /proc/mounts Stopping /mnt/ost2 (opts:-f) on wtm-15vm4 CMD: wtm-15vm4 umount -d -f /mnt/ost2
console log of OST1:
Lustre: DEBUG MARKER: == recovery-small test complete, duration 2715 sec == 20:15:02 (1393647302) Lustre: lustre-OST0000: deleting orphan objects from 0x0:5251 to 0x0:5377 Lustre: lustre-OST0001: deleting orphan objects from 0x0:5123 to 0x0:5153 Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 5 sec Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 5 sec Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 5 sec Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 5 sec Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts Lustre: DEBUG MARKER: umount -d -f /mnt/ost1 Lustre: server umount lustre-OST0000 complete Lustre: Skipped 2 previous similar messages Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' Lustre: DEBUG MARKER: ! zpool list -H lustre-ost1 >/dev/null 2>&1 || grep -q ^lustre-ost1/ /proc/mounts || zpool export lustre-ost1 Lustre: DEBUG MARKER: grep -c /mnt/ost2' ' /proc/mounts Lustre: DEBUG MARKER: umount -d -f /mnt/ost2 Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck? LustreError: 166-1: MGC10.10.16.161@tcp: Connection to MGS (at 10.10.16.161@tcp) was lost; in progress operations using this service will fail LustreError: Skipped 2 previous similar messages Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck? Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 5. Is it stuck? Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 5. Is it stuck? Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 5. Is it stuck? INFO: task umount:26349 blocked for more than 120 seconds. Tainted: P --------------- 2.6.32-431.5.1.el6_lustre.g1131719.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. umount D 0000000000000001 0 26349 26348 0x00000080 ffff88001bebdaa8 0000000000000086 ffff88001bebda08 ffff8800118be000 ffffffffa13a5a09 0000000000000000 ffff880010c85084 ffffffffa13a5a09 ffff88002b5545f8 ffff88001bebdfd8 000000000000fbc8 ffff88002b5545f8 Call Trace: [<ffffffff81528eb2>] schedule_timeout+0x192/0x2e0 [<ffffffff81084220>] ? process_timeout+0x0/0x10 [<ffffffffa132a29b>] obd_exports_barrier+0xab/0x180 [obdclass] [<ffffffffa187f98b>] ofd_device_fini+0x6b/0x250 [ofd] [<ffffffffa1353a03>] class_cleanup+0x573/0xd30 [obdclass] [<ffffffffa132c436>] ? class_name2dev+0x56/0xe0 [obdclass] [<ffffffffa135572a>] class_process_config+0x156a/0x1ad0 [obdclass] [<ffffffffa07bf4e8>] ? libcfs_log_return+0x28/0x40 [libcfs] [<ffffffffa134da42>] ? lustre_cfg_new+0x312/0x6e0 [obdclass] [<ffffffffa1355e09>] class_manual_cleanup+0x179/0x6f0 [obdclass] [<ffffffffa132c436>] ? class_name2dev+0x56/0xe0 [obdclass] [<ffffffffa138f1f9>] server_put_super+0x8e9/0xe40 [obdclass] [<ffffffff8118b23b>] generic_shutdown_super+0x5b/0xe0 [<ffffffff8118b326>] kill_anon_super+0x16/0x60 [<ffffffffa1357cc6>] lustre_kill_super+0x36/0x60 [obdclass] [<ffffffff8118bac7>] deactivate_super+0x57/0x80 [<ffffffff811aaaff>] mntput_no_expire+0xbf/0x110 [<ffffffff811ab64b>] sys_umount+0x7b/0x3a0 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Lustre: 12217:0:(client.c:1912:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1393647833/real 1393647833] req@ffff880038f04800 x1461340392671056/t0(0) o250->MGC10.10.16.161@tcp@10.10.16.161@tcp:26/25 lens 400/544 e 0 to 1 dl 1393647858 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Lustre: 12217:0:(client.c:1912:ptlrpc_expire_one_request()) Skipped 20 previous similar messages Lustre: lustre-OST0001 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 5. Is it stuck?