[LU-3632] insanity 0 hung when unmounting an OST Created: 25/Jul/13  Updated: 05/Aug/13  Resolved: 05/Aug/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: zfs

Issue Links:
Duplicate
duplicates LU-3230 conf-sanity fails to start run: umoun... Resolved
Severity: 3
Rank (Obsolete): 9354

Description

This issue was created by maloo for Li Wei <liwei@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/ceba5fc4-f46f-11e2-b8a2-52540035b04c.

The sub-test test_0 failed with the following error:

test failed to respond and timed out

Info required for matching: insanity 0



Comments
Comment by Li Wei (Inactive) [ 25/Jul/13 ]

From the OSS console:

23:42:46:Lustre: DEBUG MARKER: umount -d /mnt/ost3
23:42:46:Lustre: Failing over lustre-OST0002
23:42:46:Lustre: Skipped 2 previous similar messages
23:42:46:Lustre: lustre-OST0002: Not available for connect from 10.10.16.107@tcp (stopping)
23:42:46:Lustre: Skipped 2 previous similar messages
23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?
23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck?
23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 5. Is it stuck?
23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 5. Is it stuck?
23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 5. Is it stuck?
23:42:46:INFO: task umount:6586 blocked for more than 120 seconds.
23:42:46:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
23:42:46:umount D 0000000000000000 0 6586 6585 0x00000080
23:42:46: ffff880047e09aa8 0000000000000082 ffffffff00000010 ffff880047e09a58
23:42:46: ffff880047e09a18 ffff88006cd5ec00 ffffffffa078e717 0000000000000000
23:42:46: ffff88007c601098 ffff880047e09fd8 000000000000fb88 ffff88007c601098
23:42:46:Call Trace:
23:42:46: [<ffffffff8150ee42>] schedule_timeout+0x192/0x2e0
23:42:46: [<ffffffff810810e0>] ? process_timeout+0x0/0x10
23:42:46: [<ffffffffa05d662d>] cfs_schedule_timeout_and_set_state+0x1d/0x20 [libcfs]
23:42:46: [<ffffffffa070f548>] obd_exports_barrier+0x98/0x170 [obdclass]
23:42:46: [<ffffffffa0e42962>] ofd_device_fini+0x42/0x230 [ofd]
23:42:46: [<ffffffffa073ae67>] class_cleanup+0x577/0xda0 [obdclass]
23:42:46: [<ffffffffa07116f6>] ? class_name2dev+0x56/0xe0 [obdclass]
23:42:46: [<ffffffffa073c74c>] class_process_config+0x10bc/0x1c80 [obdclass]
23:42:46: [<ffffffffa0736133>] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
23:42:46: [<ffffffffa073d489>] class_manual_cleanup+0x179/0x6f0 [obdclass]
23:42:46: [<ffffffffa07116f6>] ? class_name2dev+0x56/0xe0 [obdclass]
23:42:46: [<ffffffffa077893c>] server_put_super+0x5ec/0xf60 [obdclass]
23:50:45: [<ffffffff811833ab>] generic_shutdown_super+0x5b/0xe0
23:50:45: [<ffffffff81183496>] kill_anon_super+0x16/0x60
23:50:45: [<ffffffffa073f336>] lustre_kill_super+0x36/0x60 [obdclass]
23:50:45: [<ffffffff81183c37>] deactivate_super+0x57/0x80
23:50:45: [<ffffffff811a1c8f>] mntput_no_expire+0xbf/0x110
23:50:45: [<ffffffff811a26fb>] sys_umount+0x7b/0x3a0
23:50:45: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
23:50:45:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 5. Is it stuck?
23:50:45:Lustre: lustre-OST0002: Not available for connect from 10.10.16.107@tcp (stopping)
23:50:45:Lustre: Skipped 308 previous similar messages
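
When triaging a console log like the one above, it can help to pull out the repeated obd_unlinked_exports warnings and check whether the reported refcount ever changes. The following is a minimal sketch for that, not part of Lustre or Maloo tooling; the names BARRIER_RE, barrier_waits, and refcount_stuck are hypothetical, and the pattern assumes the message wording is exactly as it appears in this log.

```python
import re

# Matches the warning printed while umount waits for exports to go away.
# Assumption: the message format is exactly as seen in the log above.
BARRIER_RE = re.compile(
    r"(?P<target>\S+) is waiting for obd_unlinked_exports "
    r"more than (?P<secs>\d+) seconds\. The obd refcount = (?P<refs>\d+)\."
)


def barrier_waits(lines):
    """Return a (target, seconds, refcount) tuple for each barrier warning."""
    waits = []
    for line in lines:
        match = BARRIER_RE.search(line)
        if match:
            waits.append(
                (match.group("target"),
                 int(match.group("secs")),
                 int(match.group("refs")))
            )
    return waits


def refcount_stuck(waits):
    """True if the warning repeats and the refcount is the same every time."""
    refcounts = {refs for _, _, refs in waits}
    return len(waits) > 1 and len(refcounts) == 1


# Three of the warnings copied from the OSS console log above.
sample = [
    "23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?",
    "23:42:46:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck?",
    "23:50:45:Lustre: lustre-OST0002 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 5. Is it stuck?",
]
waits = barrier_waits(sample)
```

For this log the waits double from 8 to 256 seconds while the refcount stays at 5, which is consistent with umount being blocked in obd_exports_barrier as the call trace shows.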

Comment by Nathaniel Clark [ 05/Aug/13 ]

I believe this is a duplicate of LU-3230.
