[LU-4062] sanity test_132: MGS is waiting for obd_unlinked_exports more than 512 seconds Created: 04/Oct/13  Updated: 09/Jan/15  Resolved: 22/Apr/14

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-5539 MGS is waiting for obd_unlinked_expor... Resolved
is duplicated by LU-4772 MGS is waiting for obd_unlinked_exports Resolved
is duplicated by LU-3665 obdfilter-survey test_3a: unmount stu... Resolved
is duplicated by LU-4449 Test failure timeout on sanity-scrub ... Resolved
Related
is related to LU-3230 conf-sanity fails to start run: umoun... Resolved
Severity: 3
Rank (Obsolete): 10886

 Description   

This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/563797fe-2ca7-11e3-b068-52540035b04c.

I suspect this may be a duplicate of LU-4019, but I'm not sure. In any case, the failure happens before the test run reaches the only change under test, a new subtest in sanity.sh, so this failure is almost certainly unrelated to that change.

The sub-test test_132 failed with the following error:

test failed to respond and timed out

Info required for matching: sanity 132



 Comments   
Comment by Andreas Dilger [ 04/Oct/13 ]
21:41:16:INFO: task umount:21382 blocked for more than 120 seconds.
21:41:17:umount        D 0000000000000000     0 21382  21381 0x00000080
21:41:18:Call Trace:
21:41:18: [<ffffffff8150f362>] schedule_timeout+0x192/0x2e0
21:41:18: [<ffffffffa05a9efb>] obd_exports_barrier+0xab/0x180 [obdclass]
21:41:18: [<ffffffffa0d1952e>] mgs_device_fini+0xfe/0x580 [mgs]
21:41:18: [<ffffffffa05d54d3>] class_cleanup+0x573/0xd30 [obdclass]
21:41:19: [<ffffffffa05d71fa>] class_process_config+0x156a/0x1ad0 [obdclass]
21:41:19: [<ffffffffa05d78d9>] class_manual_cleanup+0x179/0x6f0 [obdclass]
21:41:20: [<ffffffffa0612ded>] server_put_super+0x45d/0xf60 [obdclass]
21:41:21: [<ffffffff8118363b>] generic_shutdown_super+0x5b/0xe0
21:41:21: [<ffffffff81183726>] kill_anon_super+0x16/0x60
21:41:21: [<ffffffffa05d9786>] lustre_kill_super+0x36/0x60 [obdclass]
21:41:21: [<ffffffff81183ec7>] deactivate_super+0x57/0x80
21:41:21: [<ffffffff811a21bf>] mntput_no_expire+0xbf/0x110
21:41:22: [<ffffffff811a2c2b>] sys_umount+0x7b/0x3a0
Comment by Oleg Drokin [ 06/Oct/13 ]

I happened to hit this too today.
I made a crashdump just in case, though any useful Lustre logs are probably long lost because I only noticed the condition many hours after it happened.

Crashdump: /export/crashdumps/lu4062/ (source tag master-20131005)

Comment by Nathaniel Clark [ 13/Jan/14 ]

This also seems to happen on conf-sanity test_57a (review-dne)
https://maloo.whamcloud.com/test_sets/3f1a10ba-7b3a-11e3-a11f-52540035b04c

Comment by Sarah Liu [ 11/Feb/14 ]

Also hit this on a SLES11 SP3 client:

https://maloo.whamcloud.com/test_sets/cc2573e4-90cc-11e3-91ee-52540035b04c
