[LU-5656] umount hang after racer Created: 24/Sep/14  Updated: 04/Feb/16  Resolved: 01/Oct/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 15854

 Description   

Lately I have been seeing occasional umount hangs after running racer.
The umount is stuck cleaning up an inode in cl_inode_fini()->cl_object_put_last(), which contains an infinite while(1) loop:

[46320.673601] umount        D 0000000000000001  2528  6615   6608 0x00000000
[46320.673601]  ffff880043e8bc88 0000000000000086 0000000000000000 ffffffff0000000e
[46320.673601]  0000000000000000 ffff88000000000e 0000000000000000 ffff880043e95fb8
[46320.673601]  ffff88008adeaa78 ffff880043e8bfd8 000000000000fbe8 ffff88008adeaa78
[46320.673601] Call Trace:
[46320.673601]  [<ffffffffa0c55e01>] cl_inode_fini+0x1f1/0x270 [lustre]
[46320.673601]  [<ffffffff8105de00>] ? default_wake_function+0x0/0x20
[46320.673601]  [<ffffffffa0c18ad7>] ll_clear_inode+0x247/0x970 [lustre]
[46320.673601]  [<ffffffff811a648a>] clear_inode+0xca/0x160
[46320.673601]  [<ffffffff811a6558>] dispose_list+0x38/0x120
[46320.673601]  [<ffffffff811a6a0e>] invalidate_inodes+0xee/0x180
[46320.673601]  [<ffffffff8118bafc>] generic_shutdown_super+0x4c/0xe0
[46320.673601]  [<ffffffff8118bbf6>] kill_anon_super+0x16/0x60
[46320.673601]  [<ffffffffa0e0684a>] lustre_kill_super+0x4a/0x60 [obdclass]
[46320.673601]  [<ffffffff8118c397>] deactivate_super+0x57/0x80
[46320.673601]  [<ffffffff811ab40f>] mntput_no_expire+0xbf/0x110
[46320.673601]  [<ffffffff811abf7b>] sys_umount+0x7b/0x3a0
[46320.673601]  [<ffffffff8108820d>] ? sigprocmask+0x8d/0x110
[46320.673601]  [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
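
To illustrate why a loop of this kind can hang, here is a minimal user-space sketch in C. It is not the Lustre code: struct obj_header, obj_put() and wait_for_last_ref() are hypothetical names, and the max_polls bound is an assumption added only so that the example terminates. The idea is that the final put waits until the caller holds the only remaining reference, so if another holder never drops its reference, an unbounded version of the loop never exits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object header carrying a reference count. */
struct obj_header {
	atomic_int ref;
};

/* Drop one reference, as any other holder of the object would. */
static void obj_put(struct obj_header *h)
{
	int prev = atomic_fetch_sub(&h->ref, 1);

	assert(prev > 0); /* a put without a matching get is a bug */
}

/*
 * Wait until the caller holds the last reference (ref == 1).
 *
 * Assumption: max_polls bounds the loop so this sketch always returns.
 * An unbounded while (1) variant would spin or sleep forever if some
 * other reference were leaked and never dropped.
 *
 * Returns true once ref == 1 is observed, false if the bound is hit.
 */
static bool wait_for_last_ref(struct obj_header *h, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&h->ref) == 1)
			return true;
		/* Kernel code would typically sleep here until a put wakes it. */
	}
	return false;
}
```

With ref initialised to 2 and no matching obj_put(), wait_for_last_ref() gives up and returns false; after one obj_put() it returns true. In the unbounded form, the first case corresponds to a task that never leaves the loop.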


 Comments   
Comment by Oleg Drokin [ 01/Oct/14 ]

Fixed by commit 5c4f68be5772b200ee2b728fd121c62ea099d684, which reverts the patch from http://review.whamcloud.com/11716

Comment by Bruno Faccini (Inactive) [ 15/Dec/14 ]

Just got the same hang on client umount during auto-tests, while running replay-single/test_0c for Gerrit change #12456, patch set #7. I have checked, and this occurred even with #11716 (commit 150246e73c925d628ce9cbbd8184c0b0eefc9a16) already reverted by commit 5c4f68be5772b200ee2b728fd121c62ea099d684 ...
Auto-test failure can be found at https://testing.hpdd.intel.com/test_sets/04d38dcc-8418-11e4-8915-5254006e85c2.
