[LU-5039] MDS mount hangs on orphan recovery - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.6.0, Lustre 2.5.3
Affects Version/s: None
Labels:
- mn4

Severity:
3
Rank (Obsolete):
13931

Description

Running Lustre 2.4.0-28chaos (see github.com/chaos/lustre), we find that sometimes after a reboot the MDS can get stuck during mount cleaning up the orphan files in the PENDING directory. Some times we have 100,000+ files to process, and this can take literally hours. The symptoms are pretty similar to ~~LU-5038~~, but I believe that the cause is different.

Here is a backtrace of the offending thread:

2014-03-06 22:34:12 Process tgt_recov (pid: 15478, threadinfo ffff8807bc436000, task ffff88081a6e2080)
2014-03-06 22:34:12 Stack:
2014-03-06 22:34:12  ffff88072e3df000 0000000000000000 0000000000003f14 ffff88072e3df060
2014-03-06 22:34:12 <d> ffff8807bc437a40 ffffffffa0341396 ffff8807bc437a20 ffff88072e3df038
2014-03-06 22:34:12 <d> 0000000000000014 ffff8807f9fbf530 0000000000000000 0000000000003f14
2014-03-06 22:34:12 Call Trace:
2014-03-06 22:34:12  [<ffffffffa0341396>] __dbuf_hold_impl+0x66/0x480 [zfs]
2014-03-06 22:34:12  [<ffffffffa034182f>] dbuf_hold_impl+0x7f/0xb0 [zfs]
2014-03-06 22:34:12  [<ffffffffa03428e0>] dbuf_hold+0x20/0x30 [zfs]
2014-03-06 22:34:12  [<ffffffffa03486e7>] dmu_buf_hold+0x97/0x1d0 [zfs]
2014-03-06 22:34:12  [<ffffffffa03369a0>] ? remove_reference+0xa0/0xc0 [zfs]
2014-03-06 22:34:12  [<ffffffffa039e76b>] zap_idx_to_blk+0xab/0x140 [zfs]
2014-03-06 22:34:12  [<ffffffffa039ff61>] zap_deref_leaf+0x51/0x80 [zfs]
2014-03-06 22:34:12  [<ffffffffa039f956>] ? zap_put_leaf+0x86/0xe0 [zfs]
2014-03-06 22:34:12  [<ffffffffa03a03dc>] fzap_cursor_retrieve+0xfc/0x2a0 [zfs]
2014-03-06 22:34:12  [<ffffffffa03a593b>] zap_cursor_retrieve+0x17b/0x2f0 [zfs]
2014-03-06 22:34:12  [<ffffffffa0d1739c>] ? udmu_zap_cursor_init_serialized+0x2c/0x30 [osd_zfs]
2014-03-06 22:34:12  [<ffffffffa0d29058>] osd_index_retrieve_skip_dots+0x28/0x60 [osd_zfs]
2014-03-06 22:34:12  [<ffffffffa0d29638>] osd_dir_it_next+0x98/0x120 [osd_zfs]
2014-03-06 22:34:12  [<ffffffffa0f08161>] lod_it_next+0x21/0x90 [lod]
2014-03-06 22:34:12  [<ffffffffa0dd1989>] __mdd_orphan_cleanup+0xa9/0xca0 [mdd]
2014-03-06 22:34:12  [<ffffffffa0de134d>] mdd_recovery_complete+0xed/0x170 [mdd]
2014-03-06 22:34:12  [<ffffffffa0e34cb5>] mdt_postrecov+0x35/0xd0 [mdt]
2014-03-06 22:34:12  [<ffffffffa0e36178>] mdt_obd_postrecov+0x78/0x90 [mdt]
2014-03-06 22:34:12  [<ffffffffa08745c0>] ? ldlm_reprocess_res+0x0/0x20 [ptlrpc]
2014-03-06 22:34:12  [<ffffffffa086f8ae>] ? ldlm_reprocess_all_ns+0x3e/0x110 [ptlrpc]
2014-03-06 22:34:12  [<ffffffffa0885004>] target_recovery_thread+0xc64/0x1980 [ptlrpc]
2014-03-06 22:34:12  [<ffffffffa08843a0>] ? target_recovery_thread+0x0/0x1980 [ptlrpc]
2014-03-06 22:34:12  [<ffffffff8100c10a>] child_rip+0xa/0x20
2014-03-06 22:34:12  [<ffffffffa08843a0>] ? target_recovery_thread+0x0/0x1980 [ptlrpc]
2014-03-06 22:34:12  [<ffffffffa08843a0>] ? target_recovery_thread+0x0/0x1980 [ptlrpc]
2014-03-06 22:34:12  [<ffffffff8100c100>] ? child_rip+0x0/0x20

The mount process is blocked while this is going on. The cleanup is completely sequential and on ZFS very slow, on the order of 10 per second.

The orphan cleanup task really needs to be backgrounded (and perhaps parallelized) rather than blocking the MDT mount processes.

Attachments

Activity

People

Assignee:: Niu Yawei (Inactive)

Reporter:: Christopher Morrone (Inactive)

Votes:: 0 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 09/May/14 9:11 PM

Updated:: 12/Aug/14 7:43 PM

Resolved:: 23/Jun/14 9:42 PM