[LU-17307] osd_dirent_count() keeps multiple threads busy Created: 21/Nov/23  Updated: 15/Dec/23  Resolved: 15/Dec/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.15.0
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Major
Reporter: Andreas Dilger Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17148 enhance directory migration robustness Open
is related to LU-11025 DNE3: directory restripe Resolved
Severity: 3

 Description   

When osd_attr_get() is accessing a directory for the first time, it calls osd_dirent_count() to iterate over the directory and fill in obj->oo_dirent_count:

 iterate_dir+0x70/0x140
 osd_ldiskfs_it_fill+0xbd/0x290 [osd_ldiskfs]
 osd_it_ea_next+0xc2/0x150 [osd_ldiskfs]
 osd_attr_get+0x4bc/0x730 [osd_ldiskfs]
 lod_attr_get+0x8b/0x170 [lod]
 mdd_la_get+0x70/0x200 [mdd]
 mdd_attr_get+0x38/0x100 [mdd]
 mdt_attr_get_complex+0x4dd/0x800 [mdt]
 mdt_getattr_internal+0x445/0x1590 [mdt]
 mdt_getattr_name_lock+0x74d/0x2640 [mdt]
 mdt_intent_getattr+0x2a5/0x470 [mdt]
 mdt_intent_opc+0x1ba/0xb40 [mdt]
 mdt_intent_policy+0x1a9/0x370 [mdt]
 ldlm_lock_enqueue+0x3d4/0xb00 [ptlrpc]
 ldlm_handle_enqueue0+0x8b6/0x16d0 [ptlrpc]
 tgt_enqueue+0x62/0x220 [ptlrpc]
 tgt_request_handle+0x8bf/0x18c0 [ptlrpc]
 ptlrpc_server_handle_request+0x253/0xc40 [ptlrpc]
 ptlrpc_main+0xc88/0x26c0 [ptlrpc]
 kthread+0xd1/0xe0

However, there are several issues with this:

  • If the directory is very large (e.g. millions of entries), the iteration can take a considerable time and blocks the MDS service thread until it finishes.
  • If multiple MDS threads access the same directory, every thread tries to count the entries in the directory, so all of those threads are blocked.
  • The oo_dirent_count value is only needed for directory auto-split, which is not enabled today.

The entry counting should only be done if directory auto-split is enabled, avoiding the overhead in the normal case. Since mdt_enable_dir_auto_split is controlled at the MDT level, the need for the count should be passed down to the OSD via an LA_DIRENT_COUNT valid flag. For osd-ldiskfs, if this flag is not set, the count would be skipped. For osd-zfs, the count can optionally be returned anyway, since it is available for free.
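A minimal user-space sketch of the flag-gated count proposed above. This is not the osd-ldiskfs code: the DEMO_LA_DIRENT_COUNT bit, the demo_dir and demo_attr structures, and the demo_* functions are all hypothetical stand-ins, used only to show that the expensive scan runs only when the caller asks for the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real Lustre definitions. */
#define DEMO_LA_DIRENT_COUNT    (1ULL << 0)   /* caller wants the entry count */
#define DEMO_DIRENT_COUNT_UNSET UINT64_MAX

struct demo_dir {
	size_t   nr_entries;    /* number of entries actually in the directory */
	uint64_t dirent_count;  /* cached count, or DEMO_DIRENT_COUNT_UNSET */
	unsigned scans;         /* how many full iterations were performed */
};

struct demo_attr {
	uint64_t valid;         /* which fields below were filled in */
	uint64_t dirent_count;
};

/* Stand-in for the expensive walk over every directory entry. */
static uint64_t demo_dirent_count(struct demo_dir *dir)
{
	uint64_t n = 0;

	for (size_t i = 0; i < dir->nr_entries; i++)
		n++;
	dir->scans++;
	return n;
}

/* Fill in attributes; count entries only if the caller requested it. */
static void demo_attr_get(struct demo_dir *dir, uint64_t want,
			  struct demo_attr *attr)
{
	assert(dir != NULL && attr != NULL);

	attr->valid = 0;
	attr->dirent_count = DEMO_DIRENT_COUNT_UNSET;

	if (!(want & DEMO_LA_DIRENT_COUNT))
		return;         /* auto-split disabled: skip the scan entirely */

	if (dir->dirent_count == DEMO_DIRENT_COUNT_UNSET)
		dir->dirent_count = demo_dirent_count(dir);

	attr->dirent_count = dir->dirent_count;
	attr->valid |= DEMO_LA_DIRENT_COUNT;
}
```

In this sketch a caller that passes 0 for want never triggers a scan, and a caller that passes DEMO_LA_DIRENT_COUNT pays for the scan once, after which the cached value is reused.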

If directory auto-split is enabled, it would be much more efficient to have only one thread do the directory iteration. This should be controlled by a flag in the object, and the other threads can ignore oo_dirent_count (return "0" or "LU_DIRENT_COUNT_UNSET" or the current number of entries found). At worst this would defer the auto-split by a few entries, but that doesn't matter in the end, since the thread doing the counting will always perform the check itself.
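The single-counter idea could look roughly like the following user-space sketch, which uses C11 atomics in place of whatever locking the OSD object really uses. The demo_obj structure, its counting flag, and the demo_obj_* functions are hypothetical and not taken from the patch. A thread that loses the race returns the UNSET value at once instead of repeating the scan.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DIRENT_COUNT_UNSET UINT64_MAX   /* hypothetical sentinel */

/* Hypothetical object; NOT the real struct osd_object. */
struct demo_obj {
	size_t           nr_entries;    /* entries actually in the directory */
	_Atomic uint64_t dirent_count;  /* UNSET until one thread has counted */
	atomic_bool      counting;      /* true once a thread has claimed the scan */
	atomic_uint      scans;         /* number of full scans performed */
};

static void demo_obj_init(struct demo_obj *obj, size_t nr_entries)
{
	obj->nr_entries = nr_entries;
	atomic_init(&obj->dirent_count, DEMO_DIRENT_COUNT_UNSET);
	atomic_init(&obj->counting, false);
	atomic_init(&obj->scans, 0);
}

/*
 * Return the cached count if known. Otherwise let exactly one caller do
 * the scan; every other caller gets UNSET back without blocking.
 */
static uint64_t demo_obj_dirent_count(struct demo_obj *obj)
{
	uint64_t count = atomic_load(&obj->dirent_count);

	assert(obj != NULL);
	if (count != DEMO_DIRENT_COUNT_UNSET)
		return count;

	/* atomic_exchange returns the old value: true means already claimed. */
	if (atomic_exchange(&obj->counting, true))
		return DEMO_DIRENT_COUNT_UNSET;

	count = 0;
	for (size_t i = 0; i < obj->nr_entries; i++)
		count++;        /* stand-in for the directory iteration */

	atomic_fetch_add(&obj->scans, 1);
	atomic_store(&obj->dirent_count, count);
	return count;
}
```

The flag is deliberately left set after the scan in this sketch, because later callers are served from the cached count. A real implementation would also need to clear it if the scan failed, so that another thread could retry.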



 Comments   
Comment by Gerrit Updater [ 24/Nov/23 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53229
Subject: LU-17307 mdt: get dirent count by request
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 06aec8f6c410dcb053bc87d61ce334c23e58e1fd

Comment by Xing Huang [ 07/Dec/23 ]

2023-12-07: The fix patch is ready to land on master (temporarily not on the master-next branch).

Comment by Gerrit Updater [ 12/Dec/23 ]

"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/53229/
Subject: LU-17307 mdt: get dirent count by request
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d0babe6bd8b38cb86875c5e3d92aee4c69986ce9

Comment by Xing Huang [ 13/Dec/23 ]

2023-12-13: The fix patch landed on the master branch.

Generated at Sat Feb 10 03:34:21 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.