[LU-11546] enable large_dir support for MDTs Created: 18/Oct/18  Updated: 25/May/22  Resolved: 12/Nov/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0, Lustre 2.13.0
Fix Version/s: Lustre 2.14.0, Lustre 2.12.8

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: Dongyang Li
Resolution: Fixed Votes: 0
Labels: LTS12

Issue Links:
Related
is related to LU-11915 conf-sanity test 115 is skipped or hangs Open
is related to LU-1365 Implement ldiskfs LARGEDIR support fo... Resolved
is related to LU-11440 Make e2fsprogs-1.44.3-wc1 release Resolved
is related to LU-11912 reduce number of OST objects created ... Resolved
is related to LU-6824 improve error messages for dir htree ... Resolved
is related to LU-12892 Large directory feature is not enable... Resolved
is related to LU-14345 e2fsck of very large directories is b... Resolved
is related to LU-14734 enable large_dir on existing MDTs Resolved
is related to LU-12406 conf-sanity test 111 fails with ‘add ... Resolved
is related to LU-10329 DNE3: REMOTE_PARENT_DIR scalability Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Now that e2fsprogs-1.44.3 supports large_dir, testing and enabling large_dir on MDTs would allow a single directory to exceed the ~10M-entry limit currently imposed by the 2-level htree. This should be done automatically for new MDTs. The existing error message in ext4_dx_add_entry() should also be updated to reference the large_dir feature by name instead of just "Large directory", and to explain that the feature can be enabled with tune2fs.



 Comments   
Comment by Andreas Dilger [ 18/Oct/18 ]

Note that I don't think that large_dir should be used for OSTs. For very large OSTs that exceed the 10M-entry limit for the O/0/d* directories, I think it makes more sense to have the MDTs create fewer than LUSTRE_DATA_SEQ_MAX_WIDTH (= 4B) objects per OST sequence, and have the OSTs create new object directories O/<seq>/d* for each sequence (which they already do for DNE when multiple MDTs are creating objects on the OST). This will allow the older object directory blocks to drop out of RAM as they become less used, and eventually those directories could be removed when they become empty.

Having a single huge directory for objects means that the directory leaf blocks are updated totally randomly, and must always fit into RAM, or cause high read/write IOPS to the OST storage when there are lots of objects on a single OST. That is very undesirable, since it will typically be HDD-based OSTs that are so large they need more than 320M objects in a single filesystem (10M entries/directory * 32 directories).
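
The 320M figure above is simple arithmetic. It can be sketched as follows, taking the ~10M entries per directory and the 32 object directories from the comment as given. The sequences_needed() helper is hypothetical and only illustrates how spreading objects over several O/<seq>/ trees would scale; it is not Lustre code.

```python
# Figures quoted in the comment above; both are approximate.
ENTRIES_PER_DIR = 10_000_000   # ~10M entries per directory with a 2-level htree
DIRS_PER_SEQUENCE = 32         # O/<seq>/d* object directories


def objects_per_sequence(entries_per_dir=ENTRIES_PER_DIR, dirs=DIRS_PER_SEQUENCE):
    """Approximate number of objects one O/<seq>/ tree holds without large_dir."""
    return entries_per_dir * dirs


def sequences_needed(total_objects, per_sequence=None):
    """Hypothetical helper: sequences required if each is capped at per_sequence."""
    if per_sequence is None:
        per_sequence = objects_per_sequence()
    return -(-total_objects // per_sequence)  # ceiling division


print(objects_per_sequence())            # 320000000
print(sequences_needed(1_000_000_000))   # 4
```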

Comment by Andreas Dilger [ 27/Jun/19 ]

The "optimizing for more than 320M objects per OST per MDT" issue is being tracked in LU-11912. This ticket is only for testing and enabling large_dir by default on filesystems.

Comment by Gerrit Updater [ 28/Jun/19 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35358
Subject: LU-11546 tests: enable large_dir support for tests
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2d2209f33d9092e2de0a57f55c46950f5fee0fd5

Comment by Gerrit Updater [ 07/Sep/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35358/
Subject: LU-11546 tests: enable large_dir support for tests
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5a02d431f4a0a80915afa19c065df29c61e26ec9

Comment by Andreas Dilger [ 22/Oct/19 ]

Dongyang, can you please make a patch to enable large_dir on MDTs when they are formatted by mkfs.lustre? This can go into master once 2.14 opens (in the next few weeks), and can then likely be backported to 2.12.4.

Comment by Gerrit Updater [ 23/Oct/19 ]

Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/36555
Subject: LU-11546 utils: enable large_dir for ldiskfs
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4c703152e418707a2ec2be607c07329d80b19e80

Comment by Gerrit Updater [ 12/Nov/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36555/
Subject: LU-11546 utils: enable large_dir for ldiskfs
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: cd1faa0124f21e12a5ecd83c709c13918264fc86

Comment by Peter Jones [ 12/Nov/19 ]

Landed for 2.14

Comment by Gerrit Updater [ 18/Nov/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36780
Subject: LU-11546 tests: enable large_dir support for tests
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 8059fd20860c57eaa224b9d30219000e3127d586

Comment by Gerrit Updater [ 18/Nov/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36781
Subject: LU-11546 utils: enable large_dir for ldiskfs
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 823fc815fe3c5e3a7e2dd8ae8f8aa4d94301c35f

Comment by Stephane Thiell [ 04/Jun/20 ]

It would be nice to have this patch landed into 2.12 at some point. We just used 2.12.5 RC1 to format an MDT and large_dir was not set. With DNE, and especially if we use lfs migrate -m, large_dir quickly becomes mandatory on MDTs.

Comment by Andreas Dilger [ 04/Jun/20 ]

Stephane, it is easy enough to set after formatting - "tune2fs -O large_dir <dev>". The holdup with landing the patch is that the tests written for this feature don't pass. That is mostly a problem with the tests themselves (they don't pass on master either), so either the change should be rebased to not depend on the tests, or the tests should be fixed.
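
As a minimal sketch of the post-format step mentioned above, the helper below builds the quoted tune2fs command and checks whether large_dir appears on the "Filesystem features:" line that dumpe2fs -h prints. The function names, the /dev/sdX device path, and the sample header text are hypothetical and used only for illustration. Nothing here runs tune2fs; it only constructs the argument list.

```python
import shlex


def enable_large_dir_cmd(device):
    """Build the tune2fs invocation quoted in the comment above."""
    return ["tune2fs", "-O", "large_dir", device]


def has_feature(dumpe2fs_header, feature):
    """Return True if `feature` is listed on the 'Filesystem features:' line."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            return feature in line.split(":", 1)[1].split()
    return False


# Hypothetical, trimmed sample of `dumpe2fs -h <dev>` output.
SAMPLE = """Filesystem volume name:   testfs-MDT0000
Filesystem features:      has_journal ext_attr dir_index large_dir
"""

print(shlex.join(enable_large_dir_cmd("/dev/sdX")))
print(has_feature(SAMPLE, "large_dir"))
```

The argument list can be passed to subprocess.run() on a real device once the command has been reviewed.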

Comment by Stephane Thiell [ 04/Jun/20 ]

OK! We actually used mkfs.lustre -O large_dir,project and it worked fine. It's also easy to set using tune2fs like you said so not super critical to have it by default indeed. Thanks for the heads-up regarding the tests. Note that we have been using large_dir for months now on Fir's MDTs (2.12.x).

Comment by Gerrit Updater [ 18/Oct/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/36781/
Subject: LU-11546 utils: enable large_dir for ldiskfs
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 638638e3481f6dd3f6830e8e272362d8922c52f7
