Type: Improvement
Resolution: Done
Priority: Major
Fix Version/s: Lustre 2.18.0
Affects Version/s: Upstream, Lustre 2.16.1
Labels:
- ZFS
- server

Severity: 3
Rank (Obsolete): 9223372036854775807

Lustre directories on ZFS always use the FatZAP format because each directory entry (luz_direntry) stores both the ZFS dnode ID and a Lustre FID, which exceeds the single uint64 value that a MicroZAP entry can hold. Lustre hardcoded the FatZAP leaf block shift to 14 (16K blocks), so even a single empty directory consumes a lot of space.
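A rough sketch of why the entry does not fit in MicroZAP. The layouts below are assumptions for illustration, not copied from the Lustre or ZFS headers: one 64-bit word for the dnode ID and a 128-bit FID (64-bit sequence, 32-bit object ID, 32-bit version).

```python
import struct

# Assumed layouts, for illustration only (not copied from the real headers).
ZPL_DIRENTRY_FMT = "<Q"    # one 64-bit word carrying the ZFS dnode ID
LU_FID_FMT = "<QII"        # 64-bit sequence, 32-bit object ID, 32-bit version

MICROZAP_VALUE_BYTES = 8   # MicroZAP holds a single uint64 value per entry


def luz_direntry_bytes():
    """Size in bytes of the assumed dnode-word-plus-FID directory entry."""
    return struct.calcsize(ZPL_DIRENTRY_FMT) + struct.calcsize(LU_FID_FMT)


def fits_in_microzap(value_bytes):
    """True if a value of this size fits in one MicroZAP uint64 slot."""
    return value_bytes <= MICROZAP_VALUE_BYTES


size = luz_direntry_bytes()
print(f"entry value: {size} bytes, fits in MicroZAP: {fits_in_microzap(size)}")
```

Under these assumptions the value is three uint64 words rather than one, which is why every Lustre directory falls back to FatZAP.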

On dRAID or RAIDZ pools, this results in a dsize of 90-110K per empty directory because of stripe alignment and parity overhead. For a typical MDT with millions of directories, this wastes a significant amount of pool space.

I did a minimal test by introducing osd_fzap_blockshift as a module parameter, replacing the hardcoded 14. The parameter is exposed via /sys/module/osd_zfs/parameters/osd_fzap_blockshift.
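A minimal sketch for checking the value on a server, assuming the parameter file is readable and holds a plain decimal integer. The helper names are hypothetical. Only the sysfs path comes from this ticket.

```python
from pathlib import Path

# Path taken from the ticket; present only when osd_zfs is loaded with the patch.
PARAM_PATH = "/sys/module/osd_zfs/parameters/osd_fzap_blockshift"


def read_fzap_blockshift(path=PARAM_PATH):
    """Return the current shift as an int, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


def leaf_block_bytes(shift):
    """Convert a block shift into a block size in bytes (2**shift)."""
    return 1 << shift


shift = read_fzap_blockshift()
if shift is None:
    print("osd_fzap_blockshift not found (is osd_zfs loaded?)")
else:
    print(f"osd_fzap_blockshift={shift} ({leaf_block_bytes(shift)} bytes)")
```

The path is an argument so that the helper can be pointed at a test file on a machine without the module.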

Before:

# mkdir testdir1 && touch testfile1
# du --si test*
100k    testdir1
1.1k    testfile1

After:

# mkdir testdir1 && touch testfile1
# du --si test*
67k     testdir1
1.1k    testfile1

Testing on the dRAID2:9d:12c:1s pool with different FatZAP leaf block sizes (leaf_blockshift values) showed the following dsize per empty directory. The current default is blockshift=14 (16K), at dsize=~100K:
blockshift=12 (4K): dsize=~67K
blockshift=13 (8K): dsize=~67K
blockshift=14 (16K): dsize=~100K
blockshift=15 (32K): dsize=~100K
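To put the measurements in pool-level terms, here is a back-of-the-envelope sketch. It treats K as 1000 bytes, as in the du --si output above, and the directory count of 10 million is a hypothetical figure chosen only for illustration.

```python
# Approximate dsize per empty directory in bytes, from the measurements above.
MEASURED_DSIZE = {12: 67_000, 13: 67_000, 14: 100_000, 15: 100_000}
DEFAULT_SHIFT = 14  # the previously hardcoded value


def saving_per_dir(shift):
    """Bytes saved per empty directory relative to the default shift."""
    return MEASURED_DSIZE[DEFAULT_SHIFT] - MEASURED_DSIZE[shift]


def pool_saving_bytes(shift, num_dirs):
    """Total bytes saved across num_dirs empty directories."""
    return saving_per_dir(shift) * num_dirs


num_dirs = 10_000_000  # hypothetical MDT directory count
for shift in sorted(MEASURED_DSIZE):
    block_kib = (1 << shift) // 1024
    saved_gb = pool_saving_bytes(shift, num_dirs) / 1e9
    print(f"blockshift={shift} ({block_kib}K): saves ~{saved_gb:.0f} GB")
```

Real directories hold entries, so actual savings depend on how directory sizes are distributed. The estimate only scales the empty-directory numbers.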

I think blockshift values from 12 to 15 are the sensible range. ZFS already enforces its own limits on this value, so I'm not adding any explicit checks in Lustre.

Future Work:
The real fix would be to store luz_direntry (dnode + FID) entries in the existing dnode bonus buffer, allowing small directories to avoid FatZAP entirely. This TinyZAP implementation would require changes on both the ZFS and Lustre sides. Because no additional block would be allocated, it would reduce the dsize of an empty directory to a very low value.

Assignee: Akash B

Reporter: Akash B

Votes: 0

Watchers: 7

Created: 17/Mar/26 4:00 PM

Updated: 21/May/26 9:53 AM

Resolved: 21/May/26 9:53 AM
