[LU-16972] optimize e2fsck ea_refcount processing Created: 20/Jul/23  Updated: 09/Aug/23  Resolved: 09/Aug/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Dongyang Li
Resolution: Fixed Votes: 0
Labels: e2fsck
Environment:

Exascaler version: 5.2.5


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Filesystems with a large number of external xattr blocks, such as MDTs with large PFL layouts that overflow the in-inode xattr space, cause a significant slowdown during pass1 inode processing because of inefficient management of the refcount_extra list that stores the refcounts for shared external xattr blocks.

The refcount_extra structure is implemented as a linear array of ea_refcount_el elements:

struct ea_refcount_el { 
        /* ea_key could either be an inode number or block number. */
        ea_key_t        ea_key;
        ea_value_t      ea_value;
}; 

During element insertion, a binary search is used in the array to find the element, and if not found, the higher elements in the array are memmove()'d upward to make room for the new element in the list. If the currently allocated array is full, then it is realloc()'d with 100 more elements, and the whole array is copied to the new memory.

If there are a large number of shared xattr blocks then the array can grow very large and the processing of the linear array will be very inefficient and CPU intensive due to frequent large memory allocations and copies.



 Comments   
Comment by Gerrit Updater [ 21/Jul/23 ]

"Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51729
Subject: LU-16972 e2fsck: only add xattr block if refcount > 1
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: ca038661f5b332f4a0475eaae8bcec039b88b195

Comment by Andreas Dilger [ 21/Jul/23 ]

I was wondering whether this code could use the "icount" structure used elsewhere in e2fsck (eg. duplicate block handling). This is optimized for a count of 0/1 in a bitmap, and puts entries into an array only for count > 1. The drawback is that it has to allocate a full bitmap of entries for all blocks counted, even though the usage would be very sparse.

For an 10TiB MDT filesystem, the icount bitmap would be 2.5B bits, or 320MiB of RAM per icount. The current ea_refcount struct is 2x__u64=16 bytes per element, which means the break-even point would be 20M xattr blocks, which is quite high for most filesystems.

I think the better alternative is to expand on what is in the current patch, namely that the block bitmap is used to track xattr blocks with count = 1 and then only add the ea_refcount for xattr blocks with refcount > 1. The one complexity is whether this will catch xattr blocks that are duplicated with regular file blocks, but I think it would be handled correctly anyway.

Comment by Gerrit Updater [ 26/Jul/23 ]

"Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51763
Subject: LU-16972 e2fsck: fix merging ea_inode_refs
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: fa29ce6fce78f90c691f5413393175b47ca5f566

Comment by Gerrit Updater [ 01/Aug/23 ]

"Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51827
Subject: LU-16972 tests: createmany sets xattr
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b2157c6920b54e63c3aff6eae85873118482f8eb

Comment by Gerrit Updater [ 07/Aug/23 ]

"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51763/
Subject: LU-16972 e2fsck: fix merging ea_inode_refs
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set:
Commit: 35ecc72a445b04f2a257bc31cbeb310d76420b89

Comment by Gerrit Updater [ 07/Aug/23 ]

"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51729/
Subject: LU-16972 e2fsck: use rb-tree to track EA reference counts
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set:
Commit: 165abd0095613b60fbf64d139b899ba4896a2d92

Comment by Gerrit Updater [ 07/Aug/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51885
Subject: LU-16972 build: update version to 1.47.0-wc4
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: ac0905b216ba46ffbce4dffa94c1576d06d00096

Comment by Gerrit Updater [ 07/Aug/23 ]

"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51885/
Subject: LU-16972 build: update version to 1.47.0-wc4
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set:
Commit: cc6a3d0d898cc06dc87780305f6427d6dec5e120

Comment by Andreas Dilger [ 09/Aug/23 ]

Fix included in 1.47.0-wc4

Generated at Sat Feb 10 03:31:32 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.