[LU-16972] optimize e2fsck ea_refcount processing

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Environment: Exascaler version: 5.2.5

    Description

      Filesystems with a large number of external xattr blocks, such as MDTs with large PFL layouts that overflow the in-inode xattr space, see a significant slowdown during e2fsck pass1 inode processing. The cause is inefficient management of the refcount_extra list, which stores the refcounts for shared external xattr blocks.

      The refcount_extra structure is implemented as a linear array of ea_refcount_el elements:

      struct ea_refcount_el { 
              /* ea_key could either be an inode number or block number. */
              ea_key_t        ea_key;
              ea_value_t      ea_value;
      }; 
      

      During element insertion, a binary search of the array looks for the element. If it is not found, the higher elements in the array are memmove()'d upward to make room for the new element. If the currently allocated array is full, it is realloc()'d with room for 100 more elements, which can mean copying the whole array to new memory.

      If there are a large number of shared xattr blocks, the array can grow very large, and processing it becomes inefficient and CPU-intensive because of the frequent large memory allocations and copies: every insertion of a new key can move a large part of the array.
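
The insertion path described above can be illustrated with a minimal sketch. This is not the e2fsck source: `struct sketch_refcount`, `sketch_find()` and `sketch_get()` are hypothetical names, and the two typedefs are stand-ins that assume both fields are 64-bit, as the 16-byte element size mentioned later in this ticket suggests.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the e2fsck typedefs (assumed to be 64-bit). */
typedef uint64_t ea_key_t;
typedef uint64_t ea_value_t;

struct ea_refcount_el {
        ea_key_t   ea_key;      /* inode number or block number */
        ea_value_t ea_value;
};

/* Hypothetical container for the sorted array; not the real struct. */
struct sketch_refcount {
        size_t count;                   /* elements in use */
        size_t size;                    /* elements allocated */
        struct ea_refcount_el *list;    /* kept sorted by ea_key */
};

/* Binary search: index of key if present, else the slot where it belongs. */
static size_t sketch_find(const struct sketch_refcount *rc, ea_key_t key,
                          int *found)
{
        size_t low = 0, high = rc->count;

        *found = 0;
        while (low < high) {
                size_t mid = low + (high - low) / 2;

                if (rc->list[mid].ea_key == key) {
                        *found = 1;
                        return mid;
                }
                if (rc->list[mid].ea_key < key)
                        low = mid + 1;
                else
                        high = mid;
        }
        return low;
}

/* Return the element for key, inserting a zeroed one if it is missing.
 * Returns NULL if the array cannot be grown. */
static struct ea_refcount_el *sketch_get(struct sketch_refcount *rc,
                                         ea_key_t key)
{
        int found;
        size_t pos = sketch_find(rc, key, &found);

        if (found)
                return &rc->list[pos];

        if (rc->count == rc->size) {
                /* Grow by a fixed 100 elements; realloc() may copy it all. */
                size_t new_size = rc->size + 100;
                struct ea_refcount_el *tmp =
                        realloc(rc->list, new_size * sizeof(*tmp));

                if (!tmp)
                        return NULL;
                rc->list = tmp;
                rc->size = new_size;
        }

        assert(pos <= rc->count);
        /* Shift every higher element up by one slot: O(count) per insert. */
        memmove(&rc->list[pos + 1], &rc->list[pos],
                (rc->count - pos) * sizeof(rc->list[0]));
        rc->count++;
        rc->list[pos].ea_key = key;
        rc->list[pos].ea_value = 0;
        return &rc->list[pos];
}
```

In this sketch a lookup stays O(log n), but each insertion of a new key costs O(n) in memmove(), and growing by a fixed 100 elements means a reallocation every 100 insertions, so building an array of n entries is roughly quadratic in the worst case.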

        Activity

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51827/
          Subject: LU-16972 tests: createmany sets xattr
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 0c1a65c0422f8322f52a3ada6c1296653c20cbcc

          adilger Andreas Dilger added a comment - Fix included in e2fsprogs 1.47.0-wc4

          "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51885/
          Subject: LU-16972 build: update version to 1.47.0-wc4
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set:
          Commit: cc6a3d0d898cc06dc87780305f6427d6dec5e120

          "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51885
          Subject: LU-16972 build: update version to 1.47.0-wc4
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set: 1
          Commit: ac0905b216ba46ffbce4dffa94c1576d06d00096

          "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51729/
          Subject: LU-16972 e2fsck: use rb-tree to track EA reference counts
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set:
          Commit: 165abd0095613b60fbf64d139b899ba4896a2d92
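
The patch itself is not reproduced in this ticket. As a rough illustration of why a balanced tree avoids the memmove()/realloc() cost of the sorted array, the sketch below counts references with POSIX `tsearch()`/`tfind()` from `<search.h>`. This is a stand-in, not the rb-tree code used by e2fsprogs; `struct ref_node`, `ref_increment()` and `ref_fetch()` are hypothetical names. glibc implements `tsearch()` as a red-black tree, but POSIX does not require the tree to be balanced.

```c
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* One tree node per key; each node is allocated on its own. */
struct ref_node {
        uint64_t key;    /* block or inode number */
        uint64_t count;  /* reference count */
};

static int ref_cmp(const void *a, const void *b)
{
        const struct ref_node *x = a, *y = b;

        return (x->key > y->key) - (x->key < y->key);
}

/* Add one reference to key, creating its node on first use.
 * Returns the new count, or 0 if allocation fails. */
static uint64_t ref_increment(void **root, uint64_t key)
{
        struct ref_node probe = { key, 0 };
        struct ref_node **slot;
        struct ref_node *node;

        assert(root != NULL);
        slot = tfind(&probe, root, ref_cmp);
        if (slot)
                return ++(*slot)->count;

        node = malloc(sizeof(*node));
        if (!node)
                return 0;
        node->key = key;
        node->count = 1;
        if (!tsearch(node, root, ref_cmp)) {
                free(node);
                return 0;
        }
        return 1;
}

/* Return the current count for key, or 0 if it was never added. */
static uint64_t ref_fetch(void *const *root, uint64_t key)
{
        struct ref_node probe = { key, 0 };
        struct ref_node **slot = tfind(&probe, root, ref_cmp);

        return slot ? (*slot)->count : 0;
}
```

With a balanced tree, inserting a new key touches only O(log n) nodes and never moves existing entries, at the price of per-node pointer overhead. Freeing the tree is omitted here; glibc offers `tdestroy()` as a GNU extension.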

          "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51763/
          Subject: LU-16972 e2fsck: fix merging ea_inode_refs
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set:
          Commit: 35ecc72a445b04f2a257bc31cbeb310d76420b89

          "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51827
          Subject: LU-16972 tests: createmany sets xattr
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: b2157c6920b54e63c3aff6eae85873118482f8eb

          "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51763
          Subject: LU-16972 e2fsck: fix merging ea_inode_refs
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set: 1
          Commit: fa29ce6fce78f90c691f5413393175b47ca5f566

          adilger Andreas Dilger added a comment -

          I was wondering whether this code could use the "icount" structure used elsewhere in e2fsck (e.g. duplicate block handling). It is optimized for a count of 0/1 in a bitmap, and puts entries into an array only for count > 1. The drawback is that it has to allocate a full bitmap covering all blocks counted, even though the usage would be very sparse.

          For a 10TiB MDT filesystem, the icount bitmap would be 2.5B bits (one bit per 4KiB block), or 320MiB of RAM per icount. The current ea_refcount element is 2 x __u64 = 16 bytes, which puts the break-even point at 20M xattr blocks, which is quite high for most filesystems.

          I think the better alternative is to expand on what is in the current patch: use the block bitmap to track xattr blocks with count = 1, and only add an ea_refcount entry for xattr blocks with refcount > 1. The one complication is whether this will catch xattr blocks that are duplicated with regular file blocks, but I think it would be handled correctly anyway.
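
The break-even estimate in the comment above can be checked with a few lines of arithmetic. The sketch assumes a 4KiB block size, which is what 2.5B bits for 10TiB implies; `bitmap_bytes()` and `break_even_entries()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a bitmap holding one bit per filesystem block. */
static uint64_t bitmap_bytes(uint64_t fs_bytes, uint64_t block_size)
{
        uint64_t blocks;

        assert(block_size != 0);
        blocks = fs_bytes / block_size;
        return (blocks + 7) / 8;
}

/* Number of fixed-size array entries that use as much RAM as the bitmap. */
static uint64_t break_even_entries(uint64_t bitmap_size, uint64_t entry_size)
{
        assert(entry_size != 0);
        return bitmap_size / entry_size;
}
```

For a 10TiB filesystem with 4KiB blocks this gives 10 * 2^28 blocks, a bitmap of 10 * 2^25 bytes (320MiB), and a break-even of 10 * 2^21 (about 20M) 16-byte entries, matching the figures in the comment.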

          "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51729
          Subject: LU-16972 e2fsck: only add xattr block if refcount > 1
          Project: tools/e2fsprogs
          Branch: master-lustre
          Current Patch Set: 1
          Commit: ca038661f5b332f4a0475eaae8bcec039b88b195
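
One possible reading of "only add xattr block if refcount > 1", combined with the bitmap idea from the comment above, is sketched below. It is an assumption-laden illustration, not the patch: it counts references as they are observed, keeps a private bitmap rather than e2fsck's block bitmap, and uses a small unsorted side table where the real code would use ea_refcount. All names (`struct xattr_tracker`, `tracker_init()`, `tracker_add_ref()`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Side-table entry, kept only for blocks referenced more than once. */
struct multi_ref {
        uint64_t blk;
        uint64_t count;
};

struct xattr_tracker {
        unsigned char *seen;        /* one bit per block: referenced once */
        uint64_t nblocks;
        struct multi_ref *multi;    /* blocks with count > 1 */
        size_t nmulti, cap;
};

static int tracker_init(struct xattr_tracker *t, uint64_t nblocks)
{
        t->seen = calloc((nblocks + 7) / 8, 1);
        t->nblocks = nblocks;
        t->multi = NULL;
        t->nmulti = t->cap = 0;
        return t->seen ? 0 : -1;
}

/* Record one reference to blk. Returns the count so far, 0 on error. */
static uint64_t tracker_add_ref(struct xattr_tracker *t, uint64_t blk)
{
        unsigned char mask = (unsigned char)(1u << (blk % 8));

        assert(blk < t->nblocks);
        if (!(t->seen[blk / 8] & mask)) {
                /* First reference: the bitmap alone records count = 1. */
                t->seen[blk / 8] |= mask;
                return 1;
        }

        /* Second or later reference: the block needs a real counter. */
        for (size_t i = 0; i < t->nmulti; i++)
                if (t->multi[i].blk == blk)
                        return ++t->multi[i].count;

        if (t->nmulti == t->cap) {
                size_t cap = t->cap ? t->cap * 2 : 16;
                struct multi_ref *tmp = realloc(t->multi, cap * sizeof(*tmp));

                if (!tmp)
                        return 0;
                t->multi = tmp;
                t->cap = cap;
        }
        t->multi[t->nmulti].blk = blk;
        t->multi[t->nmulti].count = 2;
        t->nmulti++;
        return 2;
}
```

The point of the split is that unshared xattr blocks, which are the common case, cost one bit each and never enter the counter table, so the table only grows with the number of blocks that are actually shared.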

          People

            dongyang Dongyang Li
            adilger Andreas Dilger