
DNE3: limit directory default layout inheritance

Details

    • Improvement
    • Resolution: Fixed
    • Major
    • Lustre 2.15.0

    Description

      One problem that exists today is that default directory layouts are inherited by all new subdirectories created in the filesystem. That makes it difficult to set e.g. "lfs setdirstripe -D -c 1 -i -1" on the root directory and maybe a second level of directories without having it inherited by all of the subdirectories for the whole filesystem.

      It would be useful to add an option like "lfs setdirstripe --max-inherit" that stores "lmv_max_inherit" on the default directory layout, so that the layout is only copied down that many levels of subdirectories and no further. The lmv_max_inherit value would be decremented each time it is copied down to a new subdirectory, so there is no need to track the parent layout.

      For compatibility, "lmv_max_inherit=0" would mean "copy forever", so "lmv_max_inherit=1" would mean "do not copy default layout". We don't need huge values here (e.g. "lmv_max_inherit=255" would be totally fine).
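The decrement rule proposed above can be sketched as a small model. This is only an illustration of the encoding described in this ticket's description (0 = copy forever, 1 = do not copy); the constant and function names are hypothetical, and the merged patch may use a different encoding.

```python
# Hypothetical names; values follow the proposal in the description only.
INHERIT_FOREVER = 0   # "lmv_max_inherit=0" means "copy forever"
INHERIT_END = 1       # "lmv_max_inherit=1" means "do not copy default layout"
INHERIT_MAX = 255     # largest value that fits in a __u8


def child_max_inherit(parent_value):
    """Return the value stored on a new subdirectory, or None if the
    default layout is not copied to it at all."""
    if not 0 <= parent_value <= INHERIT_MAX:
        raise ValueError("lmv_max_inherit must fit in a __u8")
    if parent_value == INHERIT_FOREVER:
        return INHERIT_FOREVER      # unlimited: copied unchanged
    if parent_value == INHERIT_END:
        return None                 # inheritance stops here
    return parent_value - 1         # decrement on each copy


def levels_with_default(root_value, depth_limit=10):
    """Count how many nested levels below the root receive a copy of the
    default layout, walking at most depth_limit levels."""
    levels = 0
    value = root_value
    for _ in range(depth_limit):
        value = child_max_inherit(value)
        if value is None:
            break
        levels += 1
    return levels
```

Under this encoding, a root value of 3 leaves a copy on two levels of subdirectories, and only the value stored on the parent is needed to decide what the child gets.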

      I don't think we need to do anything incompatible for older MDS nodes (e.g. we don't need to use a different LMV magic), since at worst the old MDS will copy this layout forever (ignoring lmv_max_inherit) and have the same behaviour as before this feature existed. Probably the easiest would be to split a __u8 field out of lum_padding1 and leave an unused __u8 and __u16 for future use.
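The field split suggested above can be checked with a small ctypes model. This is a sketch under assumptions: it models only the padding word rather than the full LMV user structure, it assumes lum_padding1 is currently 32 bits wide (implied by the __u8 + __u8 + __u16 split), it assumes the new field is placed first, and the names of the two leftover padding fields are hypothetical.

```python
import ctypes


class PaddingBefore(ctypes.Structure):
    # Assumed current layout: a single 32-bit padding word.
    _fields_ = [("lum_padding1", ctypes.c_uint32)]


class PaddingAfter(ctypes.Structure):
    # Proposed layout: one __u8 carved out, the rest left unused.
    _fields_ = [
        ("lum_max_inherit", ctypes.c_uint8),
        ("lum_padding_u8", ctypes.c_uint8),    # hypothetical name, unused
        ("lum_padding_u16", ctypes.c_uint16),  # hypothetical name, unused
    ]


def read_max_inherit(raw):
    """Interpret 4 raw bytes of the padding area using the new layout."""
    return PaddingAfter.from_buffer_copy(raw).lum_max_inherit


# The split does not change the size, so the on-disk format is unchanged.
assert ctypes.sizeof(PaddingBefore) == ctypes.sizeof(PaddingAfter) == 4
```

A layout written with zeroed padding reads back as lum_max_inherit = 0, which under the proposal means "copy forever", matching the pre-feature behaviour.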

          Activity

            [LU-13440] DNE3: limit directory default layout inheritance

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43530/
            Subject: LU-13440 utils: fix handling of lsa_stripe_off = -1
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 1dbe63301b8c5cb7f7d0fe9960cafd3cd0e45534

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43131/
            Subject: LU-13440 lmv: add default LMV inherit depth
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 01d34a6b3b2e34f7414f627e4f87993322dafa78

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43385/
            Subject: LU-13440 obdclass: server qos penalty miscaculated
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0ccce7ecb72f847f4235a513424d90119edad7ca

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43530
            Subject: LU-13440 utils: fix handling of lsa_stripe_off = -1
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 792fa045a1975a1a18af0d72470134e5bf997d6a

            I think the main goal here is to allow users to get reasonable MDT balancing without significant effort. For new filesystems, I think the current patch is relatively good, but we also need a way to handle this for existing filesystems without the need to explicitly set a layout on every subdirectory (which would also be complex because the "inherit depth" would need to be changed each time, if not unlimited).

            For default layout inheritance from the root directory, one problem that we've seen with file layout inheritance is that if the default layout xattr is copied to each subdirectory, it is difficult to change the default afterward without changing it in every directory in the filesystem (except directories that had a different layout explicitly set on them). If the root default layout has (lum_max_inherit = LMV_INHERIT_UNLIMITED) then there is no need to copy the layout to the subdirectories at all, since it could just be cached on the root directory. Also, for a root default layout even with (lum_max_inherit_rr != LMV_INHERIT_UNLIMITED), we could assume that if the parent directory does not have a layout, then we have exceeded lum_max_inherit_rr and no copy of the default layout is needed. Only the top lum_max_inherit_rr levels of directories would get an explicit xattr copy.

            For existing filesystems (which will almost certainly already have MDT imbalance), it probably makes sense to skip the RR phase entirely, set a "-c 1 -i -1 --max-inherit=-1" default layout on the root directory, and make the automatic balancing of new directories in the whole filesystem "smart enough" (i.e. stick with the parent MDT unless MDTs are imbalanced, with the probability of a remote directory depending on the imbalance between MDTs).

            One option (in a follow-on patch) would be to track the "depth" of every directory in memory and then use this to determine whether RR applies or not. That avoids the need to copy the layout explicitly for subdirectories, since it can ignore RR mode if depth > max_inherit_rr. The probability of creating a subdirectory on a remote MDT would depend on the imbalance between MDTs and also the depth (higher-level directories are more likely to be remote).
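To make the depth-based idea concrete, here is a toy heuristic. It is purely illustrative and not taken from any patch: the function name, the use of free-space fractions as the imbalance measure, and the specific formula are all assumptions. It only captures the three properties stated above: RR always applies within max_inherit_rr levels, balanced MDTs keep directories local, and deeper directories are less likely to be remote.

```python
def remote_mkdir_probability(free_fractions, depth, max_inherit_rr):
    """Toy estimate of the chance that a new directory is created on a
    remote MDT (hypothetical formula, for illustration only).

    free_fractions: free-space fraction (0.0 to 1.0) of each MDT.
    depth: depth of the new directory below the root (root children = 1).
    max_inherit_rr: number of top levels that always use round-robin.
    """
    if depth <= max_inherit_rr:
        return 1.0  # still inside the round-robin levels
    # Imbalance is 0.0 when all MDTs are equally full, up to 1.0.
    imbalance = max(free_fractions) - min(free_fractions)
    # Deeper directories are progressively less likely to go remote.
    return imbalance / (depth - max_inherit_rr)
```

With this formula, balanced MDTs give a probability of zero below the RR levels, so new directories stay with their parent MDT.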

            adilger Andreas Dilger added a comment

            Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43385
            Subject: LU-13440 obdclass: server qos penalty miscaculated
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 809bd318183f9b14cccf04f10e34b7b367f19e53

            Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43131
            Subject: LU-13440 lmv: add default LMV inherit depth
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 4578bdf0091c7061328264b66f05f54b048da94d

            adilger Andreas Dilger added a comment (edited)

            Lai, could you please look into this next, and whether it is possible to implement it in a relatively simple manner? We still need something that will "more automatically" distribute the load across MDTs, even if directory split is not active. It doesn't have to be perfect, but it should at least work with relatively little input from the admins if the MDTs become really imbalanced. I can think of two relatively straightforward options, and we might consider implementing both if they are not too complex:

            • the "limited inheritance" change described in this ticket would allow setting e.g. "default remote directory" ("-D -c 1 -i -1") on the root (or any) directory and then having it inherited for 2-3 directory levels before it reverts to "local" directories again. This would allow "-D -c1 -i -1 -X 3" or even "-D -c4 -i -1 -X 3" to be set on the root and spread the top of the tree widely, so that all MDTs are used, and then the lower levels stay local to their MDTs.
            • we could allow setting "-D -c 1 -i -1" on the root directory and have an MDS tunable parameter to inherit the root directory layout for the whole filesystem. That would need a bit of a change to the "always round-robin remote directories" behaviour so that it would only create remote directories if the MDTs are imbalanced, and prefer to create local directories if the MDT balance is good. Maybe limit the "round-robin" to the root directory or the top-level directory?
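The effect of the first option on placement can be illustrated with a small simulation. This is a simplified model, not Lustre code: the function name and parameters are hypothetical, "-i -1" is modelled as a plain shared round-robin counter, and spread_levels stands for however many directory levels end up inheriting the default layout.

```python
from itertools import count


def place_tree(num_mdts, spread_levels, fanout, depth):
    """Return {path: mdt_index} for a synthetic directory tree.

    Directories in the top spread_levels levels are spread round-robin
    across MDTs; deeper directories stay on their parent's MDT.
    """
    next_mdt = count(1)              # the root itself is on MDT 0
    placement = {"/": 0}
    frontier = [("/", 0)]
    for level in range(1, depth + 1):
        new_frontier = []
        for parent, parent_mdt in frontier:
            for i in range(fanout):
                path = f"{parent.rstrip('/')}/d{i}"
                if level <= spread_levels:
                    mdt = next(next_mdt) % num_mdts   # remote, round-robin
                else:
                    mdt = parent_mdt                  # local to the parent
                placement[path] = mdt
                new_frontier.append((path, mdt))
        frontier = new_frontier
    return placement
```

For example, with four MDTs and one spread level, the four top-level directories land on four different MDTs and everything beneath each of them stays on that MDT.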

            People

              laisiyao Lai Siyao
              adilger Andreas Dilger