Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8653

broken inheritance of default striping

Details

    • Bug
    • Resolution: Not a Bug
    • Trivial
    • Lustre 2.11.0, Lustre 2.10.4
    • Lustre 2.10.0
    • None
    • 3
    • 9223372036854775807

    Description

      I think we have a problem with inheritance in the following patch
      http://review.whamcloud.com/#/c/19041/23/lustre/lod/lod_object.c,unified
      from LU-7660

      The problem code is below:

      3117	+»       /* if parent doesn't provide all defaults, striping from fs default */
      3118	+»       if (d->lod_md_root != NULL &&
      3119	+»           (lc->ldo_stripenr == 0 ||
      3120	+»            lc->ldo_stripe_size == 0 ||
      3121	+»            lc->ldo_stripe_offset == LOV_OFFSET_DEFAULT ||
      3122	+»            lc->ldo_pool == NULL)) {
      3123	+»       »       memset(lds, 0, sizeof(*lds));
      3124	+»       »       lod_get_default_lov_striping(env, d->lod_md_root, lds);
      3125	+»       »       lod_striping_from_default(lc, lds, child_mode);
      3126	+»       }
      

      while it is correct about stripe number, count and offset checks, it is not correct about pool check. Since pool is optional value, it is not right to ignore parent layout just because it has no pool.

      For example, I set some striping on the directory and expect all files/subdirs inside to have the same striping, but in fact that will not work, striping will be taken from fs root. Moreover, default fs striping may have no pool as well but still be used prior the parent striping.

      I've encountered such problem while porting DoM patches, because it is quite visible when DOM layout is used or not as expected

      Attachments

        Issue Links

          Activity

            [LU-8653] broken inheritance of default striping

            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29142
            Subject: LU-8653 lod: use stripe_count instead of stripe_nr
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 8fcde8f199bd380d14608189bfe199bef61db461

            gerrit Gerrit Updater added a comment - Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29142 Subject: LU-8653 lod: use stripe_count instead of stripe_nr Project: fs/lustre-release Branch: b2_10 Current Patch Set: 1 Commit: 8fcde8f199bd380d14608189bfe199bef61db461

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26681/
            Subject: LU-8653 lod: use stripe_count instead of stripe_nr
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 53d3890f16e4c6e5717c4d020ef213897a36c2b7

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26681/ Subject: LU-8653 lod: use stripe_count instead of stripe_nr Project: fs/lustre-release Branch: master Current Patch Set: Commit: 53d3890f16e4c6e5717c4d020ef213897a36c2b7

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/26681
            Subject: LU-8653 lod: use stripe_count instead of stripe_nr
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fbcda1afa6bd2faedbfedcc844a6358d8a7ece01

            gerrit Gerrit Updater added a comment - Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/26681 Subject: LU-8653 lod: use stripe_count instead of stripe_nr Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: fbcda1afa6bd2faedbfedcc844a6358d8a7ece01

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23353
            Subject: LU-8653 lod: use stripe_count instead of stripe_nr
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d578a7916c10cb5353280a77bb2b9e10eb5638e9

            gerrit Gerrit Updater added a comment - Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23353 Subject: LU-8653 lod: use stripe_count instead of stripe_nr Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: d578a7916c10cb5353280a77bb2b9e10eb5638e9
            tappro Mikhail Pershin added a comment - - edited

            default values have stripe count > 0 if not specified otherwise. I got the idea, thanks., and I am going to close this bug.

            Just noticed your last comment, yes, that is what I saw too, and I am doing the same now - if striping is set with LFS it always get layout RAID0 if other is not specified with -P ..., pattern is inherited only if unset.
            DOM pattern is always used with stripenr 0, so stripenr is number of stripes over OSTs only, pattern itself indicates that there is stripe on MDT and it can be only one, so this is OK. Alos this make things easier in many cases in LOV/LOD code, where we have cycles over stripenr with calls to osc/osp, they are just skipped naturally with DOM.

            tappro Mikhail Pershin added a comment - - edited default values have stripe count > 0 if not specified otherwise. I got the idea, thanks., and I am going to close this bug. Just noticed your last comment, yes, that is what I saw too, and I am doing the same now - if striping is set with LFS it always get layout RAID0 if other is not specified with -P ... , pattern is inherited only if unset. DOM pattern is always used with stripenr 0 , so stripenr is number of stripes over OSTs only, pattern itself indicates that there is stripe on MDT and it can be only one, so this is OK. Alos this make things easier in many cases in LOV/LOD code, where we have cycles over stripenr with calls to osc/osp, they are just skipped naturally with DOM.

            Hmm, looking at mkfs_lustre.c I see that set_defaults() is setting mo_stripe_count=1 explicitly, rather than setting it to 0 and letting the MDS decide what the default stripe count should be, which means that the stripe count can't be inherited (or should be ignored?) if the layout pattern LOV_PATTERN_MDT is specified, and conversely LOV_PATTERN_MDT shouldn't be inherited if the stripe count is explicitly specified...

            adilger Andreas Dilger added a comment - Hmm, looking at mkfs_lustre.c I see that set_defaults() is setting mo_stripe_count=1 explicitly, rather than setting it to 0 and letting the MDS decide what the default stripe count should be, which means that the stripe count can't be inherited (or should be ignored?) if the layout pattern LOV_PATTERN_MDT is specified, and conversely LOV_PATTERN_MDT shouldn't be inherited if the stripe count is explicitly specified...

            At some point I'd also like to have MDT pools, for example to restrict DOM files to a specific group of MDTs.

            I think that inheriting the OST pool would be useful if the file grows beyond the DOM limit.

            As for pattern inheritance, that is tricky, since having a special case to treat DOM inheritance differently may also be confusing to users. I guess if the stripe count is > 1 it makes sense to not use DOM since it can't be striped across multiple MDTs. Having stripe > 0 should be the same, but that depends on how filesystems store the default file layout - do they use stripe_count = 0 by default, and that is replaced with stripe_count = 1 when the layout is being instantiated, or is the default file layout stored with stripe_count = 1?

            adilger Andreas Dilger added a comment - At some point I'd also like to have MDT pools, for example to restrict DOM files to a specific group of MDTs. I think that inheriting the OST pool would be useful if the file grows beyond the DOM limit. As for pattern inheritance, that is tricky, since having a special case to treat DOM inheritance differently may also be confusing to users. I guess if the stripe count is > 1 it makes sense to not use DOM since it can't be striped across multiple MDTs. Having stripe > 0 should be the same, but that depends on how filesystems store the default file layout - do they use stripe_count = 0 by default, and that is replaced with stripe_count = 1 when the layout is being instantiated, or is the default file layout stored with stripe_count = 1?

            Also it is not clear about how to inherit pool for DOM files since they are not placed on OSTs, so it can be skipped and kept as NULL or maybe it is OK to inherit the default one from FS? just in case this file will grow and got stripes on OSTs.

            tappro Mikhail Pershin added a comment - Also it is not clear about how to inherit pool for DOM files since they are not placed on OSTs, so it can be skipped and kept as NULL or maybe it is OK to inherit the default one from FS? just in case this file will grow and got stripes on OSTs.

            I see, lod_striping_from_default() sets only missed values, so it will never overwrite already set values. With pool I also agree. So this ticket is not valid I think, sorry for this noise.

            Meanwhile, my problem appeared because of pattern inheritance. For example, DOM is set on parent, but in it file is created with setstripe -c 1.
            In that case stripe count is set and remains at 1, but pattern is inherited as DOM and I got assertion that DOM layout has 1 stripe count. I think that pattern shouldn't be inherited but set to RAID0 explicitly if created with lfs setstripe without '-P mdt'.

            tappro Mikhail Pershin added a comment - I see, lod_striping_from_default() sets only missed values, so it will never overwrite already set values. With pool I also agree. So this ticket is not valid I think, sorry for this noise. Meanwhile, my problem appeared because of pattern inheritance. For example, DOM is set on parent, but in it file is created with setstripe -c 1. In that case stripe count is set and remains at 1, but pattern is inherited as DOM and I got assertion that DOM layout has 1 stripe count. I think that pattern shouldn't be inherited but set to RAID0 explicitly if created with lfs setstripe without '-P mdt'.

            The layout should only be inherited from the root if there is no layout explicitly supplied for the file AND no layout on the parent. If the parent layout has, for example, stripe_count = 4 and stripe_size=0, then the file should use stripe_count = 4, and inherit the stripe_size from the filesystem default. Similarly, if the parent has no pool, but the root specifies a pool, then it should inherit the pool from the root layout.

            What it shouldn't do is inherit the whole layout from the root if some part of the parent is specified.

            Looking at lod_ah_init() it appears that it will inherit all of the layout from the parent layout if possible (i.e. if (likely(parent)) block), and if there are still any unset values it will try to inherit them from the root layout, which seems correct to me. Since the default filesystem layout does not specify a pool, if a filesystem default pool is specified by the admin then I think it makes sense to use it if no pool is otherwise specified. For example, if a filesystem has a mix of HDD and SSD OSTs, and the admin specifies the filesystem default pool to be the HDD OSTs to avoid the SSD OSTs filling up with files that shouldn't be there, I don't think it makes sense to allow a user to specify a directory-level layout without a pool and suddenly all files in that directory are striped across all OSTs in the filesystem.

            Mike, could you please provide some specific examples of what problem you are having, so I can better understand it.

            adilger Andreas Dilger added a comment - The layout should only be inherited from the root if there is no layout explicitly supplied for the file AND no layout on the parent. If the parent layout has, for example, stripe_count = 4 and stripe_size=0, then the file should use stripe_count = 4, and inherit the stripe_size from the filesystem default. Similarly, if the parent has no pool, but the root specifies a pool, then it should inherit the pool from the root layout. What it shouldn't do is inherit the whole layout from the root if some part of the parent is specified. Looking at lod_ah_init() it appears that it will inherit all of the layout from the parent layout if possible (i.e. if (likely(parent)) block), and if there are still any unset values it will try to inherit them from the root layout, which seems correct to me. Since the default filesystem layout does not specify a pool, if a filesystem default pool is specified by the admin then I think it makes sense to use it if no pool is otherwise specified. For example, if a filesystem has a mix of HDD and SSD OSTs, and the admin specifies the filesystem default pool to be the HDD OSTs to avoid the SSD OSTs filling up with files that shouldn't be there, I don't think it makes sense to allow a user to specify a directory-level layout without a pool and suddenly all files in that directory are striped across all OSTs in the filesystem. Mike, could you please provide some specific examples of what problem you are having, so I can better understand it.

            People

              wc-triage WC Triage
              tappro Mikhail Pershin
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: