"lfs setstripe -C -1" stripes too widely, should be limited to OST_COUNT

Details

    • Improvement
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • Lustre 2.15.0

    Description

      I am reaching out to seek clarification regarding the expected behavior of the "lfs setstripe" command when using the -C -1 option.

      Currently, it appears that this command is creating a higher stripe count than anticipated. For instance, on my test system, it generated a stripe count of 2727 for a single file. This count exceeds the allowed limit of LOV_MAX_STRIPE_COUNT. 

      I am uncertain about the appropriate solution to address this issue related to the "-1" argument. I have contemplated the following options:

      1.    Consider making the option -1 illegal, preventing its usage altogether.

      2.    Implement a mechanism to automatically set the stripe count to the maximum allowed value (LOV_MAX_STRIPE_COUNT) if the count exceeds this limit.

      I would greatly appreciate your input and guidance in this matter. It is worth noting that setting the stripe count higher than LOV_MAX_STRIPE_COUNT leads to other problems, such as the failure of the "llapi_layout_get_by_fd" API to open the file.

      Please let me know your input.
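
A minimal sketch of the two options listed above, not the actual lfs code. The helper names are hypothetical, and the cap of 2000 is an assumption taken from a later comment in this ticket rather than from the Lustre headers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed cap: a later comment in this ticket quotes 2000 for current code. */
#define SKETCH_MAX_STRIPE_COUNT 2000u

/* Option 1 (hypothetical helper): refuse the "-1" argument outright. */
static int sketch_reject_all_osts(int64_t requested)
{
	return requested == -1 ? -EINVAL : 0;
}

/* Option 2 (hypothetical helper): accept the count but cap it at the maximum. */
static uint32_t sketch_clamp_stripe_count(uint32_t resolved)
{
	return resolved > SKETCH_MAX_STRIPE_COUNT ?
	       SKETCH_MAX_STRIPE_COUNT : resolved;
}
```

Under option 2, the 2727 stripes reported above would be reduced to the cap, while any count already below the cap passes through unchanged.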

          Activity

            [LU-16938] "lfs setstripe -C -1" stripes too widely, should be limited to OST_COUNT

            adilger Andreas Dilger added a comment -

            This is broken due to storing the negative stripe count as "stripe_count - 32" instead of just "-stripe_count" and needs to be fixed before 2.16.0 is released.

            pjones Peter Jones added a comment -

            Merged for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54192/
            Subject: LU-16938 utils: setstripe overstripe multiple OST count
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 1a6ef725c285dd5c25c976956ba754dc470f6c1c

            gerrit Gerrit Updater added a comment -

            "Rajeev Mishra <rajeevm@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54192
            Subject: LU-16938 utils: enabling setstripe n multiple of ost count
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 15396dfe08e29fa02233264f44fa6861a171510e

            jschwartz Josh Schwartz added a comment -

            But just to be clear, the inconsistency I'm concerned about can ultimately affect files, e.g. by ending up with a MUCH larger overstripe count than perhaps was intended if one accidentally does something like this:

            jupiter-p2:/lus/kjcf08 # mkdir test
            jupiter-p2:/lus/kjcf08 # lfs setstripe --overstripe-count 10 --stripe-count -1 test
            jupiter-p2:/lus/kjcf08 # touch test/foo
            jupiter-p2:/lus/kjcf08 # lfs getstripe test | head
            test
            stripe_count:  -1 stripe_size:   1048576 pattern:       raid0,overstriped stripe_offset: -1

            test/foo
            lmm_stripe_count:  2727
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0,overstriped
            lmm_layout_gen:    0
            lmm_stripe_offset: 0
            	obdidx		 objid		 objid		 group


            (note the 2727 value is because I don't have Rajeev's other fix on this system, but on the latest code this would be 2000... still probably not what was expected on a system with 2 OSTs).
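
One way to read the intended behavior, sketched under assumptions: a negative count -N is taken to mean "N stripes per OST" (an interpretation suggested by the patch subject, not confirmed by the patch itself), and the result is capped at 2000 as quoted in the comment above. The function name and the default-count parameter are hypothetical.

```c
#include <stdint.h>

/* Assumed cap, as quoted in this ticket for current code. */
#define SKETCH_MAX_STRIPE_COUNT 2000u

/*
 * Hypothetical resolver:
 *   requested > 0  -> absolute stripe count
 *   requested < 0  -> -requested stripes per OST (assumption)
 *   requested == 0 -> caller-supplied default
 * The result is always capped at SKETCH_MAX_STRIPE_COUNT.
 */
static uint32_t sketch_resolve_overstripe(int32_t requested,
					  uint32_t ost_count,
					  uint32_t default_count)
{
	uint64_t count;

	if (requested > 0)
		count = (uint64_t)requested;
	else if (requested < 0)
		count = (uint64_t)(-(int64_t)requested) * ost_count;
	else
		count = default_count;

	if (count > SKETCH_MAX_STRIPE_COUNT)
		count = SKETCH_MAX_STRIPE_COUNT;
	return (uint32_t)count;
}
```

With this reading, a request of -1 on the two-OST system shown above resolves to 2 stripes, which matches the "limited to OST_COUNT" expectation in the ticket title, rather than 2727 or 2000.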

            paf0186 Patrick Farrell added a comment -

            OK, that's good, then.  The user interface is important but I was more concerned that the server might be marking the layout incorrectly.  Obviously default layouts are a different case.

            jschwartz Josh Schwartz added a comment - - edited

            I don't think that is coming into play here because I'm just showing the default striping on a directory. If I actually create a file within the directory I believe it is behaving as you suggest:

            jupiter-p2:/lus/kjcf08 # mkdir test
            jupiter-p2:/lus/kjcf08 # lfs setstripe --overstripe-count 1024 --stripe-count 10 test
            jupiter-p2:/lus/kjcf08 # touch test/foo
            jupiter-p2:/lus/kjcf08 # lfs getstripe test | head
            test
            stripe_count:  10 stripe_size:   1048576 pattern:       raid0,overstriped stripe_offset: -1
            
            test/foo
            lmm_stripe_count:  10
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0,overstriped
            lmm_layout_gen:    0
            lmm_stripe_offset: 1
            	obdidx		 objid		 objid		 group
            

            here the file is overstriped because I only have 2 OSTs.

            This is a bit of a degenerate example, but if I just set --overstripe-count 2 the directory will have a default of overstriped with a stripe count of 2, but files that are created are not overstriped (and have a stripe count of 2):

            jupiter-p2:/lus/kjcf08 # lfs setstripe --overstripe-count 2 test
            jupiter-p2:/lus/kjcf08 # lfs getstripe -d test
            stripe_count:  2 stripe_size:   1048576 pattern:       raid0,overstriped stripe_offset: -1
            jupiter-p2:/lus/kjcf08 # touch test/foo
            jupiter-p2:/lus/kjcf08 # lfs getstripe test/foo
            test/foo
            lmm_stripe_count:  2
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 1
            	obdidx		 objid		 objid		 group
            	     1	     116959791	    0x6f8aa2f	             0
            	     0	     117253333	    0x6fd24d5	             0
            

            so I think that part of it is working OK.


            adilger Andreas Dilger added a comment -

            There is code in lod_ost_alloc_rr() in the MDS object allocation that should be removing the LOV_PATTERN_OVERSTRIPING flag if it is set unnecessarily:

                    /* If there are enough OSTs, a component with overstriping requested
                     * will not actually end up overstriped.  The comp should reflect this.
                     */
                    if (!overstriped)
                            lod_comp->llc_pattern &= ~LOV_PATTERN_OVERSTRIPING;


            If this isn't being applied consistently, then that would be a bug.
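
The flag fixup quoted above can be illustrated in isolation. This is a simplified sketch, not the lod code: the struct and function are hypothetical stand-ins, the bit values are placeholders for illustration (the real ones live in the Lustre headers), and "overstriped" is approximated here as "stripe count exceeds the number of available OSTs".

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define SKETCH_PATTERN_RAID0        0x001u
#define SKETCH_PATTERN_OVERSTRIPING 0x200u

/* Hypothetical stand-in for the layout component the allocator updates. */
struct sketch_comp {
	uint32_t pattern;
	uint32_t stripe_count;
};

/*
 * Keep the overstriping bit only when the allocated stripe count is
 * larger than the number of OSTs, i.e. some OST must hold more than
 * one stripe. Otherwise clear it so the layout reflects reality.
 */
static void sketch_fixup_pattern(struct sketch_comp *comp, uint32_t ost_count)
{
	bool overstriped = comp->stripe_count > ost_count;

	if (!overstriped)
		comp->pattern &= ~SKETCH_PATTERN_OVERSTRIPING;
}
```

Applied to the transcripts in this ticket, a stripe count of 2 on two OSTs ends up as plain raid0, while a stripe count of 10 on two OSTs keeps the overstriped pattern.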

            paf0186 Patrick Farrell added a comment -

            Josh,

            Makes sense to me.  There's also another possible bug here - how many OSTs do you have on that system?  If it's >= 10, then overstriped shouldn't be set by the server code either, which is also a concern.  Overstriping should only be set on the file when the actual file striping exceeds the number of available OSTs.  (Or at least that was the intent...)

            So there may be two things to fix there - proper overriding by later parameters in userspace, so the overstriping flag isn't passed along, and then - if you have >= 10 OSTs, then the server shouldn't set the overstriping pattern regardless of what userspace asked for.  If you have 20 OSTs and give -C 10, overstriping shouldn't be set, because the file is not actually overstriped.  Overstriping set on a not-overstriped file isn't fatal, but it's definitely wrong.

            jschwartz Josh Schwartz added a comment - - edited

            > Like many utilities, the last option specified will take precedence.

            I would be fine with either (mutually exclusive or last takes precedence in its entirety) but this bothers me:

            jupiter-p2:/lus/kjcf08 # mkdir test
            jupiter-p2:/lus/kjcf08 # lfs setstripe --overstripe-count 1024 --stripe-count 10 test
            jupiter-p2:/lus/kjcf08 # lfs getstripe -d test
            test
            stripe_count:  10 stripe_size:   1048576 pattern:       raid0,overstriped stripe_offset: -1
            

            Note that we got (and kept) overstriped from the first param, but then picked up the count from the second. If the last option truly took precedence I would expect a stripe count of 10 without overstriped (just like if the first one took precedence I would expect a stripe count of 1024 with overstriped).

            It is inconsistent that the behavior is different if you issue them individually, but in the same order:

            jupiter-p2:/lus/kjcf08 # lfs setstripe --overstripe-count 1024 test
            jupiter-p2:/lus/kjcf08 # lfs getstripe -d test
            stripe_count:  1024 stripe_size:   1048576 pattern:       raid0,overstriped stripe_offset: -1
            jupiter-p2:/lus/kjcf08 # lfs setstripe --stripe-count 10 test
            jupiter-p2:/lus/kjcf08 # lfs getstripe -d test
            stripe_count:  10 stripe_size:   1048576 pattern:       raid0 stripe_offset: -1
            

            Here each command does as I would expect; --overstripe-count 1024 by itself yields overstriped and stripe count 1024, and --stripe-count 10 by itself on the same directory removes overstriped (which is what I would expect) yielding stripe count 10 without overstriped.

            The fact that combining them causes it to take the overstriped from the first param and the stripe count from the second is surprising. --stripe-count explicitly means not-overstriped and if the rule is that the last one takes precedence, then it should be like the --overstripe-count wasn't there at all instead of the --stripe-count acting as a modifier.
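
A sketch of the "last option takes precedence in its entirety" behavior argued for above. The struct and setters are hypothetical and are not how lfs parses its arguments; the point is only that each count option rewrites both the count and the overstriping state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parsed state; not the structure lfs actually uses. */
struct sketch_stripe_args {
	int64_t stripe_count;
	bool overstriped;
};

/* --stripe-count: sets the count and explicitly means "not overstriped". */
static void sketch_set_stripe_count(struct sketch_stripe_args *args, int64_t n)
{
	args->stripe_count = n;
	args->overstriped = false;
}

/* --overstripe-count: sets the count and requests overstriping. */
static void sketch_set_overstripe_count(struct sketch_stripe_args *args,
					int64_t n)
{
	args->stripe_count = n;
	args->overstriped = true;
}
```

Calling the overstripe setter with 1024 and then the plain setter with 10 leaves a count of 10 without overstriping, the same result as issuing the two commands separately in the transcript above.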


            paf0186 Patrick Farrell added a comment -

            Interesting, OK!  Happy to defer.  I wasn't familiar with "last option takes precedence".

            People

              rajeevm Rajeev Mishra
              rajeevm Rajeev Mishra