Details

    • Technical task
    • Resolution: Fixed
    • Critical
    • Lustre 2.4.0
    • Lustre 2.4.0
    • None
    • 2
    • 4281

    Description

      Per comments in http://review.whamcloud.com/4058, we need to have some discussion/investigation on how the lod_alloc_rr() code is behaving under various circumstances. It isn't clear that it is as flexible as the previous lov alloc_rr/osc_precreate behaviour, and this is an area that has needed a lot of tuning in the past to work well under a variety of conditions.

        Activity

          [LU-2051] verify lod_alloc_rr() code is doing what we want

          adilger Andreas Dilger added a comment:

          The lod_alloc_rr() code itself was fixed to avoid races in updating the state, but it still isn't clear whether things like OST object precreation are working optimally (e.g. whether precreate starts early enough). In particular, a benefit was shown at LUG a long time ago from starting precreate when only 1/4 of the MDS objects were used, rather than waiting until 1/2 were used.

          The ZFS DEGRADED flag setting is LU-4277.
          The QOS balance and RR balance is LU-9.
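To make the 1/4 versus 1/2 precreate trigger concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions and is not Lustre code: the `PrecreatePool` class, its fields, and all numbers (pool size, create rate, refill latency) are hypothetical and chosen purely for illustration. It counts how many create requests find the pool empty for a given trigger point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrecreatePool:
    """Toy model of a pool of precreated objects (hypothetical, not Lustre code)."""

    capacity: int            # objects in a full pool
    refill_threshold: float  # fraction of the pool used before a refill is requested
    refill_latency: int      # ticks between requesting a refill and receiving it

    def simulate(self, creates_per_tick: int, ticks: int) -> int:
        """Return the number of create requests that stalled on an empty pool."""
        available = self.capacity
        pending: Optional[int] = None  # tick at which the in-flight refill completes
        stalled = 0
        for tick in range(ticks):
            # A refill that has completed tops the pool back up.
            if pending is not None and tick >= pending:
                available = self.capacity
                pending = None
            # Serve as many creates as the pool allows; the rest stall.
            served = min(creates_per_tick, available)
            available -= served
            stalled += creates_per_tick - served
            # Request a refill once enough of the pool has been consumed.
            used = self.capacity - available
            if pending is None and used >= self.refill_threshold * self.capacity:
                pending = tick + 1 + self.refill_latency
        return stalled


if __name__ == "__main__":
    # Assumed workload: 128-object pool, 8 creates per tick, 10-tick refill latency.
    early = PrecreatePool(capacity=128, refill_threshold=0.25, refill_latency=10)
    late = PrecreatePool(capacity=128, refill_threshold=0.50, refill_latency=10)
    print("stalled creates, refill at 1/4 used:", early.simulate(8, 200))
    print("stalled creates, refill at 1/2 used:", late.simulate(8, 200))
```

With these made-up numbers the 1/4 trigger never runs the pool dry, while the 1/2 trigger does, because the objects remaining at the trigger point no longer cover the refill latency. Whether real workloads behave this way depends on the actual precreate batch size and OST latency, which this sketch does not model.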

          bzzz Alex Zhuravlev added a comment:

          Andreas, do you think we still need to do this? There was a patch from Seagate (now landed) that improved RR, and they reported that it now behaves much more predictably.

          adilger Andreas Dilger added a comment:

          Several previous bugs come to mind:

          • ensure that precreate count is scaling with load
          • precreate starts early enough so that MDS create doesn't stall
          • skip inactive or degraded OSTs
            • ideally ZFS OSD can set this flag itself?
          • RR allocation is properly balancing objects on OSTs
          • QOS space balance works properly with imbalanced OSTs

          I think there are tests for most of these conditions, but it would be good to verify this.
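Two of the checks above (skipping inactive or degraded OSTs, and RR allocation balancing objects across OSTs) can be sketched as a small verification harness. This is a toy Python model written for illustration, not the lod_alloc_rr() implementation: the `Ost` and `RoundRobinAllocator` classes, the `object_counts` helper, and the OST layout are all hypothetical, and the model ignores everything else the real allocator may take into account.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Ost:
    """Minimal stand-in for an OST's allocation-relevant state (hypothetical)."""

    index: int
    active: bool = True
    degraded: bool = False


class RoundRobinAllocator:
    """Toy round-robin striping over usable OSTs (not lod_alloc_rr())."""

    def __init__(self, osts: List[Ost]) -> None:
        self.osts = osts
        self.next_slot = 0  # position in self.osts where the next search starts

    def allocate(self, stripe_count: int) -> List[int]:
        """Pick up to stripe_count distinct usable OST indices in round-robin order."""
        n = len(self.osts)
        if n == 0:
            return []
        chosen: List[int] = []
        scanned = 0
        # Scan each OST at most once per file so a stripe never repeats an OST.
        while scanned < n and len(chosen) < stripe_count:
            ost = self.osts[(self.next_slot + scanned) % n]
            scanned += 1
            if ost.active and not ost.degraded:
                chosen.append(ost.index)
        # The next file resumes where this scan stopped.
        self.next_slot = (self.next_slot + scanned) % n
        return chosen


def object_counts(allocator: RoundRobinAllocator, files: int, stripe_count: int) -> Counter:
    """Allocate `files` files and count how many objects land on each OST."""
    counts: Counter = Counter()
    for _ in range(files):
        counts.update(allocator.allocate(stripe_count))
    return counts


if __name__ == "__main__":
    # Assumed layout: six OSTs, one inactive (2) and one degraded (4).
    osts = [Ost(i) for i in range(6)]
    osts[2].active = False
    osts[4].degraded = True
    counts = object_counts(RoundRobinAllocator(osts), files=1000, stripe_count=2)
    print(sorted(counts.items()))
```

A real test would gather per-OST object counts from a live filesystem instead; the point of the sketch is only the shape of the check: no objects on skipped OSTs, and a small spread between the most and least loaded usable OSTs.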

          bzzz Alex Zhuravlev added a comment:

          Any specific concern?

          People

            bzzz Alex Zhuravlev
            adilger Andreas Dilger
            Votes: 0
            Watchers: 2
