Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0

    Description

      The MDT space balance heuristic in lmv_locate_tgt_qos() behaves unstably.

      If the MDTs are balanced, it returns -EAGAIN and the normal round-robin code takes over and works properly. However, once the MDTs are imbalanced by more than lq_threshold_rr, the QOS code becomes active. It has an extra check that tries to keep subdirectory creation local to the same MDT when it is deep in the directory tree, to avoid creating too many remote directories:

      static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv, __u32 *mdt,
                                                    unsigned short dir_depth)
      {
      
              /* if current MDT has above-average space, within range of the QOS
               * threshold, stay on the same MDT to avoid creating needless remote
               * MDT directories. It's more likely for low level directories.
               */
              rand = total_avail * (256 - lmv->lmv_qos.lq_threshold_rr) /
                     (total_usable * 256 * (1 + dir_depth / 4));
              if (cur && cur->ltd_qos.ltq_avail >= rand) {
                      tgt = cur;
                      GOTO(unlock, tgt);
              }
      

      There are three factors that make up "rand":

      • total_avail / total_usable is the average (mean) free space across all MDTs
      • (256 - lmv->lmv_qos.lq_threshold_rr) / 256 reduces the average free space slightly (e.g. 95% by default) so the MDT is still considered "balanced" if within qos_threshold_rr of the average MDT free space
      • (1 + dir_depth / 4) is to keep deeper subdirectories on the same MDT as the parent
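The three factors can be seen in isolation with a small standalone sketch. The helper names below (stay_threshold, stays_on_parent) are hypothetical and are not part of the Lustre source. The sketch assumes total_usable is the number of usable MDTs and is non-zero, and that threshold_rr is a fraction of 256 no larger than 256 (a value of 13 would be roughly 5%).

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the "rand" expression quoted above.
 * Assumes total_usable != 0 and threshold_rr <= 256.
 */
static uint64_t stay_threshold(uint64_t total_avail, uint64_t total_usable,
			       unsigned int threshold_rr,
			       unsigned short dir_depth)
{
	/*
	 * mean free space (total_avail / total_usable),
	 * trimmed by (256 - threshold_rr) / 256,
	 * then divided by the depth factor (1 + dir_depth / 4).
	 * dir_depth / 4 is integer division, so the factor only
	 * changes every fourth level.
	 */
	return total_avail * (256 - threshold_rr) /
	       (total_usable * 256 * (1 + dir_depth / 4));
}

/* Non-zero when the parent MDT would be kept by the quoted check. */
static int stays_on_parent(uint64_t parent_avail, uint64_t threshold)
{
	return parent_avail >= threshold;
}
```

With four MDTs holding 4000 units of free space in total and threshold_rr = 13, the threshold is 949 units for depths 0 to 3 and drops to 474 units at depth 4, so a parent MDT with only 500 units free is rejected near the root but kept from depth 4 onward.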

      I think the dir_depth factor makes "rand" too small (it is halved once dir_depth >= 4), so the check becomes ltq_avail >= average / 2 and the client keeps selecting the parent MDT until it has roughly half the average free space. The factor is even stronger for subdirectories created below that level. Instead, the "(1 + dir_depth / 4)" factor should only reduce rand slightly for each increase in subdirectory depth. For example, (16 / (dir_depth + 10)) gives 16/10 = 160% of average at the root (i.e. no preference for the parent MDT unless it has 60% more than the average free space), 16/16 = 100% of average free space at 6 levels deep, 16/22 ≈ 72.7% of average at 12 levels deep, and 16/32 = 50% of average at 22 levels deep.
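To compare the two depth factors side by side, the following sketch expresses each as a percentage of the average free space. The function names are hypothetical, and the proposed factor is only the example formula from the description, not necessarily what the landed patch uses. Integer division truncates, so 16/22 shows up as 72 here.

```c
/* Current factor: 1 / (1 + dir_depth / 4), as a percentage. */
static unsigned int current_depth_pct(unsigned int dir_depth)
{
	/* dir_depth / 4 truncates, so the value steps down every 4 levels */
	return 100 / (1 + dir_depth / 4);
}

/* Example replacement from the description: 16 / (dir_depth + 10). */
static unsigned int proposed_depth_pct(unsigned int dir_depth)
{
	/* scaled by 100 before dividing to keep integer precision */
	return 1600 / (dir_depth + 10);
}
```

The current factor falls from 100% to 50% at depth 4 and to 25% at depth 12, while the example replacement starts at 160% and only reaches 50% at depth 22.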

      Also, rather than always returning "tgt = cur" in the "prefer parent" case, it could return -EAGAIN and leave it up to lmv_locate_tgt() to decide whether to use round-robin or same-MDT allocation for the directory. However, that would require teaching lmv_locate_tgt_rr() to skip MDTs with more than average free space if ltd_qos_is_usable() is true, which is a more complex change. Something along the lines of "skip MDT N times if it has N times less free space than (most_free - average)", preferably taking qos_prio_free into account, but this needs more thought and should be implemented in a second patch.
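The control flow of the "return -EAGAIN and let the caller decide" idea might look roughly like the toy model below. Everything in it is hypothetical: the toy_* types and functions are not Lustre structures, the QOS step simply picks the MDT with the most free space instead of a weighted choice, and the round-robin fallback does not implement the "skip MDT N times" rule, which the description leaves open.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an MDT target; not the Lustre lu_tgt_desc. */
struct toy_mdt {
	uint64_t avail;		/* free space, arbitrary units */
};

struct toy_lmv {
	struct toy_mdt *mdts;
	size_t count;
	size_t rr_next;		/* next index for round-robin */
};

/*
 * Hypothetical QOS step: returns 0 and sets *idx when it picks a
 * target itself, or -EAGAIN when the parent MDT is above the
 * threshold and the decision is left to the caller.
 */
static int toy_locate_qos(const struct toy_lmv *lmv, size_t parent,
			  uint64_t threshold, size_t *idx)
{
	size_t i, best = 0;

	if (lmv->mdts[parent].avail >= threshold)
		return -EAGAIN;

	/* simplification: choose the MDT with the most free space */
	for (i = 1; i < lmv->count; i++)
		if (lmv->mdts[i].avail > lmv->mdts[best].avail)
			best = i;
	*idx = best;
	return 0;
}

/* Caller: use the QOS choice if there is one, else round-robin. */
static size_t toy_locate(struct toy_lmv *lmv, size_t parent,
			 uint64_t threshold)
{
	size_t idx;

	if (toy_locate_qos(lmv, parent, threshold, &idx) == 0)
		return idx;

	idx = lmv->rr_next;
	lmv->rr_next = (lmv->rr_next + 1) % lmv->count;
	return idx;
}
```

The point of the split is that the QOS step no longer forces the parent MDT; the caller keeps a single place where the round-robin versus same-MDT policy is decided.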

          Activity

            [LU-15216] improve MDT QOS space balance
            pjones Peter Jones added a comment -

            Landed for 2.15


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45544/
            Subject: LU-15216 lmv: improve MDT QOS space balance
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 38c4c538f53fb5f0c7f6db6d4970c491184e81a0

            gerrit Gerrit Updater added the comment above.

            "Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45544
            Subject: LU-15216 lmv: improve MDT QOS space balance
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3a444e7f7168258c1d72dee8ce37526bf881fe0e

            gerrit Gerrit Updater added the comment above.

            People

              laisiyao Lai Siyao
              adilger Andreas Dilger
              Votes: 0
              Watchers: 4
