Lustre / LU-11115

OST selection algorithm broken with max_create_count=0 or empty OSTs

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.12.0, Lustre 2.10.5
    • Affects Version/s: Lustre 2.7.0, Lustre 2.10.0
    • Labels: None
    • Environment: Server running:
      CentOS-6 lcentos-release-6-9.el6.12.3.x86_64
      lustre-2.7.3-1nasS_mofed33v3g_2.6.32_642.15.1.el6.20170609.x86_64.lustre273.x86_64
    • Severity: 3

    Description

      We have blocked new object creation to some of our OSTs with commands like:

      lctl set_param osp.$OSTNAME.max_create_count=0

      This is to drain data off storage that will be repurposed as spares. Three targets are already at 0% and have been confirmed with e2scan and lester to hold no remaining objects. Another 11 targets are blocked, and their data is being migrated off.
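To double-check which targets are currently blocked, something like the following could be run on the MDS. It is only a sketch: the function name `list_blocked_osts` is made up for illustration, and it assumes `lctl get_param` prints one `name=value` line per OSP device, as in the `qos_*` output shown in this report.

```shell
# Hypothetical helper (not part of Lustre): list OSP devices whose
# max_create_count is 0, i.e. targets where new object creation is blocked.
# Assumes one "name=value" line per device from lctl get_param.
list_blocked_osts() {
    lctl get_param 'osp.*.max_create_count' 2>/dev/null |
        awk -F= '$2 == 0 { print $1 }'
}

# Usage (on the MDS):
#   list_blocked_osts
```

Comparing this list against the set of targets that were meant to be drained would confirm that no other OST has been blocked by accident.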

      We noticed that a few of the other targets were filling up while others had plenty of space. After watching it for a few days, the imbalance is getting worse.
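One way to track the imbalance over time is to sort the per-OST usage from `lfs df` on a client. The sketch below assumes the usual six-column `lfs df` layout (UUID, 1K-blocks, Used, Available, Use%, Mounted on). The function name `ost_fill_levels` is hypothetical, and the mount point is supplied by the caller.

```shell
# Hypothetical helper: print "Use% target" for every OST, fullest last.
# Assumes the standard lfs df columns, with Use% in field 5 and the
# target UUID (e.g. fsname-OST0000_UUID) in field 1.
ost_fill_levels() {
    lfs df "$1" |
        awk '$1 ~ /OST[0-9a-fA-F]+_UUID$/ { print $5, $1 }' |
        sort -n
}

# Usage (on a client, with the file system mount point as argument):
#   ost_fill_levels /path/to/mountpoint
```

Saving this output once a day would show whether the spread between the emptiest and fullest targets keeps growing.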

      Confirmed that we are using default allocation settings:

      nbp7-mds1 ~ # lctl get_param lov.*.qos_* 
      lov.nbp7-MDT0000-mdtlov.qos_maxage=5 Sec
      lov.nbp7-MDT0000-mdtlov.qos_prio_free=91%
      lov.nbp7-MDT0000-mdtlov.qos_threshold_rr=17%

      Tests creating 100k new files with a stripe count of 1 showed that the fuller OSTs are indeed being allocated objects more often.
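A test along these lines could be sketched as follows. This is not the exact script used for the numbers above. The function name `tally_ost_allocation` and the `alloc_test.N` file names are made up for illustration. It relies on `lfs setstripe -c 1` to create each file with a single stripe and on `lfs getstripe -i` to print the index of the OST holding its object.

```shell
# Hypothetical helper: create COUNT single-stripe files in DIR and report
# how many objects landed on each OST index ("count index" per line).
tally_ost_allocation() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        f="$dir/alloc_test.$i"
        lfs setstripe -c 1 "$f" || break   # create an empty file with one stripe
        lfs getstripe -i "$f"              # print the OST index of that stripe
        i=$((i + 1))
    done | sort -n | uniq -c
}

# Usage (on a client, in a scratch directory on the file system):
#   tally_ost_allocation /path/to/testdir 100000
```

Setting the per-index counts next to each OST's fill level shows directly whether the fuller targets receive a disproportionate share of new objects.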

      This looks like it might be similar to LU-10823.

       


            People

              Assignee: Jian Yu (yujian)
              Reporter: Nathan Dauchy (ndauchy, Inactive)