Lustre / LU-10823

max_create_count triggering uneven distribution across OSTs

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.10.3
    • None
    • Centos 7
    • 2

    Description

      I set two OSTs on a file system to max_create_count=0 to facilitate migrating data off of them so I could destroy and recreate the zpools. Each OST has a corrupted spacemap.

       

      The OSTs in question:

       

      iliad-OST0018_UUID   56246673664 21707936640 34538672768  39% /iliad[OST:24]
      iliad-OST0019_UUID   56246646528 21155081600 35091486464  38% /iliad[OST:25]

      While running lfs_migrate, I found that the very next OST in the list, OST:26 below, was receiving a disproportionate number of files. I set max_create_count=0 for that OST as well and then ran into the same issue on OST:27. As soon as I set max_create_count=0 on OST:27, the problem moved to OST:28.
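For reference, the OST indices shown by lfs df are decimal while the device names use hexadecimal (OST:24 is iliad-OST0018). The sketch below only prints the lctl commands that would disable object creation; the helper name create_count_param is hypothetical, and the osp.<fsname>-OST<hex>-osc-MDT* parameter path is an assumption based on the usual device naming, so check it against the MDS before running anything.

```shell
# Hypothetical helper: build the max_create_count parameter string for one OST.
# Assumes the usual <fsname>-OST<hex index>-osc-MDT<index> device naming on the MDS.
create_count_param() {
    ccp_fsname=$1
    ccp_index=$2    # decimal OST index, as shown by lfs df
    ccp_count=$3    # 0 stops new object creation on that OST
    printf 'osp.%s-OST%04x-osc-MDT*.max_create_count=%s\n' \
        "$ccp_fsname" "$ccp_index" "$ccp_count"
}

# Dry run: print (do not execute) the commands for the OSTs mentioned above.
for idx in 24 25 26 27; do
    echo "lctl set_param $(create_count_param iliad "$idx" 0)"
done
```

Printing the commands first makes it easy to confirm that each decimal index maps to the intended device name.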

       

      The file system was previously well balanced and is about 86% used.

       

      iliad-OST0013_UUID         52.4T 46.2T 6.1T  88% /iliad[OST:19]
      iliad-OST0014_UUID         52.4T 46.2T 6.2T  88% /iliad[OST:20]
      iliad-OST0015_UUID         52.4T 46.5T 5.9T  89% /iliad[OST:21]
      iliad-OST0016_UUID         52.4T 46.3T 6.1T  88% /iliad[OST:22]
      iliad-OST0017_UUID         52.4T 45.1T 7.3T  86% /iliad[OST:23]
      iliad-OST0018_UUID         52.4T 20.1T 32.3T  38% /iliad[OST:24]
      iliad-OST0019_UUID         52.4T 19.6T 32.7T  37% /iliad[OST:25]
      iliad-OST001a_UUID         52.4T 49.1T 3.3T  94% /iliad[OST:26]
      iliad-OST001b_UUID         52.4T 51.6T 841.1G  98% /iliad[OST:27]
      iliad-OST001c_UUID         52.4T 47.6T 4.8T  91% /iliad[OST:28]

       

      Last week I upgraded the metadata/management and object storage servers from CentOS 6 running Lustre 2.7 to CentOS 7 running Lustre 2.10.3 with ZFS 0.7.5. The MDT is still ldiskfs.

       

      I ran a couple of quick tests to adjust the balancing which may help diagnose the issue.

       

      While actively migrating files from OSTs 24 and 25 to the rest of the file system, I tested setting qos_prio_free=95.

       

      $ date; lfs df /iliad  | egrep "OST:(12|28)"
      Fri Mar 16 19:10:10 UTC 2018
      iliad-OST000c_UUID   56246866176 49690633216  6556138368 88% /iliad[OST:12]
      iliad-OST001c_UUID   56246898304 51048918400  5191627264 91% /iliad[OST:28]

      $ date; lfs df /iliad  | egrep "OST:(12|28)"
      Fri Mar 16 19:18:11 UTC 2018
      iliad-OST000c_UUID   56246865920 49691904768  6554891648 88% /iliad[OST:12]
      iliad-OST001c_UUID   56246902912 51064366592  5182073984 91% /iliad[OST:28]

      Change in used space for OST12: (49691904768-49690633216) / 1024 = 1241.75 MiB (lfs df reports 1K blocks)

      Change in used space for OST28: (51064366592-51048918400) / 1024 = 15086.125 MiB
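The same subtraction can be scripted against two saved lfs df snapshots. This is a minimal sketch, assuming the default lfs df layout (UUID, 1K-blocks, Used, Available, Use%, mount point) so that the difference in the third column divided by 1024 is in MiB; the function name used_delta_mib and the temporary files are illustrative only.

```shell
# used_delta_mib BEFORE AFTER
# Print the change in the "Used" column (field 3, 1K blocks) per OST, in MiB.
used_delta_mib() {
    awk 'NR == FNR { before[$1] = $3; next }          # first file: remember Used per UUID
         ($1 in before) { printf "%s %.3f\n", $1, ($3 - before[$1]) / 1024 }' "$1" "$2"
}

# Sample data copied from the two snapshots above.
before=$(mktemp); after=$(mktemp)
cat > "$before" <<'EOF'
iliad-OST000c_UUID   56246866176 49690633216  6556138368 88% /iliad[OST:12]
iliad-OST001c_UUID   56246898304 51048918400  5191627264 91% /iliad[OST:28]
EOF
cat > "$after" <<'EOF'
iliad-OST000c_UUID   56246865920 49691904768  6554891648 88% /iliad[OST:12]
iliad-OST001c_UUID   56246902912 51064366592  5182073984 91% /iliad[OST:28]
EOF

used_delta_mib "$before" "$after"
rm -f "$before" "$after"
```

On a live system the two files would be produced by redirecting the lfs df output at the start and end of the interval.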

       

      OST28 appears to be receiving more than ten times as much data as OST12 (about 12x) over the same interval, which is worse still because OST28 is fuller than most. I then set qos_prio_free back to its default and set qos_threshold_rr to 100 to try to get straight round-robin balancing.

       

      $ date; lfs df /iliad  | egrep "OST:(12|28)"
      Fri Mar 16 19:18:25 UTC 2018
      iliad-OST000c_UUID   56246865920 49691904768  6554891648 88% /iliad[OST:12]
      iliad-OST001c_UUID   56246902912 51064366592  5182106624 91% /iliad[OST:28]

      $ date; lfs df /iliad  | egrep "OST:(12|28)"
      Fri Mar 16 19:32:28 UTC 2018
      iliad-OST000c_UUID   56246865280 49696041856  6550753920 88% /iliad[OST:12]
      iliad-OST001c_UUID   56246904448 51070885248  5175770368 91% /iliad[OST:28]

       

      Change in used space for OST12: (49696041856-49691904768) / 1024 = 4040.125 MiB

      Change in used space for OST28: (51070885248-51064366592) / 1024 = 6365.875 MiB

       

      That is much closer to what blind round-robin allocation would produce for newly written files.
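To compare the two tests directly, the growth of OST28 can be expressed as a multiple of the growth of OST12 within each interval. The two intervals have different lengths (about 8 minutes and about 14 minutes), but the ratio within an interval is unaffected by that. A small sketch, with growth_ratio as a hypothetical helper name:

```shell
# growth_ratio BASE OTHER: print OTHER / BASE to two decimal places.
growth_ratio() {
    awk -v base="$1" -v other="$2" 'BEGIN { printf "%.2f\n", other / base }'
}

# Arguments are the per-OST changes computed above: OST12 first, then OST28.
growth_ratio 1241.75 15086.125    # qos_prio_free=95 test         -> 12.15
growth_ratio 4040.125 6365.875    # qos_threshold_rr=100 test     -> 1.58
```

The ratio drops from roughly 12 to roughly 1.6 once round robin is forced, which matches the observation above.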

       

      Perhaps max_create_count isn't taken into account by the balancing algorithm, so each file that would have landed on a disabled OST goes to the next available OST in order instead. If that is the case, would I be better off deactivating the OSTs?

            People

              wc-triage WC Triage
              jstroik Jesse Stroik