  1. Lustre
  2. LU-15043

OST spill pools should not allow spill pool loops


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.15.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      Using the latest build of Lustre, 2.14.54_92 (build #4421), I created a spill pool loop and encountered some unexpected behavior.
      I created three pools and chained their spill targets into a loop: pool1.spill=pool2, pool2.spill=pool3, and pool3.spill=pool1. I then created a file on pool1, but the file was created on pool2 instead. The same thing happened for pool2 and pool3: files created on them landed on pool3 and pool1, respectively.

      I think we should not allow spill pool loops to be created.
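
      To illustrate the kind of check this request implies, here is a minimal sketch in Python. It is not the Lustre implementation and not necessarily how the fix was done; `find_spill_loop` and `would_create_loop` are hypothetical names, and the spill configuration is modelled as a plain dict from pool name to spill target.

```python
def find_spill_loop(spill_targets, start):
    """Walk the spill chain from `start`; return the looping pools or None.

    `spill_targets` is a hypothetical dict: pool name -> spill target name.
    """
    seen = {}   # pool name -> position in the walk
    path = []   # pools visited so far, in order
    pool = start
    while pool is not None:
        if pool in seen:
            # We came back to a pool already on the path: that is a loop.
            return path[seen[pool]:] + [pool]
        seen[pool] = len(path)
        path.append(pool)
        pool = spill_targets.get(pool)  # None ends the chain
    return None


def would_create_loop(spill_targets, pool, new_target):
    """True if, after setting pool -> new_target, the chain from `pool` loops."""
    proposed = dict(spill_targets)  # do not modify the caller's mapping
    proposed[pool] = new_target
    return find_spill_loop(proposed, pool) is not None
```

      With pool1 spilling to pool2 and pool2 to pool3 already configured, `would_create_loop(..., "pool3", "pool1")` reports the loop described above, so a setter built on such a check could reject the third assignment.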

      Here are more details:
      Created three pools:

      # lfs pool_list scratch.pool1
      Pool: scratch.pool1
      scratch-OST0000_UUID
      # lfs pool_list scratch.pool2
      Pool: scratch.pool2
      scratch-OST0001_UUID
      # lfs pool_list scratch.pool3
      Pool: scratch.pool3
      scratch-OST0002_UUID
      

      Set spill pool and thresholds on both MDSs:

      mds1# lctl get_param lod.scratch-MDT*.pool.*.spill*
      lod.scratch-MDT0000-mdtlov.pool.pool1.spill_is_active=1
      lod.scratch-MDT0000-mdtlov.pool.pool1.spill_target=pool2
      lod.scratch-MDT0000-mdtlov.pool.pool1.spill_threshold_pct=5
      lod.scratch-MDT0000-mdtlov.pool.pool2.spill_is_active=1
      lod.scratch-MDT0000-mdtlov.pool.pool2.spill_target=pool3
      lod.scratch-MDT0000-mdtlov.pool.pool2.spill_threshold_pct=5
      lod.scratch-MDT0000-mdtlov.pool.pool3.spill_is_active=1
      lod.scratch-MDT0000-mdtlov.pool.pool3.spill_target=pool1
      lod.scratch-MDT0000-mdtlov.pool.pool3.spill_threshold_pct=5
      lod.scratch-MDT0002-mdtlov.pool.pool1.spill_is_active=1
      lod.scratch-MDT0002-mdtlov.pool.pool1.spill_target=pool2
      lod.scratch-MDT0002-mdtlov.pool.pool1.spill_threshold_pct=5
      lod.scratch-MDT0002-mdtlov.pool.pool2.spill_is_active=1
      lod.scratch-MDT0002-mdtlov.pool.pool2.spill_target=pool3
      lod.scratch-MDT0002-mdtlov.pool.pool2.spill_threshold_pct=5
      lod.scratch-MDT0002-mdtlov.pool.pool3.spill_is_active=1
      lod.scratch-MDT0002-mdtlov.pool.pool3.spill_target=pool1
      lod.scratch-MDT0002-mdtlov.pool.pool3.spill_threshold_pct=5
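
      The `spill_target` lines above are enough to reconstruct the spill graph on each MDT. The following sketch parses text in the format shown and groups the targets per LOD device. The function name `parse_spill_targets` is hypothetical, and the regular expression is an assumption based only on the parameter names visible in this output.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   lod.scratch-MDT0000-mdtlov.pool.pool1.spill_target=pool2
PARAM_RE = re.compile(
    r"^lod\.(?P<lov>[^.]+)\.pool\.(?P<pool>.+)\.spill_target=(?P<target>\S*)$"
)


def parse_spill_targets(get_param_output):
    """Return {lod_device: {pool: spill_target}} from get_param text."""
    targets = defaultdict(dict)
    for line in get_param_output.splitlines():
        match = PARAM_RE.match(line.strip())
        # Skip other parameters and pools with an empty spill target.
        if match and match["target"]:
            targets[match["lov"]][match["pool"]] = match["target"]
    return dict(targets)
```

      Applied to the output above, this would yield the same three-pool mapping for both `scratch-MDT0000-mdtlov` and `scratch-MDT0002-mdtlov`.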
      

      We see the following in dmesg on mds1, not on mds2:

      [ 9046.643396] LustreError: 5659:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0000_UUID pool pool1: rc = -17
      [ 9056.957864] LustreError: 5666:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0001_UUID pool pool2: rc = -17
      [ 9065.980468] LustreError: 5674:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0002_UUID pool pool3: rc = -17
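
      Kernel return codes like `rc = -17` are negated errno values; on Linux, errno 17 is EEXIST. A small helper (the name `describe_rc` is hypothetical) makes that decoding explicit. What already existed in the quota pool is not stated in the log, so this only decodes the number.

```python
import errno
import os


def describe_rc(rc):
    """Translate a negative kernel-style return code into an errno name."""
    code = -rc
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"
```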
      

      Create files on specific OST pools:

      # lfs setstripe -p pool1 -c -1 /lustre/scratch/file1
      # lfs getstripe -p /lustre/scratch/file1
      pool2
      # lfs setstripe -p pool2 -c -1 /lustre/scratch/file2
      # lfs getstripe -p /lustre/scratch/file2
      pool3
      # lfs setstripe -p pool3 -c -1 /lustre/scratch/file3
      # lfs getstripe -p /lustre/scratch/file3
      pool1
      

      We see the following on MDS0:

      [10198.677195] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool1->pool2'
      [10223.616652] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool2->pool3'
      [10234.693511] Lustre: 1538:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool3->pool1' 
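
      One possible reading of these messages, offered as a toy model rather than the actual `lod_check_and_spill_pool()` logic: if the MDT keeps following spill targets and gives up after 10 hops, a three-pool loop leaves the walk one pool past where it started, because 10 mod 3 = 1. The hop limit is taken from the log text; the exact stopping rule and the name `resolve_pool` are assumptions.

```python
def resolve_pool(spill_targets, requested, max_hops=10):
    """Follow spill targets from `requested`, stopping after `max_hops` hops.

    Returns the pool reached and the number of hops taken.
    """
    pool = requested
    hops = 0
    while pool in spill_targets and hops < max_hops:
        pool = spill_targets[pool]
        hops += 1
    return pool, hops


loop = {"pool1": "pool2", "pool2": "pool3", "pool3": "pool1"}
for name in ("pool1", "pool2", "pool3"):
    final, hops = resolve_pool(loop, name)
    print(f"{name} -> {final} after {hops} hops")
```

      Under this model a file requested on pool1 ends up on pool2, pool2 on pool3, and pool3 on pool1, which is consistent with the `lfs getstripe -p` results reported above.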
      

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: jamesanunez James Nunez (Inactive)