Details
Bug
Resolution: Fixed
Major
Lustre 2.15.0
None
3
Description
Using the latest build of Lustre, 2.14.54_92 build # 4421, I created a spill pool loop and encountered some unexpected behavior.
I created three pools and chained their spill targets into a loop: pool1.spill=pool2, pool2.spill=pool3, and pool3.spill=pool1. I then created a file on pool1, but it was created on pool2 instead. The same thing happened with pool2 and pool3: files requested on those pools were created on pool3 and pool1, respectively.
I think we should not allow spill pool loops to be created.
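To illustrate the kind of check I have in mind, here is a minimal Python sketch. It is not Lustre code: the function name would_create_loop and the dictionary representation of spill targets are hypothetical and used only for illustration. It walks the spill chain from the proposed target and reports whether the chain would revisit a pool.

```python
def would_create_loop(spill_targets, pool, new_target):
    """Return True if pointing `pool` at `new_target` yields a spill chain
    that revisits a pool (that is, a loop).

    spill_targets maps pool name -> current spill target (hypothetical model).
    """
    seen = {pool}
    current = new_target
    while current is not None:
        if current in seen:
            return True
        seen.add(current)
        # Pools without a configured spill target end the chain.
        current = spill_targets.get(current)
    return False


# Configuration from this report just before the last link is added.
existing = {"pool1": "pool2", "pool2": "pool3"}

print(would_create_loop(existing, "pool3", "pool1"))  # True: closes the loop
print(would_create_loop(existing, "pool3", "pool4"))  # False: chain ends at pool4
```

With a check like this at configuration time, the third assignment (pool3.spill=pool1) could be rejected instead of being accepted silently.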
Here are more details:
Created three pools:
# lfs pool_list scratch.pool1
Pool: scratch.pool1
scratch-OST0000_UUID
# lfs pool_list scratch.pool2
Pool: scratch.pool2
scratch-OST0001_UUID
# lfs pool_list scratch.pool3
Pool: scratch.pool3
scratch-OST0002_UUID
Set spill pool and thresholds on both MDSs:
mds1# lctl get_param lod.scratch-MDT*.pool.*.spill*
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_threshold_pct=5
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_threshold_pct=5
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_target=pool1
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_target=pool1
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_threshold_pct=5
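For anyone who wants to audit a configuration like this from the outside, the get_param output above is easy to turn into a per-MDT spill map. The following Python sketch is only an illustration: the helper name parse_spill_targets is hypothetical, and the regular expression assumes the parameter naming shown in the output above (lod.<device>.pool.<pool>.spill_target=<target>).

```python
import re
from collections import defaultdict

# Assumes the parameter naming seen in the get_param output in this report.
PARAM_RE = re.compile(
    r"^lod\.(?P<dev>[^.]+)\.pool\.(?P<pool>[^.]+)\.spill_target=(?P<target>\S*)$"
)


def parse_spill_targets(text):
    """Build {device: {pool: spill_target}} from `lctl get_param` output."""
    per_device = defaultdict(dict)
    for line in text.splitlines():
        match = PARAM_RE.match(line.strip())
        if match:  # spill_is_active and spill_threshold_pct lines are skipped
            per_device[match["dev"]][match["pool"]] = match["target"]
    return dict(per_device)


# A subset of the output captured on mds1 in this report.
sample = """\
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_target=pool1
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_target=pool1
"""

maps = parse_spill_targets(sample)
for device, targets in sorted(maps.items()):
    print(device, targets)

# Both MDTs should carry the same spill configuration.
print(len({tuple(sorted(t.items())) for t in maps.values()}) == 1)
```

The resulting dictionaries make it straightforward to confirm that both MDTs hold the same pool1 -> pool2 -> pool3 -> pool1 chain.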
We see the following in dmesg on mds1, not on mds2:
[ 9046.643396] LustreError: 5659:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0000_UUID pool pool1: rc = -17
[ 9056.957864] LustreError: 5666:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0001_UUID pool pool2: rc = -17
[ 9065.980468] LustreError: 5674:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can't scratch-QMT0000 scratch-OST0002_UUID pool pool3: rc = -17
Create files on specific OST pools:
# lfs setstripe -p pool1 -c -1 /lustre/scratch/file1
# lfs getstripe -p /lustre/scratch/file1
pool2
# lfs setstripe -p pool2 -c -1 /lustre/scratch/file2
# lfs getstripe -p /lustre/scratch/file2
pool3
# lfs setstripe -p pool3 -c -1 /lustre/scratch/file3
# lfs getstripe -p /lustre/scratch/file3
pool1
We see the following on MDS0:
[10198.677195] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool1->pool2'
[10223.616652] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool2->pool3'
[10234.693511] Lustre: 1538:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for 'pool3->pool1'
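One possible explanation for where the files ended up, offered as a guess rather than a reading of lod_check_and_spill_pool(): if the MDS follows spill targets for exactly 10 hops and then gives up, a three-pool loop always stops one pool past the requested one, because 10 mod 3 = 1. The Python sketch below models that assumption. The function resolve_pool, the 10-hop cutoff behavior, and the premise that every pool in the loop is treated as over its threshold are all assumptions made for illustration.

```python
MAX_SPILL_LEVELS = 10  # taken from the "more than 10 levels" console message


def resolve_pool(spill_targets, requested, max_levels=MAX_SPILL_LEVELS):
    """Follow spill targets from `requested` for at most `max_levels` hops.

    Assumes every pool on the path is over its spill threshold.
    Returns (final_pool, hit_limit).
    """
    current = requested
    for _ in range(max_levels):
        target = spill_targets.get(current)
        if target is None:
            return current, False  # chain ended normally
        current = target
    # The limit was hit if the chain could still continue from here.
    return current, spill_targets.get(current) is not None


loop = {"pool1": "pool2", "pool2": "pool3", "pool3": "pool1"}
for pool in ("pool1", "pool2", "pool3"):
    final, hit_limit = resolve_pool(loop, pool)
    print(f"{pool} -> {final} (limit hit: {hit_limit})")
# pool1 -> pool2 (limit hit: True)
# pool2 -> pool3 (limit hit: True)
# pool3 -> pool1 (limit hit: True)
```

Under these assumptions the model reproduces both the lfs getstripe results and the 'pool1->pool2', 'pool2->pool3', 'pool3->pool1' pairs in the console messages, but the actual hop accounting in the MDS may differ.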