[LU-16522] "lfs setstripe -i N" with deactivated OST(s) always picks next active OST Created: 02/Feb/23 Updated: 08/Feb/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.16.0, Lustre 2.15.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Andreas Dilger | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Description |
|
When OSTs are disabled with osp.*.max_create_count=0 and a file is created with a specific layout (i.e. one that explicitly selects the starting OST index) that names one of the disabled OSTs, the MDS object allocator always selects the first available OST after the disabled ones. For example, on an 8-OST filesystem where OST0000-OST0003 are all disabled, trying to create files explicitly on any of those OSTs always results in the objects being allocated from OST0004.
# lctl set_param osp.testfs-OST000[0-3]*.max_create_count=0
osp.testfs-OST0000-osc-MDT0000.max_create_count=0
osp.testfs-OST0001-osc-MDT0000.max_create_count=0
osp.testfs-OST0002-osc-MDT0000.max_create_count=0
osp.testfs-OST0003-osc-MDT0000.max_create_count=0
# for O in {0..3}; do lfs setstripe -i $O /mnt/testfs/ost$O; done
# lfs getstripe -i /mnt/testfs/ost* | sort | uniq -c
4
4 4
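To illustrate why every request collapses onto OST0004, here is a minimal shell model of a "walk forward to the next enabled OST" policy. It is only a sketch of the behaviour observed above, not the actual LOD code; the names OST_COUNT, DISABLED, is_disabled and pick_next_active are hypothetical, and the wrap-around at the end of the OST list is an assumption.

```shell
# Hypothetical model of the observed behaviour, not the MDS LOD code.
# DISABLED lists the OST indices that have max_create_count=0.
OST_COUNT=8
DISABLED="0 1 2 3"

# Return success if the given OST index is in the DISABLED list.
is_disabled() {
    for d in $DISABLED; do
        if [ "$d" -eq "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Walk forward from the requested index until an enabled OST is found.
# Wrapping past the last OST is an assumption made for this sketch.
pick_next_active() {
    idx=$1
    tries=0
    while [ "$tries" -lt "$OST_COUNT" ]; do
        if ! is_disabled "$idx"; then
            echo "$idx"
            return 0
        fi
        idx=$(( (idx + 1) % OST_COUNT ))
        tries=$(( tries + 1 ))
    done
    return 1    # every OST is disabled
}

# Same requests as the "lfs setstripe -i" loop above.
for O in 0 1 2 3; do pick_next_active "$O"; done | sort | uniq -c
```

With this policy, any request for index 0 through 3 resolves to 4, so a contiguous run of disabled OSTs funnels all of its explicit requests onto the single OST that follows it.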
This issue does not affect "normal" file creations that do not specify the starting OST index of files:
# touch /mnt/testfs/file{0..99}
# lfs getstripe -i /mnt/testfs/file{0..99} | sort | uniq -c
100
25 4
25 5
25 6
25 7
It does affect "lfs migrate" when no layout is given, because it copies the layout from the source files, including the starting index (
# lfs getstripe -i /mnt/testfs/old* | sort | uniq -c
100
13 0
13 1
13 2
12 3
12 4
12 5
12 6
13 7
# lfs migrate /mnt/testfs/old*
# lfs getstripe -i /mnt/testfs/old* | sort | uniq -c
100
55 4
15 5
15 6
15 7
If a large number of OSTs were disabled while "lfs migrate" is used without any arguments (copying the specific layout from the source file prior to patch https://review.whamcloud.com/49865 "
# lfs getstripe -i /mnt/testfs/old* | sort | uniq -c
100
13 0
13 1
12 2
12 3
12 4
12 5
13 6
13 7
# lfs migrate -c 1 /mnt/testfs/old*
# lfs getstripe -i /mnt/testfs/old* | sort | uniq -c
100
25 4
25 5
25 6
25 7
Fixing "lfs migrate" to reset the OST index in the source layout avoids the problem in this case, but it would also be worthwhile to fix the MDS LOD OST selection code, so that other tools which provide specific layouts via saved/copied xattrs (e.g. "tar", and maybe "rsync" or "cp" in the future) do not encounter the same problem. If the client explicitly requests an OST index that is inactive or disabled, the MDS should pick a random or weighted OST index (possibly within the same OST pool) rather than just picking the next available OST index. |
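As a rough illustration of the proposed fallback, the sketch below honours a requested index when that OST is enabled and otherwise picks uniformly at random among the enabled OSTs. This is a model only, not a patch for LOD: pick_random_active and its arguments (OST count, disabled list, optional seed) are hypothetical, and it uses a plain uniform choice rather than free-space weighting or OST pool membership.

```shell
# Hypothetical sketch of the proposed fallback; not the MDS LOD code.
# Usage: pick_random_active <ost_count> "<disabled indices>" [seed]
# Reads requested OST indices on stdin, one per line.
pick_random_active() {
    awk -v count="$1" -v disabled="$2" -v seed="${3:-1}" '
    BEGIN {
        srand(seed)
        # Record the disabled indices, then collect the enabled ones.
        n = split(disabled, d, " ")
        for (i = 1; i <= n; i++) off[d[i]] = 1
        for (i = 0; i < count; i++)
            if (!(i in off)) active[na++] = i
    }
    {
        # An enabled OST is used exactly as requested.
        if (!($1 in off)) { print $1; next }
        # No enabled OST left: give up.
        if (na == 0) exit 1
        # Disabled request: uniform pick among the enabled OSTs.
        print active[int(rand() * na)]
    }'
}

# Same requests as the original reproducer: OST0000-OST0003 disabled.
printf '%s\n' 0 1 2 3 | pick_random_active 8 "0 1 2 3" | sort | uniq -c
```

Unlike the "next available" policy, requests for disabled OSTs are spread over OST0004-OST0007 instead of all landing on OST0004. A weighted variant would replace the uniform pick with one biased by free space.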