[LU-8417] setstripe -o does not work on directories Created: 20/Jul/16 Updated: 30/Nov/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major |
| Reporter: | Gary Hagensen (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | easy |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The -o option to setstripe, which specifies the OSTs to use, works on files but returns an error when the same command is run on a directory.
[root@Lustre-TG1 lustrefs]# mkdir testdir
[root@Lustre-TG1 lustrefs]# lfs setstripe -o 0-3 -c 4 testfile
[root@Lustre-TG1 lustrefs]# lfs getstripe testfile
testfile
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 0
obdidx objid objid group
0 9925 0x26c5 0
1 1606 0x646 0
2 1608 0x648 0
3 1609 0x649 0
[root@Lustre-TG1 lustrefs]# lfs setstripe -o 0-3 -c 4 testdir
error on ioctl 0x4008669a for 'testdir' (3): Invalid argument
error: setstripe: create stripe file 'testdir' failed
|
| Comments |
| Comment by Andreas Dilger [ 20/Jul/16 ] |
|
This would need to store the full uninitialized LOV EA on the MDS directory as the template for the file layout. Based on discussion with Gary, this is needed for testing OSS performance. For "real world" usage, it would probably serve as a "poor man's OST pool": it would be possible to specify a list of OSTs together with a stripe count less than the total OST count, and so select a subset of OSTs to create files on. |
| Comment by Andreas Dilger [ 20/Jul/16 ] |
|
One issue with using "-o ... -c" to implement "temporary pools" is that there is currently no place to store the object allocation state across file creates, as there is with a proper pool. Without that state the allocator cannot do smart round-robin allocation, i.e. track the last OST index used and avoid placing stripe 0 on the same OST repeatedly (e.g. 0+1, 2+3, 4+5, 6+7, 1+2, 3+4, ...). The MDS would somehow need to dynamically create an internal allocation state based on the specified OST list and keep it in memory for some time to handle further allocations with the same group of OSTs. It is OK if the same OST list is shared by different directories, since the file creation and IO load on the OSTs is also shared. There is no significant issue if the allocation state is dropped after some idle time (e.g. a couple of minutes), since the load on the OSTs is also transient. Ideally this would be implemented as part of LU-9 so that the imbalance of file creates on a subset of OSTs does not cause global imbalance across other OSTs. |
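The allocation state described above can be sketched roughly as follows. This is only an illustration in Python, not Lustre code: the class names `OstListAllocator` and `AllocatorCache`, the 120-second idle timeout, and the "shift by one after wrapping" rule are all assumptions chosen to reproduce the 0+1, 2+3, 4+5, 6+7, 1+2, 3+4, ... sequence from the comment.

```python
import time


class OstListAllocator:
    """Hypothetical round-robin state for one explicit OST list."""

    def __init__(self, ost_list):
        self.ost_list = list(ost_list)
        self.next_pos = 0  # position in ost_list where the next file starts

    def allocate(self, stripe_count):
        n = len(self.ost_list)
        if not 0 < stripe_count <= n:
            raise ValueError("stripe count must be between 1 and len(ost_list)")
        start = self.next_pos
        stripes = [self.ost_list[(start + i) % n] for i in range(stripe_count)]
        end = start + stripe_count
        if end >= n:
            # Reached the end of the list: skew the next pass by one so that
            # stripe 0 does not land on the same OSTs as in the previous pass.
            self.next_pos = (end + 1) % n
        else:
            self.next_pos = end
        return stripes


class AllocatorCache:
    """Hypothetical in-memory cache of allocators, keyed by the OST list.

    Directories that specify the same OST list share one allocator, and
    entries that stay idle longer than idle_timeout seconds are dropped.
    """

    def __init__(self, idle_timeout=120.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.entries = {}  # tuple(ost_list) -> (allocator, last_used)

    def allocate(self, ost_list, stripe_count):
        now = self.clock()
        # Drop state for OST lists that have been idle for too long.
        self.entries = {
            key: (alloc, used)
            for key, (alloc, used) in self.entries.items()
            if now - used <= self.idle_timeout
        }
        key = tuple(ost_list)
        alloc = self.entries[key][0] if key in self.entries else OstListAllocator(key)
        stripes = alloc.allocate(stripe_count)
        self.entries[key] = (alloc, now)
        return stripes


if __name__ == "__main__":
    cache = AllocatorCache()
    for _ in range(6):
        print(cache.allocate(range(8), 2))
```

Keying the cache on the OST list rather than on the directory reflects the point that several directories can share one state. Losing an entry after the timeout only restarts the sequence at the first OST in the list.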
| Comment by Andreas Dilger [ 20/Jul/16 ] |
|
Link to original |
| Comment by Andreas Dilger [ 07/Nov/17 ] |
|
It looks like https://review.whamcloud.com/12275 implements this to some extent. |
| Comment by Andreas Dilger [ 30/Nov/23 ] |
|
Specifying explicit OSTs on a directory appears to have been working since at least 2.14:
# lfs setstripe -o 2,3,2,3 /mnt/testfs/specific
# touch /mnt/testfs/specific/fff
# lfs getstripe --yaml /mnt/testfs/specific
stripe_count: 4
stripe_size: 1048576
pattern: raid0,overstriped
stripe_offset: 2
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: raid0,overstriped
lmm_layout_gen: 0
lmm_stripe_offset: 2
lmm_objects:
  - l_ost_idx: 2
    l_fid: 0x380000401:0x79:0x0
  - l_ost_idx: 3
    l_fid: 0x3c0000401:0x79:0x0
  - l_ost_idx: 2
    l_fid: 0x380000401:0x7a:0x0
  - l_ost_idx: 3
    l_fid: 0x3c0000401:0x7a:0x0
The one remaining issue is that "lfs getstripe" on the directory does not print the specific layout properly. It should print the specific OST indices and not just the stripe count. |
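To make concrete what "print the specific OST indices" could mean, here is a small Python sketch that pulls the `l_ost_idx` values out of the per-file part of the output above and renders them in the form accepted by `-o`. The helper names `ost_indices` and `as_setstripe_arg` are hypothetical, and this is only a text-processing illustration, not how `lfs getstripe` itself would be changed.

```python
import re

# Per-file portion of the "lfs getstripe --yaml" output quoted above.
SAMPLE = """\
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: raid0,overstriped
lmm_layout_gen: 0
lmm_stripe_offset: 2
lmm_objects:
  - l_ost_idx: 2
    l_fid: 0x380000401:0x79:0x0
  - l_ost_idx: 3
    l_fid: 0x3c0000401:0x79:0x0
  - l_ost_idx: 2
    l_fid: 0x380000401:0x7a:0x0
  - l_ost_idx: 3
    l_fid: 0x3c0000401:0x7a:0x0
"""


def ost_indices(getstripe_yaml):
    """Return the OST index of each stripe, in stripe order."""
    return [int(idx) for idx in re.findall(r"l_ost_idx:\s*(\d+)", getstripe_yaml)]


def as_setstripe_arg(indices):
    """Format the indices as a comma-separated list, as passed to -o."""
    return ",".join(str(i) for i in indices)


if __name__ == "__main__":
    print(as_setstripe_arg(ost_indices(SAMPLE)))
```

For the file created above, this recovers the same list that was passed to `lfs setstripe -o`, which is the information the directory-level output currently omits.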