[LU-13076] DNE3: lfs migrate -m should allow -1 as the target index Created: 13/Dec/19 Updated: 04/Dec/21 Resolved: 27/Oct/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | dne3 | ||
| Issue Links: |
|
| Description |
|
When migrating directory trees to new MDTs for space balancing, it would be very convenient to allow specifying "lfs migrate -m -1" to have "lfs" pick the MDT with the most free inodes as the target. This is similar to the "lfs setdirstripe -i -1" functionality, but for migration. If multiple directories are specified on the command line, it probably makes sense to refresh the statfs information after each directory tree is migrated, in case a different MDT now has more free space. This would greatly simplify directory migration on an existing filesystem, since the user would not have to specify the details. 
If this is already implemented (it didn't look like parse_targets() would handle 
I suspect this would be relatively easy to implement in Lustre 2.12, because the mechanism to select the best MDT is already present in lfs. |
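The selection behaviour requested above can be sketched roughly as follows. This is only an illustration of the idea (pick the MDT with the most free inodes, and refresh the statfs data before each directory), not the code in lfs. The names `pick_mdt`, `plan_migrations`, `get_free_inodes`, and `migrate` are hypothetical, and the free-inode map stands in for whatever statfs data the tool actually collects.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def pick_mdt(free_inodes: Dict[int, int]) -> int:
    """Return the MDT index with the most free inodes (lowest index wins ties).

    `free_inodes` maps MDT index -> free inode count; hypothetical input.
    """
    if not free_inodes:
        raise ValueError("no MDTs available")
    # Iterate indices in ascending order so max() keeps the lowest index on ties.
    return max(sorted(free_inodes), key=lambda idx: free_inodes[idx])


def plan_migrations(
    dirs: Iterable[str],
    get_free_inodes: Callable[[], Dict[int, int]],
    migrate: Callable[[str, int], None],
) -> List[Tuple[str, int]]:
    """Migrate each directory to the currently emptiest MDT.

    `get_free_inodes` is called again before every directory, so a target
    that filled up during the previous migration is not blindly reused.
    Both callbacks are assumptions standing in for statfs and the migration.
    """
    plan = []
    for d in dirs:
        target = pick_mdt(get_free_inodes())  # refresh before each directory
        migrate(d, target)
        plan.append((d, target))
    return plan
```

With fake callbacks where each migration consumes inodes on its target, successive directories are spread over different MDTs instead of all landing on the MDT that was emptiest at the start.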
| Comments |
| Comment by Lai Siyao [ 13/Dec/19 ] |
|
'lfs migrate -m -1' is not allowed in 2.12, and in the implementation of |
| Comment by Andreas Dilger [ 15/Dec/19 ] |
|
Doesn't it make more sense to have "lfs migrate -m N -c -1" do the directory split, or similar? The "-m" option is used as the DNE stripe index, not the stripe count. |
| Comment by Lai Siyao [ 16/Dec/19 ] |
|
"lfs migrate -m N -c -1" doesn't look meaningful to me: migrate the directory to MDT N with stripe count -1? Maybe we can add a new command "lfs restripe -c -H <dir>" for directory split/merge. BTW, letting the system choose the target MDTs may not be optimal in some cases:
|
| Comment by Andreas Dilger [ 17/Dec/19 ] |
|
Ah, I think I understand where my confusion is. Am I correct that you are separating the use of "lfs migrate -m N" to mean "migrate parent directory with all inodes to new MDT 'N'" and "lfs migrate -m N -c -1" to mean the new "split parent directory but leave inodes in place" code that you are working on? I think that "migrate" should generally mean "move inodes to new MDT", while "MDT auto split" should not move existing inodes, since that would change inode numbers/locks and could cause problems. 
I think it is useful to have the meaning of "lfs migrate -m N -c C" for directories be similar to "lfs migrate -i N -c C" for regular files, where "N = -1" means "pick the target index for me" and "C = -1" means "stripe over all targets". 
Being able to specify "split directory now" is important for testing, but maybe a different command like "lfs setstripe -c <dir>" on the existing directory is the right command for splitting the directory to have a different number of stripes? For stripe_count=1 directories it is possible to add any kind of hash function to the existing directory, so "crush" would be preferred, and then setting the number of stripes would just migrate the names without affecting the inodes? 
I also agree that a simple implementation of "lfs migrate -m -1" might not pick the best MDTs, but users (other than you and me) are even less likely to pick the best MDTs. That means that the implementation of "-m -1" needs to be smart enough to pick good MDTs when possible. It should be possible to disable MDTs for new directory creation (similar to "lctl set_param osp.$fsname-OST0000.max_create_count=0" for OSTs) so that the MDS does not put new directories on that MDT. That would allow an MDT to be emptied out with a command like "lfs find -type d -m N | lfs migrate -m -1" without the user having to specify the target for every directory. 
Also, when the user is doing MDT balancing, "lfs migrate -m -1" needs to move some directories and inodes to the empty MDTs instead of just splitting existing directories onto new MDTs without moving any inodes; otherwise the full MDT will not be any less full. Maybe this can be done by weighting the current MDT higher than other MDTs, but without keeping all directories/inodes on the original MDT (e.g. using qos_mdt_prio_free)? |
| Comment by Lai Siyao [ 17/Dec/19 ] |
|
The problem with using "lfs setstripe -c <dir>" (or should it be "lfs setdirstripe -c <dir>"?) is that this command is currently used to create new striped directories. If it needs to support directory split, it has to verify whether the target directory exists, which will degrade striped directory creation performance. Do you think that is acceptable? |
| Comment by Lai Siyao [ 17/Dec/19 ] |
|
Besides, to achieve the best performance after migration, IMO it should be done in user space by a policy engine, which is more flexible and easier to customize. |
| Comment by Andreas Dilger [ 17/Dec/19 ] |
|
You are correct that I meant "setdirstripe -c". It is fine if it "checks" whether the directory exists by trying to create it; if this fails with -EEXIST or -EISDIR, it can then try to change the stripe count with a second RPC. I don't think this would impact the performance of "setdirstripe -c", or at least not in any critical way, because there will likely be many more RPCs to migrate the entries. I also agree that a user space policy engine might be able to do a more optimal job than the MDS, but it should at least be possible to do a basic job of directory migration without a policy engine. |
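The "try to create, fall back on -EEXIST" flow described above could look roughly like the sketch below. It uses plain `os.mkdir` as a stand-in for striped directory creation, and `restripe` is a hypothetical callback for the second RPC that would change the stripe count; neither is the actual Lustre interface.

```python
import os
from typing import Callable


def create_or_restripe(
    path: str,
    stripe_count: int,
    restripe: Callable[[str, int], None],
) -> str:
    """Create `path`; if it already exists, restripe it instead.

    No existence check is made up front, so the common "new directory"
    case costs a single call. Only the failure path pays for a second one.
    """
    try:
        os.mkdir(path)  # stand-in for creating a striped directory
        return "created"
    except (FileExistsError, IsADirectoryError):  # EEXIST / EISDIR
        restripe(path, stripe_count)  # hypothetical second step
        return "restriped"
```

The point of the design is that creation performance is unchanged for new directories, because the extra work happens only after the create has already failed.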
| Comment by Cliff White (Inactive) [ 05/Mar/21 ] |
|
We currently have a customer with 24 MDTs. They used -c 24 for all their directories and now wish to re-stripe with -c 1. Given that the customer lacks experience, I don't want them to have to manually choose a new target with lfs migrate -m. It would be very good to have -m -1; we need hands-free restriping here. |
| Comment by Gerrit Updater [ 10/Sep/21 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44886 |
| Comment by Gerrit Updater [ 27/Oct/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44886/ |
| Comment by Peter Jones [ 27/Oct/21 ] |
|
Landed for 2.15 |