[LU-17182] lctl pool_add is slow when using individual OST Created: 11/Oct/23 Updated: 22/Jan/24 Resolved: 25/Oct/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Raphael Druon | Assignee: | Feng Lei |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||
| Description |
|
Using lctl pool_add FS OST0 OST1... is much slower than using a hostlist expression like OST[0-7]. See the example below:

[root@localhost ~]# time lctl pool_add fs.pool_test fs-OST[0-7]
OST fs-OST0000_UUID added to pool fs.pool_test
OST fs-OST0001_UUID added to pool fs.pool_test
OST fs-OST0002_UUID added to pool fs.pool_test
OST fs-OST0003_UUID added to pool fs.pool_test
OST fs-OST0004_UUID added to pool fs.pool_test
OST fs-OST0005_UUID added to pool fs.pool_test
OST fs-OST0006_UUID added to pool fs.pool_test
OST fs-OST0007_UUID added to pool fs.pool_test
real 0m9.008s
user 0m0.000s
sys 0m0.007s

[root@localhost ~]# time lctl pool_add fs.pool_test OST0000 OST0001 OST0002 OST0003 OST0004 OST0005 OST0006 OST0007
OST fs-OST0000_UUID added to pool fs.pool_test
OST fs-OST0001_UUID added to pool fs.pool_test
OST fs-OST0002_UUID added to pool fs.pool_test
OST fs-OST0003_UUID added to pool fs.pool_test
OST fs-OST0004_UUID added to pool fs.pool_test
OST fs-OST0005_UUID added to pool fs.pool_test
OST fs-OST0006_UUID added to pool fs.pool_test
OST fs-OST0007_UUID added to pool fs.pool_test
real 1m7.024s
user 0m0.004s
sys 0m0.014s

One would expect both commands to have the same runtime. |
| Comments |
| Comment by Peter Jones [ 11/Oct/23 ] |
|
Feng Lei, could you please investigate? Thanks, Peter |
| Comment by Andreas Dilger [ 11/Oct/23 ] |
|
Yes, this is kind of a "known" problem - the "lctl pool*" commands all wait for the pool status update to arrive at the client, which is asynchronous and takes several seconds to complete. You can see that the individual OST list takes 67s, which is close to 8x the 9s taken to do all of them at once. |
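The arithmetic behind that comparison can be checked with a short sketch. The timings are the "real" values from the description; the helper name and the assumption that the whole runtime is spent waiting (one wait per OST argument) are illustrative simplifications, not measurements of lctl internals.

```python
# Timings copied from the "real" lines in the description.
BATCH_SECONDS = 9.008        # lctl pool_add fs.pool_test fs-OST[0-7]
INDIVIDUAL_SECONDS = 67.024  # the same eight OSTs listed one by one
NUM_OSTS = 8


def per_target_wait(total_seconds: float, num_targets: int) -> float:
    """Average time per argument, assuming the runtime is all waiting."""
    return total_seconds / num_targets


per_ost = per_target_wait(INDIVIDUAL_SECONDS, NUM_OSTS)
slowdown = INDIVIDUAL_SECONDS / BATCH_SECONDS

print(f"average wait per OST argument: {per_ost:.2f}s")
print(f"slowdown versus the single range: {slowdown:.1f}x")
```

Under that assumption each individually listed OST costs roughly one full wait, which is consistent with one status check per argument rather than one per command.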
| Comment by Joe Grund [ 11/Oct/23 ] |
|
adilger Is it possible to treat the second version (individual OST list) the same as the first (index set), i.e. have both versions run in parallel? |
| Comment by Andreas Dilger [ 11/Oct/23 ] |
|
I was going to say that the main problem is that the "add one at a time" case doesn't know whether later commands will be run or not. They are all treated separately, and the command explicitly waits for the pool layout to be updated. That wait was added because the pool command would return without the actual update and give the admin a false impression that the pool had been successfully updated, when that wasn't always the case.

One option would be to add an "--async" option to the pool commands that skips calling check_pool_cmd_result(), so that if you know a number of them will be executed (e.g. from EMF), the preliminary ones are executed without waiting (with a low chance of the option being used incorrectly by a user), and EMF can check the results afterward.

However, I now see that the second case is not "lctl pool_add ...; lctl pool_add ...; ..." but rather a single execution with multiple OSTs on the command line. It should be possible to handle this with a single check at the end. It looks like jt_pool_cmd() would need to build up the OST list and execute all of the adds (removes) at once, instead of calling check_pool_cmd_result() for each argument separately. |
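The difference between checking after every argument and checking once at the end can be sketched as follows. This is a Python model of the control flow only, not the Lustre C code: send_add and wait_for_update are hypothetical stand-ins for issuing the pool command and for the wait done by check_pool_cmd_result(), and the OST names are the ones from the description.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins: send_add issues one add, wait_for_update blocks
# until the client sees the pool update for the given targets.
SendAdd = Callable[[str], None]
WaitForUpdate = Callable[[Sequence[str]], None]
Strategy = Callable[[Sequence[str], SendAdd, WaitForUpdate], None]


def add_with_check_per_target(targets: Sequence[str], send_add: SendAdd,
                              wait_for_update: WaitForUpdate) -> None:
    """Behaviour described in the report: one wait per argument."""
    for target in targets:
        send_add(target)
        wait_for_update([target])


def add_with_single_check(targets: Sequence[str], send_add: SendAdd,
                          wait_for_update: WaitForUpdate) -> None:
    """Suggested behaviour: issue every add, then wait once at the end."""
    pending: List[str] = []
    for target in targets:
        send_add(target)
        pending.append(target)
    if pending:
        wait_for_update(pending)


def trace(strategy: Strategy,
          targets: Sequence[str]) -> Tuple[List[str], List[List[str]]]:
    """Record which adds were sent and how many waits were performed."""
    sent: List[str] = []
    waits: List[List[str]] = []
    strategy(targets, sent.append, lambda batch: waits.append(list(batch)))
    return sent, waits


OSTS = [f"OST000{i}" for i in range(8)]  # OST0000 .. OST0007
sent_each, waits_each = trace(add_with_check_per_target, OSTS)
sent_once, waits_once = trace(add_with_single_check, OSTS)

print(len(waits_each), "waits when checking per argument")
print(len(waits_once), "wait when checking once at the end")
```

Both strategies send the same eight adds; only the number of waits changes, from eight to one. If each wait takes several seconds, as described above, that would account for most of the gap between the two timings in the description.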
| Comment by Gerrit Updater [ 12/Oct/23 ] |
|
"Feng Lei <flei@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52654 |
| Comment by Gerrit Updater [ 25/Oct/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52654/ |
| Comment by Peter Jones [ 25/Oct/23 ] |
|
Landed for 2.16 |