[LU-15850] MDT QOS should always be used for round-robin directories. Created: 12/May/22 Updated: 28/Oct/22 Resolved: 05/Aug/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.0 |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MON | ||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The MDT QOS should always be used for subdirectories created in a parent that has round-robin activated, if the MDT space imbalance exceeds qos_threshold_rr. Otherwise, subdirectories in that directory tree will suddenly change from r-r to being "sticky" on a single MDT, which significantly changes the behavior and load distribution across MDTs. The "threshold by depth" should only be used for directories that would otherwise always have been created on the parent's MDT. Related to this, it should be possible to tune the weighting of subdirectories by depth so that it can be adjusted without recompiling the code. |
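The placement rule requested above can be sketched as a small decision function. This is only an illustration of the description, not the Lustre implementation: the enum, the function name, and the idea of expressing both the imbalance and qos_threshold_rr as percentages are assumptions made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not Lustre identifiers. */
enum mkdir_placement {
	PLACE_ROUND_ROBIN,	/* parent is r-r and MDTs are balanced */
	PLACE_QOS,		/* parent is r-r but MDTs are imbalanced */
	PLACE_DEPTH_THRESHOLD,	/* parent is not r-r: may stay on parent's MDT */
};

/*
 * Assumed inputs: whether the parent has round-robin activated, the
 * current MDT space imbalance, and qos_threshold_rr, both as percentages.
 */
static enum mkdir_placement
choose_mkdir_placement(bool parent_is_rr, unsigned int imbalance_pct,
		       unsigned int qos_threshold_rr_pct)
{
	/* An r-r parent never falls back to "sticky" placement. */
	if (parent_is_rr)
		return imbalance_pct > qos_threshold_rr_pct ?
		       PLACE_QOS : PLACE_ROUND_ROBIN;

	/* Only non-r-r parents are subject to the depth threshold. */
	return PLACE_DEPTH_THRESHOLD;
}
```

The point of the sketch is that the depth threshold is reached only when the parent is not r-r, so a round-robin tree keeps spreading subdirectories across MDTs whether or not the space imbalance is above the threshold.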
| Comments |
| Comment by Andreas Dilger [ 19/May/22 ] |
|
Lai, I was trying to test my patch to fix the "use space balance for RR directories" issue, but found something very wrong with the max-inherit and max-inherit-rr code, when used with explicitly inherited default layouts (i.e. layouts set on a non-root directory and copied down the tree while decrementing max-inherit-rr). When used with the implicitly inherited layout from the root directory, the max-inherit-rr value is copied from the ROOT directory, and is compared against lli_dir_depth (which increases with directory depth):
if (lsm->lsm_md_max_inherit_rr != LMV_INHERIT_RR_NONE &&
    (lsm->lsm_md_max_inherit_rr == LMV_INHERIT_RR_UNLIMITED ||
     lsm->lsm_md_max_inherit_rr >= lli->lli_dir_depth))
        op_data->op_flags |= MF_RR_MKDIR;
This works as expected because lsm_md_max_inherit_rr is constant when implicitly inherited from the root directory, and lli_dir_depth increases with directory depth. However, if lsm_md_max_inherit_rr comes from an explicitly copied default layout on a directory, then lsm_md_max_inherit_rr is decremented by one for each level it is copied down, while lli_dir_depth is incremented by one for each level FROM THE FILESYSTEM ROOT. So in this second case, these values can be totally unrelated and the comparison is meaningless. For example, if the directory is 10 deep from the filesystem root, then lli_dir_depth is >= 10 on the parent directory, regardless of how many levels of inheritance remain. I'm thinking something like "store (lsm_md_max_inherit_rr + parent->lli_dir_depth) in memory on the parent directory", so that for the child directory (with child->lli_dir_depth = parent->lli_dir_depth + 1) the "parent->lli_dir_depth" value cancels out and the above check works properly. However, it isn't obvious yet how that would be implemented properly. |
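To make the mismatch concrete, here is a minimal standalone sketch of the two comparisons for an explicitly inherited layout. All names are hypothetical and the sentinel values are placeholders for this sketch (the real constants are LMV_INHERIT_RR_NONE and LMV_INHERIT_RR_UNLIMITED). It assumes the value is decremented once per level below the directory where the layout was set, and reaches "none" when it runs out.

```c
#include <stdbool.h>

/* Placeholder sentinels for this sketch only. */
#define SKETCH_RR_NONE		0u
#define SKETCH_RR_UNLIMITED	255u

/*
 * Value seen on a directory at dir_depth when a layout with
 * max-inherit-rr == set_value was set on a directory at set_depth
 * (both depths counted from the filesystem root).
 */
static unsigned int inherited_value(unsigned int set_value,
				    unsigned int set_depth,
				    unsigned int dir_depth)
{
	unsigned int levels;

	if (set_value == SKETCH_RR_UNLIMITED)
		return SKETCH_RR_UNLIMITED;
	if (dir_depth < set_depth)
		return SKETCH_RR_NONE;	/* above the layout: nothing inherited */

	levels = dir_depth - set_depth;
	return levels >= set_value ? SKETCH_RR_NONE : set_value - levels;
}

/* Same shape as the quoted check: decremented value vs. depth from root. */
static bool rr_mkdir_current(unsigned int set_value, unsigned int set_depth,
			     unsigned int parent_depth)
{
	unsigned int v = inherited_value(set_value, set_depth, parent_depth);

	return v != SKETCH_RR_NONE &&
	       (v == SKETCH_RR_UNLIMITED || v >= parent_depth);
}

/* Idea from the comment: keep (value + parent depth), compare to child depth. */
static bool rr_mkdir_proposed(unsigned int set_value, unsigned int set_depth,
			      unsigned int parent_depth)
{
	unsigned int v = inherited_value(set_value, set_depth, parent_depth);

	if (v == SKETCH_RR_NONE)
		return false;
	if (v == SKETCH_RR_UNLIMITED)
		return true;
	/* parent_depth appears on both sides, so this reduces to v >= 1. */
	return v + parent_depth >= parent_depth + 1;
}
```

With a layout set 10 levels deep and max-inherit-rr of 3, rr_mkdir_current(3, 10, 10) is false because 3 is compared against a depth of 10, while rr_mkdir_proposed(3, 10, 10) is true and stays true until the three levels are used up.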
| Comment by Lai Siyao [ 23/May/22 ] |
|
Indeed, dir-depth only considers ROOT. We may convert lsm_md_max_inherit and lsm_md_max_inherit_rr to absolute dir depth from ROOT in lsm unpack, then the comparison with lli_dir_depth will be meaningful. |
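A rough sketch of such a conversion at unpack time is below. The function name, the sentinel values, and the clamp at a maximum are all assumptions for illustration; the merged patches may handle this differently.

```c
/* Placeholder values for this sketch only; not taken from the Lustre headers. */
#define SKETCH_INHERIT_NONE		0u
#define SKETCH_INHERIT_UNLIMITED	255u
#define SKETCH_INHERIT_MAX		250u

/*
 * Convert a relative inherit value stored on a directory at holder_depth
 * (counted from ROOT) into an absolute depth limit, so that it can be
 * compared directly against a depth-from-ROOT value such as lli_dir_depth.
 */
static unsigned int inherit_to_absolute(unsigned int relative,
					unsigned int holder_depth)
{
	unsigned int absolute;

	/* Sentinels carry no depth and pass through unchanged. */
	if (relative == SKETCH_INHERIT_NONE ||
	    relative == SKETCH_INHERIT_UNLIMITED)
		return relative;

	absolute = relative + holder_depth;

	/* Assumed clamp so a large sum never collides with a sentinel. */
	return absolute > SKETCH_INHERIT_MAX ? SKETCH_INHERIT_MAX : absolute;
}
```

For example, a value of 3 on a directory 10 levels below ROOT becomes an absolute limit of 13, which has the same meaning at every level below it.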
| Comment by Lai Siyao [ 27/May/22 ] |
|
Andreas, the code below is for the filesystem-wide default LMV only:
if (lsm->lsm_md_max_inherit_rr != LMV_INHERIT_RR_NONE &&
    (lsm->lsm_md_max_inherit_rr == LMV_INHERIT_RR_UNLIMITED ||
     lsm->lsm_md_max_inherit_rr >= lli->lli_dir_depth))
        op_data->op_flags |= MF_RR_MKDIR;
This does not look like an issue. |
| Comment by Andreas Dilger [ 27/May/22 ] |
|
I was trying to check MF_RR_MKDIR to see if the directory has round-robin allocation enabled, so that the "stay on parent" check in lmv_locate_tgt_qos() would not be used if the parent is r-r. |
| Comment by Gerrit Updater [ 09/Jun/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47576 |
| Comment by Gerrit Updater [ 09/Jun/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47577 |
| Comment by Gerrit Updater [ 09/Jun/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47578 |
| Comment by Gerrit Updater [ 20/Jun/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47679 |
| Comment by Gerrit Updater [ 27/Jun/22 ] |
|
"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47789 |
| Comment by Gerrit Updater [ 26/Jul/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47576/ |
| Comment by Gerrit Updater [ 03/Aug/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47577/ |
| Comment by Gerrit Updater [ 05/Aug/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47578/ |
| Comment by Peter Jones [ 05/Aug/22 ] |
|
Landed for 2.16 |