[LU-16678] QOS improvement Created: 28/Mar/23 Updated: 28/Mar/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Story | Priority: | Minor |
| Reporter: | Sergey Cheremencev | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
From Andreas comment in
Patch the weight and penalty calculation to reduce or exclude the contribution of blocks or inodes, depending on which one is currently "unimportant". For example, OSTs typically have far more free inodes than free space, so the free inodes should not affect the weight. Conversely, MDTs usually have more free space than free inodes, so the free space should not affect the weight. However, in some situations (e.g. DoM or Changelogs filling MDT space, or very small objects on OSTs) these values may become important and cannot be ignored completely as in my 49890 patch. We cannot change the weight calculation to selectively add or remove the inodes or blocks completely, since that would change the "units" the weights are calculated in, and each value may be more or less important for different OSTs depending on how much free space and how many free inodes each one has. I was thinking something along the following lines:
For example, the inode weight could be limited to ia = min(2 * bytes_avail / cur_bpi, inodes_free) >> 8 and the bytes weight to ba = min(2 * inodes_free * cur_bpi, bytes_avail) >> 16 (possibly with other scaling factors depending on OST count/size). These values represent how many inodes or bytes can be expected to be allocated by new objects, based on the historical average bytes-per-inode usage of the filesystem. If a target has mostly large objects, then cur_bpi would be large, so ia would be limited by the 2 * bytes_avail / cur_bpi term and it would not matter how many inodes are actually free. Conversely, if cur_bpi is small (below tot_bpi, meaning that the inodes would run out first), then 2 * bytes_avail / cur_bpi would be large and inodes_free would be the limiting factor for allocations. In the middle, if the average object size is close to the mkfs limits, then both the free inodes and bytes would be taken into account. |
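The two limits proposed above can be tried out with a small sketch. This is only an illustration in Python, not the Lustre implementation: the function name limited_weights and the sample target sizes are hypothetical, integer division stands in for the kernel's integer arithmetic, and the shifts (8 and 16) are simply the ones suggested in the description.

```python
def limited_weights(bytes_avail, inodes_free, cur_bpi):
    """Return (ia, ba), the limited inode and bytes weights for one target.

    bytes_avail: free bytes on the target
    inodes_free: free inodes on the target
    cur_bpi:     historical average bytes-per-inode (assumed to be > 0)

    Hypothetical helper for illustration; names follow the ticket text.
    """
    if cur_bpi <= 0:
        raise ValueError("cur_bpi must be positive")
    # Inodes that new objects can be expected to use: capped both by the
    # free space (at the average object size) and by the inodes really free.
    ia = min(2 * bytes_avail // cur_bpi, inodes_free) >> 8
    # Bytes that new objects can be expected to use: capped both by the
    # free inodes (at the average object size) and by the bytes really free.
    ba = min(2 * inodes_free * cur_bpi, bytes_avail) >> 16
    return ia, ba


# Sample target (made-up numbers): 1 TiB free, 64Mi free inodes.
BYTES_AVAIL = 1 << 40
INODES_FREE = 1 << 26

# Mostly large objects (4 MiB average): ia is limited by the free space.
large = limited_weights(BYTES_AVAIL, INODES_FREE, 1 << 22)   # (2048, 16777216)

# Mostly small objects (4 KiB average): ba is limited by the free inodes.
small = limited_weights(BYTES_AVAIL, INODES_FREE, 1 << 12)   # (262144, 8388608)

# Average size equal to the free bytes/inodes ratio (16 KiB):
# both weights reflect the values that are actually free.
middle = limited_weights(BYTES_AVAIL, INODES_FREE, 1 << 14)  # (262144, 16777216)

print(large, small, middle)
```

With these sample numbers, the large-object case yields the same ia no matter how many more inodes are free, and the small-object case halves ba relative to the raw free space, which matches the behaviour described above.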