[LU-17044] TBF: minimum guarantee RPC rate when the server is overloaded Created: 21/Aug/23  Updated: 17/Nov/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Etienne Aujames Assignee: Etienne Aujames
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17296 NRS TBF default rules Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This seems to have been wanted for a long time: LU-8433
The idea here is to be able to configure a rule like this:

# lctl set_param ...=start fio jobid={fio.*} minrate=1000 rate=10000
# lctl get_param ...
regular requests:
CPT 0:
fio {fio.*} 1000-10000, ref 2
...

The rate for each job matching "fio.*" should stay between 1000 and 10000 RPC/s.
Queued requests that belong to a class whose current rate is below its minimum acceptable rate should therefore be scheduled first.
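
To make the numbers in this rule concrete, here is a minimal sketch of the arithmetic, assuming (as TBF does for the maximum rate) that a rate is turned into a fixed interval between tokens. The names token_interval_ns() and below_minrate() are hypothetical and are not taken from the Lustre source or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Interval between two tokens, in nanoseconds, for a rate given in RPC/s.
 * Hypothetical helper, for illustration only. */
static uint64_t token_interval_ns(uint64_t rate_rpc_per_sec)
{
	assert(rate_rpc_per_sec > 0);
	return NSEC_PER_SEC / rate_rpc_per_sec;
}

/* Assumption: a class counts as "below minrate" once the deadline of its
 * next minimum-rate token is already in the past. */
static bool below_minrate(uint64_t min_deadline_ns, uint64_t now_ns)
{
	return now_ns > min_deadline_ns;
}
```

With minrate=1000 and rate=10000, this gives one minimum-rate token every 1,000,000 ns and one maximum-rate token every 100,000 ns.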

Implementation:
The main idea is to duplicate part of the TBF code to implement a deadline and tokens for the minimum rate.
The compare function of the TBF binheap (tbf_cli_compare()) can be modified to compare the minimum-rate deadlines first, but only if the class has previously missed some token time slots (current rate < minimum rate).
That way, if a class is late (< minrate), it should be scheduled before the other classes. Once the class is no longer late, it is scheduled according to its maximum rate.

Realtime
A realtime rule is used to prioritize the classes matching the rule when the server is overloaded (that is, when the sum of all the defined rates is above what the server can handle). Realtime classes always try to reach their specified rate, at the expense of the other classes' performance.
When a minimum rate is specified on a realtime rule, it makes sense to apply the "realtime" behavior to the lower rate limit of the class.



 Comments   
Comment by Gerrit Updater [ 22/Aug/23 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52038
Subject: LU-17044 tbf: implement minrate for tbf rules
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 07bdc9c2cdad6937cfc9839926c49d0186901a52

Generated at Sat Feb 10 03:32:09 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.