[LU-8433] Maximizing Bandwidth utilization by TBF Rule with Dependency Created: 24/Jul/16  Updated: 05/Feb/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Qian Yingjin (Inactive) Assignee: Qian Yingjin
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-17503 IO500: improve NRS TBF to sort reques... Open
is related to LU-17296 NRS TBF default rules Open
is related to LUDOC-328 documentation updates for complex TBF... Open
Epic/Theme: patch
Rank (Obsolete): 9223372036854775807

 Description   

The TBF policy is not aimed at providing improved performance; it achieves rate limiting through I/O throttling. When I/O is being throttled, it cannot make full use of system resources even when there are idle I/O service threads and spare disk I/O bandwidth. However, some use cases need the ability to allocate that spare capacity to other workloads or background jobs. In order to ensure efficient utilization of I/O resources, we propose a dependency rule strategy. The command for a dependency rule is as follows:

start ruleB <matchCondition> deprule=ruleA lowerrate=$r1 upperrate=$r2

Here 'deprule' gives the name of the rule being depended on, meaning that 'ruleB' depends on 'ruleA'; the key 'lowerrate' indicates the lower bound of the RPC rate limit, while the key 'upperrate' indicates the upper bound. The principle is that the effective RPC rate limit of a rule is dynamically adjusted between lowerrate and upperrate, acquiring more I/O bandwidth according to the spare I/O capacity that its dependent rule leaves unused.
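
For illustration, under this proposed syntax a high-priority job could be matched by 'ruleA' with a fixed rate, while a background job matched by 'ruleB' floats between 100 and 500 RPCs per second depending on how much of ruleA's capacity is left unused (the jobid patterns and rate values here are hypothetical placeholders, not from the actual patch):

start ruleA jobid={high_job.*} rate=1000
start ruleB jobid={bg_job.*} deprule=ruleA lowerrate=100 upperrate=500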



 Comments   
Comment by Peter Jones [ 24/Jul/16 ]

Is this something that you are working on yourself?

Comment by Li Xi (Inactive) [ 25/Jul/16 ]

Hi Peter,

Yeah, Yingjin and I have been working on this for a long time. As far as we've tested, the patches work well except for a few corner cases. As soon as we finish fixing the defects, we will push the patches to the community, and code review would be much appreciated.

Regards,
Li Xi

Comment by Evan D. Chen (Inactive) [ 25/Jul/16 ]

We think it is more of a new feature than an improvement, so we are changing the type to New Feature.

Comment by Andreas Dilger [ 25/Jul/16 ]

It isn't clear to me why the current TBF implementation would restrict performance if there are idle resources. Is the current TBF priority a relative weighting (i.e. process RPCs matching ruleA N times more often than RPCs matching ruleB), or is it an actual limit on the number of RPCs (i.e. process only N RPCs matching ruleA per second)? If it is a relative weighting, then the threads should always be kept busy, either because RPCs matching ruleA are available, or because none are available and RPCs matching ruleB are available.
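
For reference, a minimal token-bucket sketch in C (an illustration, not Lustre's actual NRS code; the struct, field names, and refill heuristic are made up for this example) of the "actual limit" behavior: once a class runs out of tokens, its queued RPCs wait even when service threads and disk bandwidth are free.

#include <stdbool.h>
#include <stdint.h>

/* One TBF class with a hard RPC rate limit (illustrative fields). */
struct tbf_class {
    uint64_t rate;      /* allowed RPCs per second (hard limit) */
    uint64_t tokens;    /* tokens currently available */
    uint64_t last_ns;   /* timestamp of the last refill, in ns */
};

/* Accrue tokens for the time elapsed since the last refill, capping
 * the burst at one second's worth of tokens. */
static void tbf_refill(struct tbf_class *cls, uint64_t now_ns)
{
    uint64_t elapsed_ns = now_ns - cls->last_ns;

    cls->tokens += elapsed_ns * cls->rate / 1000000000ULL;
    if (cls->tokens > cls->rate)
        cls->tokens = cls->rate;
    cls->last_ns = now_ns;
}

/* A hard limit: with zero tokens the RPC stays queued even if the
 * OST has idle threads and spare bandwidth. */
static bool tbf_may_dispatch(struct tbf_class *cls, uint64_t now_ns)
{
    tbf_refill(cls, now_ns);
    if (cls->tokens == 0)
        return false;
    cls->tokens--;
    return true;
}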

What would also be useful is if TBF included the functionality from ORR to do request ordering within a given class to optimize IO submission to disk. That could allow TBF to improve performance instead of just reducing it. The alternative is to allow stacking NRS policies so that ORR is the secondary ordering for TBF, but I don't know if that will provide as good performance as having a single combined policy.
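
As a rough sketch of that ORR idea applied within a single TBF class (the request structure and the qsort()-based sort are hypothetical stand-ins for the NRS data types), queued requests could be ordered by object and offset before dispatch so that the resulting I/O is closer to sequential:

#include <stdint.h>
#include <stdlib.h>

/* Hypothetical queued I/O request; the real NRS types differ. */
struct io_req {
    uint64_t obj_id;   /* target object */
    uint64_t offset;   /* starting byte offset within the object */
};

/* Order requests by (object, offset) so that dispatch order
 * approximates sequential disk access. */
static int io_req_cmp(const void *a, const void *b)
{
    const struct io_req *ra = a;
    const struct io_req *rb = b;

    if (ra->obj_id != rb->obj_id)
        return ra->obj_id < rb->obj_id ? -1 : 1;
    if (ra->offset != rb->offset)
        return ra->offset < rb->offset ? -1 : 1;
    return 0;
}

/* Sort one TBF class's queued requests before they are dispatched. */
static void class_sort_for_dispatch(struct io_req *reqs, size_t nr)
{
    qsort(reqs, nr, sizeof(*reqs), io_req_cmp);
}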

Comment by Li Xi (Inactive) [ 26/Jul/16 ]

Hmm, I think the title "Maximizing Bandwidth" is a little inaccurate. It is not necessarily true that the dependency rules will maximize bandwidth compared to the original TBF policy; at least the purpose of implementing dependency rules is not to maximize bandwidth. Instead, dependency rules enable different priority levels for different NIDs/JobIDs. Assume we have a job 0 with high priority and another job 1 with low priority. We could set a rule A to match job 0, and a rule B which matches job 1 and depends on rule A. Ideally, the modified TBF policy would always provide as much RPC rate as possible to job 0. If job 0 is not using up the whole bandwidth on the OSS, job 1 could increase its rate limit, and if job 0 is not getting the RPC rate expected by rule A, the RPC rate of job 1 should be decreased. So, as you can see, this is advanced QoS, not bandwidth maximization.
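
A minimal C sketch of the adjustment described above (the structure, field names, and fixed-step heuristic are hypothetical, not taken from the actual patch): rule B's effective limit grows toward upperrate while rule A leaves its granted rate unused, and shrinks back toward lowerrate when rule A is backlogged but missing its expected rate.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative state for a dependent rule (rule B). */
struct dep_rule {
    uint64_t lowerrate;   /* guaranteed minimum RPC rate */
    uint64_t upperrate;   /* ceiling when the depended-on rule is idle */
    uint64_t cur_rate;    /* effective limit currently in force */
};

/* Periodically re-evaluate rule B's effective rate from rule A's
 * measured behaviour. 'step' is an arbitrary adjustment increment. */
static void dep_rule_adjust(struct dep_rule *b, bool a_has_queued_rpcs,
                            uint64_t a_granted_rate,
                            uint64_t a_observed_rate, uint64_t step)
{
    if (a_observed_rate >= a_granted_rate)
        return;    /* A gets its full rate; leave B unchanged */

    if (!a_has_queued_rpcs) {
        /* A demands less than its grant: B may grow toward upperrate. */
        b->cur_rate += step;
        if (b->cur_rate > b->upperrate)
            b->cur_rate = b->upperrate;
    } else {
        /* A is backlogged yet below its grant: shrink B to yield bandwidth. */
        if (b->cur_rate > b->lowerrate + step)
            b->cur_rate -= step;
        else
            b->cur_rate = b->lowerrate;
    }
}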

Yeah, I would expect that combining ORR with TBF might be able to improve the total throughput.

Comment by Qian Yingjin (Inactive) [ 26/Jul/16 ]

Our current TBF policy is an actual limit on the number of RPCs.
There is a paper named "mClock: Handling Throughput Variability for Hypervisor IO Scheduling" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.182.4720); I think it can address the questions you raised above. According to the paper, it has the following features: proportional allocation, latency support, reservation support, limit support, handling of capacity fluctuation, etc.
mClock assigns tags spaced by increments of 1/rate to successive requests of a VM. If all requests are scheduled in order of their tag values, each VM receives service in proportion to its rate, and mClock extends this notion to multiple tags in order to support proportional-share fairness subject to minimum reservations and maximum limits on the IO allocations of VMs. Our algorithm also uses a notion similar to tag-based scheduling to achieve rate control, but we set the deadlines (tags) per class rather than per request, so the set sorted by deadline is much smaller; classes are selected in order of their deadlines, and the requests within the selected class queue are then scheduled in FCFS order. As the general intuition is similar, we think it is possible to integrate the mClock algorithm into ours, but we still need to investigate.
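
A minimal C sketch of this per-class deadline scheme (the types are simplified stand-ins for the NRS data structures, and a real implementation would likely keep the classes in a heap ordered by deadline rather than scanning them linearly):

#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* One TBF class carrying a single deadline (tag) instead of per-request
 * tags; the FCFS request queue itself is omitted for brevity. */
struct tbf_class {
    uint64_t rate;       /* RPC rate limit of this class (> 0) */
    uint64_t deadline;   /* earliest time the next RPC may be dispatched */
};

/* Select the class with the earliest expired deadline, or NULL if all
 * classes are still throttled at time 'now'. */
static struct tbf_class *tbf_pick_class(struct tbf_class *cls, size_t nr,
                                        uint64_t now)
{
    struct tbf_class *best = NULL;
    size_t i;

    for (i = 0; i < nr; i++) {
        if (cls[i].deadline > now)
            continue;   /* not yet allowed to send */
        if (best == NULL || cls[i].deadline < best->deadline)
            best = &cls[i];
    }
    return best;
}

/* After dispatching one RPC from a class, advance its deadline by the
 * per-RPC interval 1/rate; requests inside the class go out in FCFS
 * order, so only this one tag per class needs sorting. */
static void tbf_charge(struct tbf_class *cls, uint64_t now)
{
    uint64_t base = cls->deadline > now ? cls->deadline : now;

    cls->deadline = base + NSEC_PER_SEC / cls->rate;
}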

Comment by Gerrit Updater [ 24/Dec/16 ]

Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/24515
Subject: LU-8433 nrs: Maximizing throughput via rules with dependency
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2de06fce6b484239a9dc9491557bd76177386bab

Comment by Andreas Dilger [ 19/Jan/17 ]

To my reading, the specification of dependent rules would be complex and not easily handled by users. Instead, it seems like TBF could do "soft" scheduling of RPCs using the existing rules, and simply not enforce the RPC limit on a class when there are no RPCs of higher priority in the queue. In essence (I think) a class with outstanding RPCs would continue to gain tokens (in proportion to the rates of the lower-priority rules) while it has queued RPCs and the higher-priority classes do not. Such rules could add a new keyword such as rate_soft= or similar (since the current rate= is considered a hard limit today).
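
A minimal C sketch of that rate_soft idea (the types, the priority test, and the rate_soft semantics shown here are illustrative assumptions, not an existing Lustre interface): the configured rate is enforced only while some higher-priority class has RPCs queued.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative class state for a soft rate limit. */
struct tbf_soft_class {
    uint64_t rate;       /* configured rate_soft= value */
    uint64_t tokens;     /* tokens accrued at 'rate' */
    int prio;            /* larger value = higher priority */
    int nr_queued;       /* RPCs currently queued in this class */
};

/* True if any class with higher priority than 'cls' has queued RPCs. */
static bool higher_prio_waiting(const struct tbf_soft_class *all, size_t nr,
                                const struct tbf_soft_class *cls)
{
    size_t i;

    for (i = 0; i < nr; i++)
        if (all[i].prio > cls->prio && all[i].nr_queued > 0)
            return true;
    return false;
}

/* Enforce the token limit only when higher-priority work is waiting;
 * otherwise let the class run at whatever rate the server can serve. */
static bool tbf_soft_may_dispatch(struct tbf_soft_class *all, size_t nr,
                                  struct tbf_soft_class *cls)
{
    if (cls->nr_queued == 0)
        return false;

    if (!higher_prio_waiting(all, nr, cls))
        return true;    /* soft limit: not enforced when nothing else waits */

    if (cls->tokens == 0)
        return false;   /* hard behavior while higher priorities are queued */
    cls->tokens--;
    return true;
}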
