[LU-9228] Hard TBF Token Compensation under congestion Created: 20/Mar/17  Updated: 26/Mar/18  Resolved: 25/Jan/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.11.0

Type: New Feature Priority: Minor
Reporter: Qian Yingjin (Inactive) Assignee: Qian Yingjin (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Related
is related to LUDOC-328 documentation updates for complex TBF... Open
is related to LU-3558 NRS TBF policy for QoS purposes Resolved
Rank (Obsolete): 9223372036854775807

 Description   

During TBF evaluation, we found that when the sum of the I/O bandwidth requirements of all classes exceeds the system capacity, classes with the same rate limit are all reduced evenly and get less bandwidth than configured.
The reason is as follows. Under heavy load on a congested server, some classes miss their deadlines, so the number of tokens calculated at dequeue time may be larger than 1. In the original implementation, all classes are handled equally: the excess tokens are simply discarded.
Thus, a Hard Token Compensation (HTC) strategy is proposed. A class can be configured with the HTC feature by the rule it matches. The feature indicates that requests in such class queues have strict real-time requirements and that the bandwidth assignment must be satisfied as closely as possible. When a deadline is missed, the class keeps the deadline unchanged and the time residue (the remainder of the elapsed time divided by 1/r, where r is the rate limit of the class) is carried over to the next round. This ensures that the next idle I/O thread always selects this class to serve until all accumulated excess tokens are consumed or there are no pending requests in the class queue.
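
The difference between discarding and compensating tokens can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from the patch: the names TbfClass, DEPTH, refresh, serve and served_in_window are hypothetical, time is kept in integer milliseconds, the class queue is assumed to be always backlogged, and ordinary classes are assumed to have a bucket depth of 1 and to drop the time residue.

```python
from dataclasses import dataclass

DEPTH = 1  # assumed bucket depth for ordinary classes (illustrative only)


@dataclass
class TbfClass:
    rate: int               # configured limit r, in requests per second
    realtime: bool = False  # True when the matching rule sets realtime=1
    tokens: int = 0
    check_time_ms: int = 0  # time up to which tokens have been accounted

    def refresh(self, now_ms):
        interval_ms = 1000 // self.rate  # 1/r: time needed to earn one token
        elapsed = now_ms - self.check_time_ms
        earned, residue = divmod(elapsed, interval_ms)
        if self.realtime:
            # HTC: keep every earned token and carry the residue forward,
            # so the next deadline stays on the original schedule.
            self.tokens += earned
            self.check_time_ms = now_ms - residue
        else:
            # Assumed original behaviour: tokens beyond the bucket depth
            # (and the residue) are discarded.
            self.tokens = min(self.tokens + earned, DEPTH)
            self.check_time_ms = now_ms

    def serve(self, now_ms):
        """Serve as many queued requests as tokens allow; return the count."""
        self.refresh(now_ms)
        served = self.tokens
        self.tokens = 0
        return served


def served_in_window(cls, visits_ms):
    """Total requests served when an I/O thread reaches the class at visits_ms."""
    return sum(cls.serve(t) for t in visits_ms)


if __name__ == "__main__":
    # A congested server only reaches each class every 250 ms,
    # although rate=10 asks for one request every 100 ms.
    visits = [250, 500, 750, 1000]
    print(served_in_window(TbfClass(rate=10), visits))
    print(served_in_window(TbfClass(rate=10, realtime=True), visits))
```

In this model the ordinary class loses one token on every late visit, while the realtime class catches up to its configured rate over the same one-second window.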

A new command format is added to enable the HTC feature for a rule:

start $ruleName jobid={dd.0} rate=100 realtime=1




 Comments   
Comment by Gerrit Updater [ 20/Mar/17 ]

Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/26087
Subject: LU-9228 nrs: Hard Token Compensation under congestion
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6860deca4bd4e5efcc5a68b44beb4eb9bd7df95b

Comment by Gerrit Updater [ 25/Jan/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26087/
Subject: LU-9228 nrs: TBF realtime policies under congestion
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d11fa2c279593634cf6c4196b413a6d285b24e10

Comment by Peter Jones [ 25/Jan/18 ]

Landed for 2.11

Comment by Emoly Liu [ 13/Mar/18 ]

The documentation update for the HTC strategy is at https://review.whamcloud.com/31628.
