Hard TBF Token Compensation under congestion

Details

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.11.0

    Description

      During TBF evaluation, we found that when the sum of the I/O bandwidth requirements of all classes exceeds the system capacity, classes with the same rate limit get less bandwidth than the even share they were preconfigured with.
      The reason is as follows. Under heavy load, a congested server misses deadlines for some classes, so the number of tokens calculated at dequeue time may be larger than 1. In the original implementation, all classes are handled equally and the excess tokens are simply discarded.
      Thus, a Hard Token Compensation (HTC) strategy is proposed. A class can be configured with the HTC feature through the rule it matches. The feature indicates that requests in such class queues have strict real-time requirements and that the bandwidth assignment must be satisfied as closely as possible. When a deadline is missed, the class keeps its deadline unchanged and the time residue (the remainder of the elapsed time divided by 1/r, where r is the class rate) is carried over to the next round. This ensures that the next idle I/O thread always selects this class to serve until all accumulated excess tokens are consumed or there are no pending requests in the class queue.

      A new command format is added to enable the HTC feature for a rule:
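
The difference between discarding and compensating tokens can be illustrated with a small simulation. This is only a sketch under simplifying assumptions and is not the Lustre NRS TBF code: time is in integer milliseconds, an idle I/O thread is assumed to appear every `tick_ms`, and the names `tokens_and_residue`, `simulate`, `tick_ms` and `realtime` are hypothetical and chosen for illustration.

```python
def tokens_and_residue(elapsed_ms, interval_ms):
    """Split elapsed time into whole tokens and the time residue.

    interval_ms plays the role of 1/r: the time needed to earn one token.
    """
    return divmod(elapsed_ms, interval_ms)


def simulate(rate_per_sec, tick_ms, duration_ms, realtime):
    """Count requests served for one class on a congested server.

    Assumption: an idle thread is available only at multiples of tick_ms,
    so a request is usually served some time after its deadline.
    """
    interval_ms = 1000 // rate_per_sec   # 1/r in milliseconds
    deadline = interval_ms               # earliest time of the first service
    next_free_tick = 0                   # one request per idle-thread tick
    served = 0
    while True:
        earliest = max(deadline, next_free_tick)
        # Round up to the next tick at which a thread is idle.
        serve_at = -(-earliest // tick_ms) * tick_ms
        if serve_at > duration_ms:
            break
        served += 1
        next_free_tick = serve_at + tick_ms
        if realtime:
            # HTC: keep the deadline schedule, so lateness is compensated.
            deadline += interval_ms
        else:
            # Original behaviour: restart from the service time, which
            # discards whatever accumulated beyond one token.
            deadline = serve_at + interval_ms
    return served


if __name__ == "__main__":
    print(tokens_and_residue(250, 100))                    # (2, 50)
    print(simulate(10, 30, 3000, realtime=False))          # 25
    print(simulate(10, 30, 3000, realtime=True))           # 30
```

With a configured rate of 10 requests per second over 3 seconds, the class without compensation is served 25 times in this model, while the compensated class reaches the configured 30.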

      start $ruleName jobid={dd.0} rate=100 realtime=1
      
      
      

          Activity

            [LU-9228] Hard TBF Token Compensation under congestion
            emoly.liu Emoly Liu made changes -
            Link New: This issue is related to LUDOC-328 [ LUDOC-328 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.11.0 [ 13091 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            mdiep Minh Diep made changes -
            Labels New: patch
            qian Qian Yingjin (Inactive) made changes -
            Description: the example command changed from "start $ruleName jobid={dd.0} rate=100 compensate=1" to "start $ruleName jobid={dd.0} rate=100 realtime=1"; the rest of the description text was unchanged.
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-3558 [ LU-3558 ]
            qian Qian Yingjin (Inactive) made changes -
            Description: whitespace and line-wrapping changes only; the wording and the example command (compensate=1) were unchanged.
            qian Qian Yingjin (Inactive) created issue -

            People

              qian Qian Yingjin (Inactive)
              qian Qian Yingjin (Inactive)
              Votes: 0
              Watchers: 5
