Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.16.0

    Description

      For isolation of workloads across multiple sub-tenants of a filesystem, it would be useful to allow registering an NRS TBF rule for a nodemap. This can be proxied to some extent by setting a TBF rule for a project ID, but this doesn't work if there are multiple project IDs used by a single nodemap.
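The limitation described above can be illustrated with a toy token-bucket model. This is only a sketch and not Lustre NRS code: `TokenBucket`, `admitted`, the tenant names, and the rate of 10 are all hypothetical. It shows that when buckets are keyed by project ID, a tenant that uses several project IDs collects several buckets' worth of rate, while keying by nodemap gives each tenant a single bucket.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/s, holds at most `burst`."""
    rate: float
    burst: float
    tokens: float
    last: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admitted(requests, key_of, rate):
    """Count admitted requests per tenant, with one bucket per key_of(...) value."""
    buckets, counts = {}, {}
    for now, tenant, projid in requests:
        key = key_of(tenant, projid)
        if key not in buckets:
            buckets[key] = TokenBucket(rate=rate, burst=rate, tokens=rate)
        if buckets[key].allow(now):
            counts[tenant] = counts.get(tenant, 0) + 1
    return counts


# Hypothetical burst: tenant_a spreads its I/O over three project IDs,
# tenant_b uses one. Every stream sends 100 requests at t=0.
requests = [(0.0, "tenant_a", p) for p in (101, 102, 103) for _ in range(100)]
requests += [(0.0, "tenant_b", 200) for _ in range(100)]

per_projid = admitted(requests, lambda tenant, projid: projid, rate=10)
per_nodemap = admitted(requests, lambda tenant, projid: tenant, rate=10)
```

In this model `per_projid` lets tenant_a through three times as often as tenant_b, whereas `per_nodemap` treats both tenants equally.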

          Activity

            [LU-17902] add NRS TBF policy for nodemap
            qian_wc Qian Yingjin added a comment -

            I will rebase it later.

            Regarding your question:

            To make NRS TBF smart enough to detect nodemap changes (adding and removing), maybe we should put both the name and the id of the nodemap into the key of a TBF class, so that each match compares both the name and the id? In the current version and the previous version, we compare only one of them (either name or id), not both.

            Adding/removing a nodemap will change the id, I think, so NRS TBF can detect the change during matching.

            However, this means more memory is used for each TBF class.
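A minimal sketch of the composite-key idea, assuming (as the comment does) that a recreated nodemap gets a new id. `Nodemap`, `TbfClass`, and `ClassTable` are hypothetical stand-ins, not structures from the patch; the two fields mirror `lu_nodemap.nm_name` and `lu_nodemap.nm_id`.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Nodemap:
    name: str   # stands in for lu_nodemap.nm_name
    nm_id: int  # stands in for lu_nodemap.nm_id


@dataclass
class TbfClass:
    # Storing both fields is the extra per-class memory mentioned above.
    name: str
    nm_id: int
    rate: int


class ClassTable:
    """Hypothetical class table that matches on name AND id."""

    def __init__(self) -> None:
        self._by_name: Dict[str, TbfClass] = {}

    def match(self, nm: Nodemap, rule_rate: int) -> TbfClass:
        cls = self._by_name.get(nm.name)
        if cls is not None and cls.nm_id == nm.nm_id:
            return cls  # same nodemap instance, reuse its class
        # First request, or the nodemap was recreated under the same name:
        # the id mismatch is detected here and a fresh class replaces the old one.
        cls = TbfClass(nm.name, nm.nm_id, rule_rate)
        self._by_name[nm.name] = cls
        return cls
```

Comparing only the name would silently reuse the stale class, and comparing only the id would never find it again; comparing both lets the match path notice the change.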

            gerrit Gerrit Updater added a comment -

            "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57254
            Subject: LU-17902 nrs: add NRS TBF policy for nodemap
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bae391a8d8a698c7b4fc7d193785b5ef55cb7af9

            sebastien Sebastien Buisson added a comment -

            I have a question for node map TBF support:
            Should we use @lu_nodemap.nm_id or @lu_nodemap.nm_name as the key for TBF scheduling class?
            I think both can be used as key to identify a nodemap.

            Good question. nm_id is unique and could be used as the key. However, if a nodemap is removed and then recreated with the same name, its id will be different, but admins may expect it to be treated as before. In that case, it makes sense to use nm_name as the key for TBF: even if the nodemap is recreated, the same TBF rules still apply and do not need to be fixed.
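The trade-off can be sketched with a toy registry. Everything here is hypothetical (`NodemapRegistry`, `rule_rate`, the rate of 500), and it assumes that every nodemap creation hands out a fresh id, as the comment above suggests.

```python
import itertools


class NodemapRegistry:
    """Toy registry: each add() assigns a new id, even for a reused name."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._ids = {}

    def add(self, name):
        self._ids[name] = next(self._next_id)
        return self._ids[name]

    def remove(self, name):
        del self._ids[name]


def rule_rate(rules, key):
    """Return the configured rate for `key`, or None if no rule matches."""
    return rules.get(key)


registry = NodemapRegistry()
old_id = registry.add("tenant1")
rules_by_name = {"tenant1": 500}   # rule keyed by nm_name
rules_by_id = {old_id: 500}        # rule keyed by nm_id

# The admin removes the nodemap and recreates it under the same name.
registry.remove("tenant1")
new_id = registry.add("tenant1")

rate_via_name = rule_rate(rules_by_name, "tenant1")  # rule still applies
rate_via_id = rule_rate(rules_by_id, new_id)         # rule no longer matches
```

With the name as the key the existing rule keeps applying after the recreation; with the id as the key the rule would have to be redefined.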
            qian_wc Qian Yingjin added a comment -

            improved "fair share" balancing between buckets

            What does "fair share" mean here?

            I have a question about nodemap TBF support:

            Should we use @lu_nodemap.nm_id or @lu_nodemap.nm_name as the key for the TBF scheduling class?

            I think both can be used as a key to identify a nodemap.

            qian_wc Qian Yingjin added a comment -

            Please note that a single Lustre client should not be rate-controlled by two different TBF classes.

            See LU-7982 for details:

            When using JobID-based TBF rules, if multiple jobs run on the same client, the RPC rates of those jobs affect each other. More precisely, a job with a high RPC rate limit may actually get a low RPC rate. The reason is that a job with a lower RPC rate limit may exhaust the client's max-in-flight-RPC limit or its max-cache-pages limit.
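The interference can be reproduced in a deliberately simplified model. This is an assumption-laden sketch, not a description of the real client or server code: time advances in ticks, the client keeps at most `max_in_flight` RPCs outstanding and alternates between jobs when filling free slots, and the server completes at most `rate` RPCs per job per tick.

```python
def simulate(rates, max_in_flight, ticks):
    """Return the number of RPCs completed per job after `ticks` ticks."""
    jobs = list(rates)
    in_flight = {job: 0 for job in jobs}
    done = {job: 0 for job in jobs}
    turn = 0
    for _ in range(ticks):
        # Client side: fill every free slot, alternating between jobs.
        free = max_in_flight - sum(in_flight.values())
        for _ in range(free):
            in_flight[jobs[turn % len(jobs)]] += 1
            turn += 1
        # Server side: each job's class completes at most `rate` RPCs per tick.
        for job in jobs:
            served = min(rates[job], in_flight[job])
            in_flight[job] -= served
            done[job] += served
    return done


# Hypothetical limits: "slow" may do 1 RPC/tick, "fast" may do 10 RPCs/tick.
shared = simulate({"slow": 1, "fast": 10}, max_in_flight=8, ticks=100)
alone = simulate({"fast": 10}, max_in_flight=8, ticks=100)
```

In this model the slow job's RPCs pile up in the shared in-flight slots, so the fast job finishes far fewer RPCs in `shared` than it does in `alone`, even though its own rate limit is unchanged.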

            qian_wc Qian Yingjin added a comment -

            I am currently working on TBF for project ID.

            If "This can be proxied to some extent by setting a TBF rule for a project ID" is the direction, then we can borrow the PCC implementation to aggregate a range of project IDs, so that all requests within that range share one TBF rate.

            We have a patch for PCC that implements similar functionality:

            LU-13881 pcc: comparator support for PCC rules

            https://review.whamcloud.com/c/fs/lustre-release/+/39585

            A TBF rule can borrow the code from PCC to define a rule over a range of PROJID values with the '<' or '>' comparators.

            We also have a patch for aggregate shared rate limiting in TBF:

            https://review.whamcloud.com/#/c/fs/lustre-release/+/56351/

            For example:

            lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start sharerate projid>{100}&projid<{1000} rate=3000 share=1"
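How such a range rule might behave can be sketched as follows. The expression grammar is inferred only from the single example above, and `compile_projid_expr`, `SharedRateRule`, and the fixed per-interval budget are hypothetical simplifications, not the code in either referenced patch.

```python
import operator
import re

_OPS = {"<": operator.lt, ">": operator.gt}
_TERM = re.compile(r"projid([<>])\{(\d+)\}")


def compile_projid_expr(expr):
    """Turn 'projid>{100}&projid<{1000}' into a predicate on a project ID."""
    terms = []
    for part in expr.split("&"):
        m = _TERM.fullmatch(part.strip())
        if m is None:
            raise ValueError(f"unsupported term: {part!r}")
        terms.append((_OPS[m.group(1)], int(m.group(2))))
    return lambda projid: all(op(projid, bound) for op, bound in terms)


class SharedRateRule:
    """One budget per interval, shared by every project ID the rule matches."""

    def __init__(self, expr, rate):
        self.matches = compile_projid_expr(expr)
        self.rate = rate
        self.used = 0  # requests admitted in the current interval

    def admit(self, projid):
        """True/False for matching requests, None if the rule does not apply."""
        if not self.matches(projid):
            return None
        if self.used >= self.rate:
            return False
        self.used += 1
        return True

    def new_interval(self):
        self.used = 0
```

Because the budget lives on the rule rather than on each project ID, requests from project 150 and project 999 draw from the same allowance, which is the aggregate behavior the comment asks for.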

             

            If we need TBF to support nodemaps directly, I will need some time to learn the background on nodemaps.


            People

              qian_wc Qian Yingjin
              adilger Andreas Dilger