Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0
    • 3
    • 9223372036854775807

    Description

      It appears that no NRS TBF rule type was ever added for selecting RPCs by project ID (projid).

          Activity

            [LU-17166] add NRS TBF rule for projid

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54920/
            Subject: LU-17166 ptlrpc: add pb_projid field in ptlrpc_body
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2cde8f434a86b8499dfbe56d59fe93a028c229ee


            "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57245
            Subject: LU-17166 nrs: add NRS TBF rule for ProjID support
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: be558146346fa11732260376a32542fd451c1b7a

            qian_wc Qian Yingjin added a comment -

            We have a PCC patch that implements similar functionality:

            LU-13881 pcc: comparator support for PCC rules

            https://review.whamcloud.com/c/fs/lustre-release/+/39585

            TBF rules could borrow this PCC code to define a rule that covers a range of PROJID values using the '<' or '>' comparators.

            We also have a patch for aggregate shared rate limiting for TBF:

            https://review.whamcloud.com/#/c/fs/lustre-release/+/56351/
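The comparator idea in the comment above can be pictured with a small model. This is only an illustrative sketch in Python, not Lustre or PCC code: the rule syntax (`projid<N`, terms joined with `&`), the `=` operator, and the function names `parse_rule` and `rule_matches` are all assumptions made for the example.

```python
import operator
import re

# Hypothetical rule syntax, for illustration only: "projid>1000&projid<2000".
# The ticket mentions '<' and '>'; '=' and the '&' conjunction are assumptions.
_OPS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}
_TERM = re.compile(r"^projid([<>=])(\d+)$")


def parse_rule(expr):
    """Parse a conjunction of projid comparisons into (op, value) pairs."""
    terms = []
    for raw in expr.split("&"):
        m = _TERM.match(raw.strip())
        if m is None:
            raise ValueError(f"bad term: {raw!r}")
        terms.append((_OPS[m.group(1)], int(m.group(2))))
    return terms


def rule_matches(terms, projid):
    """Return True when projid satisfies every comparison in the rule."""
    return all(op(projid, value) for op, value in terms)
```

With this model, `rule_matches(parse_rule("projid>1000&projid<2000"), 1500)` is true, while the boundary values 1000 and 2000 do not match because both comparators are strict.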

            adilger Andreas Dilger added a comment -

            Alternately (or in addition), would it be easier/better to use the TBF nodemap functionality described in LU-17902 to directly aggregate all RPCs from a given nodemap into one bucket rather than going through the indirection of using a nodemap range?

            adilger Andreas Dilger added a comment -

            eaujames, qian_wc, would the TBF rules allow (with the above addition of pb_projid to the RPC structure) restricting the RPC rate for a range of PROJID values? The patch https://review.whamcloud.com/55943 "LU-18109 utils: adding nodemap offset capability LU-18109" allows groups of clients in a nodemap to be assigned a range of UID/GID/PROJID values, and I'm wondering if the TBF rate limiting could be similarly used to aggregate a range of PROJID values into a single bucket so that they can be constrained to e.g. 10% of the storage bandwidth/IOPS, or similar?
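To make the "range of PROJID values into a single bucket" question concrete, here is a minimal token-bucket sketch in Python. It is not the NRS TBF implementation: the class names, the inclusive range representation, and the explicit `now` argument are assumptions chosen to keep the example deterministic. It also returns a yes/no answer, whereas a real scheduler would typically queue a request until a token is available rather than refuse it.

```python
class TokenBucket:
    """Classic token bucket: `rate` tokens per second, at most `depth` stored."""

    def __init__(self, rate, depth, now=0.0):
        self.rate = float(rate)
        self.depth = float(depth)
        self.tokens = float(depth)  # start full
        self.last = now

    def try_consume(self, now, cost=1.0):
        # Refill according to elapsed time, capped at the bucket depth.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class ProjidRangeLimiter:
    """Maps inclusive projid ranges to one shared bucket each (hypothetical)."""

    def __init__(self):
        self.rules = []  # list of (lo, hi, bucket)

    def add_rule(self, lo, hi, rate, depth):
        self.rules.append((lo, hi, TokenBucket(rate, depth)))

    def admit(self, projid, now):
        for lo, hi, bucket in self.rules:
            if lo <= projid <= hi:
                # Every projid in the range draws from the same bucket.
                return bucket.try_consume(now)
        return True  # no matching rule: not rate limited in this sketch
```

In this model, a rule added with `add_rule(1000, 1999, rate=2, depth=2)` lets two requests through at time 0 from any mix of projids in the range, and a third is held back until the shared bucket refills.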

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54920
            Subject: LU-17166 ptlrpc: add pb_projid field in ptlrpc_body
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3404e9f4800cf8f730dae3e78a71ea622ca23496


            adilger Andreas Dilger added a comment -

            I'm not sure this cache would be safe/correct. The jobid is a function of the process itself and is set at process startup, so caching the jobid by PID is reasonable. The projid is a function of the file being accessed, so it may change arbitrarily depending on which file is in use, and it doesn't necessarily correlate strongly with the PID. Consider "find" or "tar" crossing multiple directory trees, or a job that reads shared input files from a common directory and then writes to the user's own directory.

            The projid will almost always be associated with a specific file/directory, so this information should be available at some point in the request processing. Even "statfs()", which is fairly generic and is usually executed on the root directory (which normally does not have a projid), is in some cases run on a subdirectory that has a projid set. In that case it generates a quota lookup in addition to the statfs, so this is useful to be able to track.
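The staleness concern above can be shown with a toy model. The sketch below is Python, not client code; the paths, project IDs, PID, and helper names are invented for illustration. It models one process that fetches attributes of a shared input file and then writes to a file in a different project.

```python
# Hypothetical file -> projid assignment; paths and IDs are made up.
FILE_PROJID = {
    "/proj/shared/input.dat": 100,
    "/home/alice/output.dat": 200,
}

pid_cache = {}  # pid -> projid last seen for that pid


def getattr_file(pid, path):
    """Model an attribute fetch that refreshes the per-PID cache entry."""
    projid = FILE_PROJID[path]
    pid_cache[pid] = projid
    return projid


def tag_by_pid(pid):
    """Tag an RPC with whatever projid the cache holds for this pid."""
    return pid_cache.get(pid)


def tag_by_file(path):
    """Tag an RPC with the projid of the file it actually targets."""
    return FILE_PROJID[path]
```

After `getattr_file(4242, "/proj/shared/input.dat")`, a write by the same PID to `/home/alice/output.dat` would be tagged 100 by the PID cache but 200 by the file itself, which is the mismatch the comment describes.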

            eaujames Etienne Aujames added a comment -

            Hi Andreas,
            I haven't had time to check this yet, as I am busy with other work. But I think we could implement a projid cache (like the jobid_cache) that saves the projid last used by each PID. The entries could be updated when a process fetches the attributes on the MDS.

            adilger Andreas Dilger added a comment -

            Etienne, would you be able to make a patch to set the o_projid and mbo_projid fields for more RPC types? Even before we start to add an NRS TBF policy to handle this, it is useful for clients to start sending this information in RPCs so that it will be available when it is needed.

            eaujames Etienne Aujames added a comment -

            OK, that should work for the ost_io service.
            For now, o_projid is set only for write requests, but we can modify the client to send it for both writes and reads.
            So for now, we could restrict the development to ost_io (that should be easy).
            For the ost and mdt services, however, more work would be required on the client to include the projid in the requests.

            People

              eaujames Etienne Aujames
              adilger Andreas Dilger