Details

    • New Feature
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • None
    • 17614

    Description

      We'd like to be able to perturb the timing of request processing at the PtlRPC layer in order to simulate high server load and to find and expose timing-related problems.

      Our initial idea is to create an NRS policy that delays request handling for a configurable amount of time. When the policy is started and a request arrives, the policy calculates an offset, within a user-configurable range, from the request arrival time and uses it to set a request "start time". We can use the cfs_binheap implementation to store these requests and sort them by this "start time". Requests are then removed from the binheap for handling only once their start time has been reached or passed. We could also choose to delay only some percentage of requests by allowing the request enqueue to fall back to FIFO (or whatever).
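
The scheme described above can be sketched in a few lines of Python. This is only a toy model for discussion, not Lustre code: Python's heapq stands in for cfs_binheap, and the names DelayQueue, min_delay, max_delay, and pct_delayed are hypothetical and do not come from the patch. Times are plain numbers supplied by the caller.

```python
import heapq
import itertools
import random
from collections import deque


class DelayQueue:
    """Toy model of the proposed delay policy (illustrative only)."""

    def __init__(self, min_delay, max_delay, pct_delayed=100, rng=None):
        if not 0 <= min_delay <= max_delay:
            raise ValueError("need 0 <= min_delay <= max_delay")
        if not 0 <= pct_delayed <= 100:
            raise ValueError("pct_delayed must be between 0 and 100")
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.pct_delayed = pct_delayed
        self.rng = rng or random.Random()
        self._heap = []            # min-heap of (start_time, seq, request)
        self._fifo = deque()       # fallback queue for undelayed requests
        self._seq = itertools.count()  # tie-breaker for equal start times

    def enqueue(self, request, arrival_time):
        """Queue a request; return True if it was delayed, False if it
        fell back to the FIFO."""
        # random() is in [0, 1), so 100 always delays and 0 never does.
        if self.rng.random() * 100 >= self.pct_delayed:
            self._fifo.append(request)
            return False
        offset = self.rng.uniform(self.min_delay, self.max_delay)
        start_time = arrival_time + offset
        heapq.heappush(self._heap, (start_time, next(self._seq), request))
        return True

    def dequeue(self, now):
        """Return the next request eligible for handling, or None."""
        if self._fifo:
            return self._fifo.popleft()
        # Only release the earliest request once its start time is reached.
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None
```

A request enqueued with DelayQueue(min_delay=5, max_delay=10) at arrival time 0 is not returned by dequeue() until `now` reaches its start time, which lies somewhere between 5 and 10. With pct_delayed=0 every request takes the FIFO path and is available immediately.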

      I have an initial implementation mostly done (just need to finish up lprocfs stuff). I appreciate any thoughts on this approach.

          Activity

            [LU-6283] NRS Delay Policy
            spitzcor Cory Spitz added a comment -

            We should open an LUDOC ticket to track any needed doc updates for this policy.

            sarah Sarah Liu added a comment -

            Hello Cory,

            If Chris could upload the test plan in this ticket then I can just close LU-6583. LU-6583 is for tracking the test plan.

            pjones Peter Jones added a comment -

            I'm not really clear on what the intent of creating such tickets is, but JIRA tickets can only be assigned to HPDD engineers atm

            spitzcor Cory Spitz added a comment -

            Hi, Sarah. Perhaps an oversight, but LU-6583 is assigned to HPDD Triage. Should that have been assigned to Chris?

            sarah Sarah Liu added a comment -

            Hello Chris, if NRS Delay is targeted for 2.8, could you please upload the test plan by feature freeze? I created a ticket for tracking the test plan: https://jira.hpdd.intel.com/browse/LU-6583

            Thanks!

            hornc Chris Horn added a comment -

            Hi Andreas,
            I've just pushed my code. My hope is that this is mostly complete, so I should be able to address any review feedback quickly. In any case, I'll be sure to dedicate appropriate resources so this can land for 2.8.


            gerrit Gerrit Updater added a comment -

            Chris Horn (hornc@cray.com) uploaded a new patch: http://review.whamcloud.com/14701
            Subject: LU-6283 ptlrpc: Implement NRS Delay Policy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: db28ca8c9d8008d3bc00e8c1e77b60d107cdaf1d

            adilger Andreas Dilger added a comment -

            Hi Chris, is there still a plan to land this feature for 2.8? As yet I haven't seen any signs that this is being worked on, but if 2.8 is still the target release then we need to start planning for its landing before the feature freeze.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14003/
            Subject: LU-6283 ptlrpc: re-add NRS policy registration symbol exports
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5167bcea2175c751c79173fd934cddcb4cd9fa7b
            nangelinas Nikitas Angelinas added a comment - - edited

            The ptlrpc_nrs_policy_(register|unregister)() functions should allow for loading/unloading policies on demand from modules other than ptlrpc; the symbols used to be exported, but were unexported as part of LU-5829; I uploaded a short patch to re-add them at http://review.whamcloud.com/#/c/14003/.

            I had tested this feature when we first landed NRS and it appeared to work fine.


            gerrit Gerrit Updater added a comment -

            Nikitas Angelinas (nikitas.angelinas@seagate.com) uploaded a new patch: http://review.whamcloud.com/14003
            Subject: LU-6283 ptlrpc: re-add NRS policy registration symbol exports
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6b552ecf150edad57e14ae1795d809fe4da3fd5c

            People

              hornc Chris Horn
              hornc Chris Horn