<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:58:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6283] NRS Delay Policy</title>
                <link>https://jira.whamcloud.com/browse/LU-6283</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We&apos;d like to be able to perturb the timing of request processing at the PtlRPC layer, with the goal of simulating high server load and finding and exposing timing-related problems.&lt;/p&gt;

&lt;p&gt;Our initial idea is to create an NRS policy that will delay request handling for some configurable amount of time. When the policy is started and a request arrives, the policy will calculate an offset, within a defined, user-configurable range, from the request arrival time to set a request &quot;start time&quot;. We can use the cfs_binheap implementation to store these requests and sort them based on this &quot;start time&quot;. Requests are then removed from the binheap for handling only once we&apos;ve reached/passed their start time. We could also choose to only delay some % of requests by allowing the request enqueue to fall back to FIFO (or whatever).&lt;/p&gt;
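
&lt;p&gt;The idea above can be sketched as a toy model. This is a minimal illustration in Python, not the ptlrpc implementation: it uses the standard heapq module as a stand-in for cfs_binheap, and the names DelayQueue, min_delay, max_delay and delay_pct are hypothetical, chosen only for this sketch.&lt;/p&gt;

```python
import heapq
import itertools
import random


class DelayQueue:
    """Toy model of the proposed delay policy (hypothetical, not Lustre code)."""

    def __init__(self, min_delay, max_delay, delay_pct=100, rng=None):
        if min_delay > max_delay:
            raise ValueError("min_delay must not exceed max_delay")
        if delay_pct > 100 or 0 > delay_pct:
            raise ValueError("delay_pct must be between 0 and 100")
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.delay_pct = delay_pct
        self.rng = rng if rng is not None else random.Random()
        self._heap = []                # entries: (start_time, seq, request)
        self._seq = itertools.count()  # tie-breaker for equal start times

    def enqueue(self, request, arrival_time):
        """Queue the request with a delay; return False to fall back to FIFO."""
        # Only delay_pct percent of requests are delayed.
        if self.rng.randrange(100) >= self.delay_pct:
            return False
        # Start time = arrival time + random offset within the configured range.
        offset = self.rng.randint(self.min_delay, self.max_delay)
        entry = (arrival_time + offset, next(self._seq), request)
        heapq.heappush(self._heap, entry)
        return True

    def dequeue(self, now):
        """Return the earliest request whose start time has passed, else None."""
        if self._heap and now >= self._heap[0][0]:
            return heapq.heappop(self._heap)[2]
        return None
```

&lt;p&gt;In this sketch, with min_delay=1 and max_delay=10, a request enqueued at time 0 cannot be dequeued at time 0 and is always available by time 10.&lt;/p&gt;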

&lt;p&gt;I have an initial implementation mostly done (just need to finish up lprocfs stuff). I appreciate any thoughts on this approach.&lt;/p&gt;</description>
                <environment></environment>
        <key id="28846">LU-6283</key>
            <summary>NRS Delay Policy</summary>
                <type id="2" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11311&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="hornc">Chris Horn</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                            <label>cea</label>
                            <label>patch</label>
                    </labels>
                <created>Wed, 25 Feb 2015 20:16:28 +0000</created>
                <updated>Wed, 29 May 2019 17:20:42 +0000</updated>
                            <resolved>Sun, 23 Apr 2017 03:57:37 +0000</resolved>
                                                    <fixVersion>Lustre 2.10.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>17</watches>
                                                                            <comments>
                            <comment id="108064" author="adilger" created="Thu, 26 Feb 2015 06:42:16 +0000"  >&lt;p&gt;I think this is a very interesting way to do fault injection and load simulation.&lt;/p&gt;

&lt;p&gt;If you are working in the NRS code, there are a number of things that could be cleaned up.  &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2667&quot; title=&quot;move NRS structures/definitions from lustre_net.h to new lustre_nrs.h header&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2667&quot;&gt;&lt;del&gt;LU-2667&lt;/del&gt;&lt;/a&gt; is one example.&lt;/p&gt;

&lt;p&gt;Also, having the ability to dynamically load NRS modules via a register and deregister function would allow the testing policy to only be loaded when needed.  This was discussed several times in the context of the original NRS patches, but I&apos;m unable to find those comments right now.&lt;/p&gt;</comment>
                            <comment id="109126" author="gerrit" created="Fri, 6 Mar 2015 21:45:09 +0000"  >&lt;p&gt;Nikitas Angelinas (nikitas.angelinas@seagate.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14003&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14003&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6283&quot; title=&quot;NRS Delay Policy&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6283&quot;&gt;&lt;del&gt;LU-6283&lt;/del&gt;&lt;/a&gt; ptlrpc: re-add NRS policy registration symbol exports&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 6b552ecf150edad57e14ae1795d809fe4da3fd5c&lt;/p&gt;</comment>
                            <comment id="109130" author="nangelinas" created="Fri, 6 Mar 2015 21:54:13 +0000"  >&lt;p&gt;The ptlrpc_nrs_policy_(register|unregister)() functions should allow for loading/unloading policies on demand from modules other than ptlrpc; the symbols used to be exported, but were unexported as part of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5829&quot; title=&quot;too many EXPORT_SYMBOL in code&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5829&quot;&gt;&lt;del&gt;LU-5829&lt;/del&gt;&lt;/a&gt;; I uploaded a short patch to re-add them at  &lt;a href=&quot;http://review.whamcloud.com/#/c/14003/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/14003/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I had tested this feature when we first landed NRS and it appeared to work fine.&lt;/p&gt;</comment>
                            <comment id="109966" author="gerrit" created="Wed, 18 Mar 2015 11:20:26 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/14003/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14003/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6283&quot; title=&quot;NRS Delay Policy&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6283&quot;&gt;&lt;del&gt;LU-6283&lt;/del&gt;&lt;/a&gt; ptlrpc: re-add NRS policy registration symbol exports&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 5167bcea2175c751c79173fd934cddcb4cd9fa7b&lt;/p&gt;</comment>
                            <comment id="114450" author="adilger" created="Thu, 7 May 2015 02:01:25 +0000"  >&lt;p&gt;Hi Chris, is there still a plan to land this feature for 2.8?  As yet I haven&apos;t seen any signs that this is being worked on, but if 2.8 is still the target release then we need to start planning for its landing before the feature freeze.&lt;/p&gt;</comment>
                            <comment id="114454" author="gerrit" created="Thu, 7 May 2015 03:49:12 +0000"  >&lt;p&gt;Chris Horn (hornc@cray.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14701&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14701&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6283&quot; title=&quot;NRS Delay Policy&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6283&quot;&gt;&lt;del&gt;LU-6283&lt;/del&gt;&lt;/a&gt; ptlrpc: Implement NRS Delay Policy&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: db28ca8c9d8008d3bc00e8c1e77b60d107cdaf1d&lt;/p&gt;</comment>
                            <comment id="114455" author="hornc" created="Thu, 7 May 2015 03:54:00 +0000"  >&lt;p&gt;Hi Andreas,&lt;br/&gt;
I&apos;ve just pushed my code. My hope is that this is mostly complete, so I should be able to address any review feedback quickly. In any case, I&apos;ll be sure to dedicate appropriate resources so this can land for 2.8.&lt;/p&gt;</comment>
                            <comment id="114649" author="sarah" created="Thu, 7 May 2015 21:19:26 +0000"  >&lt;p&gt;Hello Chris, if NRS Delay is targeted for 2.8, could you please upload the test plan by feature freeze? I created a ticket for tracking the test plan:  &lt;a href=&quot;https://jira.hpdd.intel.com/browse/LU-6583&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jira.hpdd.intel.com/browse/LU-6583&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="120061" author="spitzcor" created="Wed, 1 Jul 2015 16:22:56 +0000"  >&lt;p&gt;Hi, Sarah.  Perhaps an oversight, but LU-6583 is assigned to HPDD Triage.  Should that have been assigned to Chris?&lt;/p&gt;</comment>
                            <comment id="120142" author="pjones" created="Thu, 2 Jul 2015 13:26:00 +0000"  >&lt;p&gt;I&apos;m not really clear on what the intent of creating such tickets is, but JIRA tickets can only be assigned to HPDD engineers atm&lt;/p&gt;</comment>
                            <comment id="120171" author="sarah" created="Thu, 2 Jul 2015 16:50:30 +0000"  >&lt;p&gt;Hello Cory, &lt;/p&gt;

&lt;p&gt;If Chris could upload the test plan in this ticket then I can just close LU-6583. LU-6583 is for tracking the test plan.&lt;/p&gt;</comment>
                            <comment id="135088" author="spitzcor" created="Thu, 3 Dec 2015 15:40:36 +0000"  >&lt;p&gt;We should open an LUDOC ticket to track any needed doc updates for this policy.&lt;/p&gt;</comment>
                            <comment id="146521" author="bevans" created="Tue, 22 Mar 2016 19:46:14 +0000"  >&lt;p&gt;Also need something to add the design to:  &lt;a href=&quot;http://wiki.lustre.org/Projects&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://wiki.lustre.org/Projects&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="187413" author="hornc" created="Tue, 7 Mar 2017 23:50:12 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LUDOC-366&quot; title=&quot;Document NRS Delay&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LUDOC-366&quot;&gt;&lt;del&gt;LUDOC-366&lt;/del&gt;&lt;/a&gt; opened to track doc changes.&lt;/p&gt;</comment>
                            <comment id="191221" author="jamesanunez" created="Fri, 7 Apr 2017 17:29:48 +0000"  >&lt;p&gt;Chris - I think the only thing we are waiting on to land this feature is some kind of a feature test plan or test report. We are looking for some indication of what testing you have done to verify that this feature is working correctly and any tests added to the Lustre test suites to make sure this feature functions correctly in the future. &lt;/p&gt;

&lt;p&gt;Please attach a test plan/report to this ticket and, I think, the feature can move ahead for Lustre 2.10. &lt;/p&gt;

&lt;p&gt;If there are any questions about what we are looking for, you are welcome to contact me.&lt;/p&gt;

&lt;p&gt;Thanks, James&lt;/p&gt;</comment>
                            <comment id="192577" author="hornc" created="Tue, 18 Apr 2017 20:44:48 +0000"  >&lt;p&gt;test_77l() is added as part of the feature implementation. It provides a good example of how other test cases might be written for other ptlrpc services. It verifies NRS delay works correctly for the ost_io service by generating a specific number of write RPCs, measuring the time it takes for the I/O to complete and comparing that with the amount of delay we configured for the ost_io service.&lt;/p&gt;

&lt;p&gt;I did similar testing in development using a range of delay values to ensure that the random delay logic was working as expected. For example:&lt;/p&gt;

&lt;p&gt;With a range of delay values between 1 and 10:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param ost.OSS.ost_io.nrs_policies=delay \
				       ost.OSS.ost_io.nrs_delay_min=1 \
				       ost.OSS.ost_io.nrs_delay_max=10 \
				       ost.OSS.ost_io.nrs_delay_pct=100
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We&apos;d expect a 1MB write to take at least 1 second, and not much more than 10 seconds (for an idle filesystem it should effectively be 10-11 seconds).&lt;/p&gt;

&lt;p&gt;I also tested in development that the tunables properly sanitize their inputs, for example when setting min &amp;gt; max, or delay_pct &amp;lt; 0 or &amp;gt; 100.&lt;/p&gt;

&lt;p&gt;This was all pretty informal, so I do not have data on the results of that testing, except to say that I fixed any bugs that I found.&lt;/p&gt;

&lt;p&gt;If the community feels that additional unit testing of the sort I&apos;ve described is warranted then I can surely work to generate additional unit tests for sanityn.&lt;/p&gt;</comment>
                            <comment id="193126" author="gerrit" created="Sun, 23 Apr 2017 03:30:31 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/14701/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/14701/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6283&quot; title=&quot;NRS Delay Policy&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6283&quot;&gt;&lt;del&gt;LU-6283&lt;/del&gt;&lt;/a&gt; ptlrpc: Implement NRS Delay Policy&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 588831e9eac38b8514f2a3e71516b44fa7c4bcce&lt;/p&gt;</comment>
                            <comment id="193135" author="pjones" created="Sun, 23 Apr 2017 03:57:37 +0000"  >&lt;p&gt;Landed for 2.10&lt;/p&gt;</comment>
                            <comment id="247982" author="cfaber" created="Wed, 29 May 2019 17:17:06 +0000"  >&lt;p&gt;Should this be closed?&lt;/p&gt;</comment>
                            <comment id="247986" author="pjones" created="Wed, 29 May 2019 17:20:42 +0000"  >&lt;p&gt;ok I am going to stop responding to these now. Hopefully you check your email before I get too many more of these...&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="11119">LU-398</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="44573">LUDOC-366</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx727:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>17614</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>