<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:26:13 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-9441] Use kernel threads in predictable fashion to confine OS noise</title>
                <link>https://jira.whamcloud.com/browse/LU-9441</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;During benchmarking at large scale on a KNL+Omnipath system (8k nodes) we saw periodic OS noise due to Lustre kernel threads that greatly affects the performance of small-message MPI collectives (MPI_Barrier, MPI_Allreduce with small data sizes, etc).&lt;/p&gt;

&lt;p&gt;The request is for Lustre to use kernel threads in a deterministic manner when the load is low. In this case the ideal usage would have been to use only the thread(s) in CPT 0, which can be set up to run only on KNL tile 0 (cores 0 and 1). Then when no significant I/O is going on, a benchmark thread can be bound to a tile other than 0 and not see any Lustre noise.&lt;/p&gt;

&lt;p&gt;This is more than just a benchmarking setting. It is common for allreduce to be a bottleneck, especially at scale, and HPC applications usually have long phases with no I/O.&lt;/p&gt;</description>
                <environment></environment>
        <key id="45844">LU-9441</key>
            <summary>Use kernel threads in predictable fashion to confine OS noise</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="simmonsja">James A Simmons</assignee>
                                    <reporter username="lfmeadow">Larry Meadows</reporter>
                        <labels>
                    </labels>
                <created>Wed, 3 May 2017 14:16:11 +0000</created>
                <updated>Wed, 17 Feb 2021 23:20:39 +0000</updated>
                            <resolved>Wed, 17 Feb 2021 23:20:38 +0000</resolved>
                                                    <fixVersion>Lustre 2.14.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="194460" author="jgmitter" created="Thu, 4 May 2017 15:56:10 +0000"  >&lt;p&gt;Hi Dmitry,&lt;/p&gt;

&lt;p&gt;Can you please investigate this report?&lt;/p&gt;

&lt;p&gt;Thanks.&lt;br/&gt;
Joe&lt;/p&gt;</comment>
                            <comment id="195251" author="lfmeadow" created="Wed, 10 May 2017 14:10:58 +0000"  >&lt;p&gt;Looking at the Lustre code for ptlrpcd, it appears that every ptlrpcd thread wakes up at least once per second regardless of activity.&lt;br/&gt;
I have profiles from the KNL machine at TACC (Stampede) showing activity on 60 different Lustre threads even with no I/O going on. Kernel tracing also confirms this.&lt;br/&gt;
I need a resolution for this. It seriously affects performance of MPI collectives, especially as the node count increases.&lt;br/&gt;
Please respond with your plans to resolve this issue.&lt;/p&gt;</comment>
                            <comment id="195268" author="jgmitter" created="Wed, 10 May 2017 15:03:46 +0000"  >&lt;p&gt;Hi Larry,&lt;/p&gt;

&lt;p&gt;This is on Dmitry&apos;s plate to investigate and propose potential solutions.  We will post to the ticket as soon as possible.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;br/&gt;
Joe&lt;/p&gt;</comment>
                            <comment id="205189" author="gerrit" created="Fri, 11 Aug 2017 19:28:22 +0000"  >&lt;p&gt;Amir Shehata (amir.shehata@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/28496&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/28496&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9441&quot; title=&quot;Use kernel threads in predictable fashion to confine OS noise&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9441&quot;&gt;&lt;del&gt;LU-9441&lt;/del&gt;&lt;/a&gt; pltrpc: don&apos;t wakeup on 1 second intervals&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 6ddeed98c9f65486743b2a30b7ea0468fb4cf952&lt;/p&gt;</comment>
                            <comment id="225950" author="adilger" created="Fri, 13 Apr 2018 00:21:28 +0000"  >&lt;p&gt;The &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9660&quot; title=&quot;reduce ptlrpcd wakeups on idle system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9660&quot;&gt;&lt;del&gt;LU-9660&lt;/del&gt;&lt;/a&gt; patch has landed, but I&apos;m wondering if there is more we can do here.  One option is batching nodes by NID ranges (i.e. submask) so that they ping at the same time.  For example, if we scheduled pings on clients so that &lt;tt&gt;seconds % interval == (NID &amp;gt;&amp;gt; 9) % interval&lt;/tt&gt; was true, then groups of 2^9 = 512 nodes would ping in the same second.  For jobs that are typically scheduled with &quot;nearby&quot; NIDs, or for small clusters, this would put the ping overhead in one timestep, instead of introducing jitter across all timesteps.&lt;/p&gt;

&lt;p&gt;Landing &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7236&quot; title=&quot;connections on demand&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7236&quot;&gt;&lt;del&gt;LU-7236&lt;/del&gt;&lt;/a&gt; would help this further, as we don&apos;t need pinging if the client has disconnected from the server(s) (at the expense of a small latency hit when it next sends an RPC to the server again).&lt;/p&gt;</comment>
                            <comment id="233648" author="adilger" created="Mon, 17 Sep 2018 23:31:29 +0000"  >&lt;p&gt;Similarly, if a single application was batching up operations for the servers in a coordinated manner between clients, this should be staggered to avoid contention between jobs, and would avoid adding random jitter to all timesteps of a computation.  Only one timestep per period would be impacted by the background work. Something like ping on &lt;tt&gt;time % (hash(jobid) &amp;amp; 15)&lt;/tt&gt; or similar could be used to coordinate autonomously between clients running the same job.&lt;/p&gt;</comment>
                            <comment id="266287" author="simmonsja" created="Fri, 27 Mar 2020 19:34:53 +0000"  >&lt;p&gt;&lt;a href=&quot;https://review.whamcloud.com/#/c/38091/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/38091/&lt;/a&gt;&#160;should limit the pinger noise. Any other ideas?&lt;/p&gt;</comment>
                            <comment id="266300" author="adilger" created="Sat, 28 Mar 2020 05:35:46 +0000"  >&lt;p&gt;Your patch isolates the pinger &lt;em&gt;thread&lt;/em&gt; on the client, but it doesn&apos;t do anything to isolate the network traffic to avoid jitter in the communication.&lt;/p&gt;

&lt;p&gt;I think one of the things that has been lost with the timer changes for the ping (and other) wakeups is coordination between the threads and clients.  On the one hand, you don&apos;t want &lt;b&gt;all&lt;/b&gt; clients and threads to wake at the same time, but some amount of coordination (i.e. a few bad timesteps across all threads at one time and then many good timesteps in a row) is better than each thread having a &lt;b&gt;different&lt;/b&gt; bad timestep on a continual basis.&lt;/p&gt;

&lt;p&gt;As I described in my previous comments here, aligning the client pings by NID or JobID (e.g. &quot;&lt;tt&gt;(seconds &amp;amp; interval) == (hash(jobid) &amp;amp; interval)&lt;/tt&gt;&quot;, where &lt;tt&gt;interval&lt;/tt&gt; is a power-of-two value close to the ping interval) would minimize the number of bad timesteps and maximize good ones.  The synchronicity would depend on how well-sync&apos;d the client clocks are, but it could be isolated to a few msec per tens of seconds.&lt;/p&gt;

&lt;p&gt;The same is true of some of the other timeouts - they could be aligned to happen on a coarse-grained interval (every second or few seconds) rather than randomly, so that when some work needs to be done, there is a bunch, and then it is quiet again as long as possible.&lt;/p&gt;</comment>
                            <comment id="266301" author="adilger" created="Sat, 28 Mar 2020 05:37:12 +0000"  >&lt;p&gt;Some of this was discussed in the context of &lt;a href=&quot;https://review.whamcloud.com/36701&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36701&lt;/a&gt; but I don&apos;t think it was implemented.&lt;/p&gt;</comment>
                            <comment id="271316" author="gerrit" created="Wed, 27 May 2020 17:28:31 +0000"  >&lt;p&gt;James Simmons (jsimmons@infradead.org) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/38730&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38730&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9441&quot; title=&quot;Use kernel threads in predictable fashion to confine OS noise&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9441&quot;&gt;&lt;del&gt;LU-9441&lt;/del&gt;&lt;/a&gt; llite: bind kthread thread to accepted node set&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3b0e24c39a994463b6eaaa8c424c33b3815280ad&lt;/p&gt;</comment>
                            <comment id="273314" author="gerrit" created="Fri, 19 Jun 2020 16:50:29 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/38730/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38730/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9441&quot; title=&quot;Use kernel threads in predictable fashion to confine OS noise&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9441&quot;&gt;&lt;del&gt;LU-9441&lt;/del&gt;&lt;/a&gt; llite: bind kthread thread to accepted node set&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: d6e103e6950d99b88141d4b26982889258c774c5&lt;/p&gt;</comment>
                            <comment id="273372" author="pjones" created="Fri, 19 Jun 2020 22:11:16 +0000"  >&lt;p&gt;Landed for 2.14&lt;/p&gt;</comment>
                            <comment id="273388" author="simmonsja" created="Fri, 19 Jun 2020 23:54:30 +0000"  >&lt;p&gt;Andreas wants the threads to run at an offset based on the jobid. I think that is one more patch.&lt;/p&gt;</comment>
                            <comment id="273403" author="adilger" created="Sat, 20 Jun 2020 08:09:37 +0000"  >&lt;p&gt;James, that could be moved to a separate ticket. &lt;/p&gt;</comment>
                            <comment id="292244" author="pjones" created="Wed, 17 Feb 2021 23:20:39 +0000"  >&lt;p&gt;2.14 is closing so let&apos;s track anything else under a new ticket&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="46688">LU-9660</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="22595">LU-4423</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="32398">LU-7236</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="58107">LU-13258</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Performance</label>
            <label>jitter</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzbtr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>