<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:22:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15909] Track Peer/Network credits at peer net/net level - use for path selection</title>
                <link>https://jira.whamcloud.com/browse/LU-15909</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;If an LNet peer has multiple networks configured, we currently round robin between them regardless of the relative capabilities and capacities of each network. This can lead to a situation where 1/(# nets) of the traffic is sent via a slow protocol (like tcp) when it would be better to use a faster protocol.&lt;/p&gt;

&lt;p&gt;This situation can be remedied in Lustre 2.15 by defining User Defined Selection Policy (UDSP) net selection rules, which tell LNet to prioritize some nets over others. But this may still not be ideal.&lt;/p&gt;

&lt;p&gt;Suppose we have a peer with two gni interfaces and two Ethernet (tcp) interfaces:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;    - net type: tcp
      local NI(s):
        - nid: 172.18.2.3@tcp
          status: up
          interfaces:
              0: enp65s0
        - nid: 172.18.2.4@tcp
          status: up
          interfaces:
              0: enp65s1
    - net type: gni
      local NI(s):
        - nid: 17@gni
          status: up
        - nid: 18@gni
          status: up
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Without UDSP, half of all traffic will be sent over the tcp network, which is much slower than the gni network.&lt;/p&gt;

&lt;p&gt;With UDSP, we can add a rule so that all traffic is sent over the gni network unless there is a problem and the tcp interfaces have a higher health value than the gni interfaces.&lt;/p&gt;

&lt;p&gt;This may seem ideal, but it could be the case that all available resources on the gni interfaces are consumed. In this case, LNet will queue messages on the gni interfaces until a resource becomes available. Meanwhile, the tcp interfaces may be completely idle.&lt;/p&gt;

&lt;p&gt;I propose to add resource tracking at the local net/peer net level. This will allow LNet to choose a &lt;em&gt;network&lt;/em&gt; (local or peer) based on the resources available in that network (which are simply the sum of the resources available to the NIs belonging to the network).&lt;/p&gt;
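
&lt;p&gt;A minimal sketch of the idea, not LNet code. It assumes each net's available credits are the sum of its NIs' credits and that the net with the most available credits is preferred. The function names and credit counts are hypothetical, and the real selection logic may weigh other factors such as health or UDSP priority.&lt;/p&gt;

```python
# Hypothetical model: each net maps NI names to currently available credits.
def net_credits(ni_credits):
    """Credits available to a net: the sum over the NIs that belong to it."""
    return sum(ni_credits.values())


def pick_net(nets):
    """Return the name of the net with the most available credits.

    Assumed policy for illustration; ties go to the net listed first.
    """
    return max(nets, key=lambda name: net_credits(nets[name]))


# NIDs from the example listing; the credit counts are made up.
idle = {
    "gni": {"17@gni": 16, "18@gni": 16},
    "tcp": {"172.18.2.3@tcp": 8, "172.18.2.4@tcp": 8},
}
saturated = {
    "gni": {"17@gni": 0, "18@gni": 0},
    "tcp": {"172.18.2.3@tcp": 8, "172.18.2.4@tcp": 8},
}

print(pick_net(idle))       # gni
print(pick_net(saturated))  # tcp
```

&lt;p&gt;With free gni credits the faster net is chosen. Once they are exhausted, traffic moves to tcp instead of queueing behind gni.&lt;/p&gt;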

&lt;p&gt;This should allow us to get most of the benefit of the UDSP network selection rule but also enable us to fully leverage all network capacity in a more intelligent manner than round-robin across all nets.&lt;/p&gt;

&lt;p&gt;A side benefit is that we can get rid of the round robin behavior altogether. Round robin relies on sequence numbers, and there is a potential scenario where the round robin behavior can be broken.&lt;/p&gt;
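
&lt;p&gt;The breakage can be reproduced in a small model. The sketch below is a simplification for illustration, not LNet code. It assumes one shared unsigned 32-bit counter that is incremented on every send, that the chosen NI is stamped with the new counter value, and that the NI with the lowest stamp is chosen next. The names simulate, ni_a and ni_b are hypothetical.&lt;/p&gt;

```python
UINT32_MOD = 2**32  # sequence numbers are unsigned 32-bit values


def simulate(start, sends, ni_names=("ni_a", "ni_b")):
    """Return the NI picked for each send in a simplified round-robin model.

    Assumptions (not taken from the LNet source): one shared counter is
    incremented per send, the chosen NI is stamped with the new counter
    value, and the NI with the lowest stamp is chosen next.
    """
    counter = start
    seq = {name: start for name in ni_names}
    picks = []
    for _ in range(sends):
        chosen = min(ni_names, key=lambda name: seq[name])
        counter = (counter + 1) % UINT32_MOD  # wraps from UINT_MAX to 0
        seq[chosen] = counter
        picks.append(chosen)
    return picks


# Far from the wrap point, sends alternate between the two NIs.
print(simulate(0, 6))
# Two sends before the wrap: ni_a is stamped UINT_MAX, ni_b is stamped 0,
# and ni_b then wins every later comparison.
print(simulate(UINT32_MOD - 2, 6))
```

&lt;p&gt;In this model the second run picks ni_a once and ni_b for every send after that.&lt;/p&gt;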

&lt;p&gt;On every send to a peer we increment a sequence number for the source interface and a sequence number for the peer. The sequence numbers are unsigned 32-bit integers, so if a sequence number happens to wrap in just the right way, we can end up in a situation where the sequence number of one NI is UINT_MAX but the next send sets the sequence number of another NI to 0. All future sends are then funneled to the NI with the lower sequence number.&lt;/p&gt;</description>
                <environment></environment>
        <key id="70608">LU-15909</key>
            <summary>Track Peer/Network credits at peer net/net level - use for path selection</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="hornc">Chris Horn</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                    </labels>
                <created>Thu, 2 Jun 2022 20:07:28 +0000</created>
                <updated>Fri, 3 Jun 2022 19:53:21 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="336716" author="gerrit" created="Fri, 3 Jun 2022 19:53:17 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47525&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47525&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Add peer NI send lists&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3047ef173f50333f6867d591b3b219f5daace548&lt;/p&gt;</comment>
                            <comment id="336717" author="gerrit" created="Fri, 3 Jun 2022 19:53:18 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47526&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47526&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Use lnet_send_data for NI selection&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ef7d4d75ae0938d5097dc5c5ec13c06c5d84ed8f&lt;/p&gt;</comment>
                            <comment id="336718" author="gerrit" created="Fri, 3 Jun 2022 19:53:19 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47527&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47527&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Correct net selection for router ping&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 32e781b1f6652a8520e47bae9917fba153020ab5&lt;/p&gt;</comment>
                            <comment id="336719" author="gerrit" created="Fri, 3 Jun 2022 19:53:19 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47528&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47528&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Add peer net send lists&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a28ed176663bbe7c57a1c7f4b99aeae19a746f2f&lt;/p&gt;</comment>
                            <comment id="336720" author="gerrit" created="Fri, 3 Jun 2022 19:53:20 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47529&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47529&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Use net/peer net credits for net selection&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 464e85adbce415db0187de52da009480d2d2ad70&lt;/p&gt;</comment>
                            <comment id="336721" author="gerrit" created="Fri, 3 Jun 2022 19:53:21 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47530&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47530&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15909&quot; title=&quot;Track Peer/Network credits at peer net/net level - use for path selection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15909&quot;&gt;LU-15909&lt;/a&gt; lnet: Refactor lnet_find_route_locked&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 2c26ca984c47618336df4ca258c62ade31b8c8dd&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02ra7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>