<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:36:04 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17515] dynamically shrink &apos;conns_per_peer&apos; as needed</title>
                <link>https://jira.whamcloud.com/browse/LU-17515</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;If there is a mismatch between &lt;tt&gt;conns_per_peer&lt;/tt&gt; on a client and server (e.g. different Ethernet network speed across Ethernet switches, or other reasons below) then each side will try to establish a different number of TCP sockets for the peer.  &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17258&quot; title=&quot;socklnd connection type not established upon connection race&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17258&quot;&gt;LU-17258&lt;/a&gt; is handling this by &quot;giving up&quot; on establishing more peer connections, as long as one could be established for each type.&lt;/p&gt;

&lt;p&gt;When this happens, the client should save the &lt;tt&gt;conn_count&lt;/tt&gt; as the new (in memory, until the next unmount/remount) &lt;tt&gt;conns_per_peer&lt;/tt&gt; value for the remote peer NID, so that it doesn&apos;t continue trying to establish more connections whenever there is a problem.&lt;/p&gt;
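&lt;p&gt;As a sketch of that bookkeeping (hypothetical names, not the actual ksocklnd structures), the per-peer limit could simply be clamped to the connection count that was actually established:&lt;/p&gt;

```python
def cap_conns_per_peer(conns_per_peer, established_conn_count):
    # Hypothetical sketch: once the connection race settles with fewer
    # sockets than requested (see LU-17258), remember that count (in
    # memory only) as the effective per-peer limit, never below 1.
    return max(1, min(conns_per_peer, established_conn_count))
```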

&lt;p&gt;Otherwise, the server will have to handle and reject these connections on a regular basis, which may look like a DDoS if 10000 clients are all trying to (re-)establish thousands of connections at mount, recovery, or whenever there is a network hiccup.  This makes the configuration more &quot;hands off&quot;, without the need to tune &lt;tt&gt;conns_per_peer&lt;/tt&gt; explicitly (and in coordination) across all nodes.&lt;/p&gt;


&lt;p&gt;It is likely that the servers also need to dynamically shrink &lt;tt&gt;conns_per_peer&lt;/tt&gt; when they start having a lot of connected peers to avoid the need to explicitly tune this for large clusters (and make us get involved to fix the system after it breaks).  This will (eventually) cause the remote peers to also shrink their connection count over time due to their backoff of failed connections.  I&apos;m thinking something simple like shrinking &lt;tt&gt;conns_per_peer&lt;/tt&gt; by 1 as the number of established peer connections grows past 20000 and again at 40000 (if it hasn&apos;t already started shrinking the number of connections when passing 20000). It couldn&apos;t be set &amp;lt; 1.&lt;/p&gt;
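&lt;p&gt;A minimal sketch of that server-side backoff (the 20000/40000 thresholds are only the ballpark figures suggested above, and the function name is hypothetical):&lt;/p&gt;

```python
def shrink_conns_per_peer(conns_per_peer, total_peer_conns):
    # Hypothetical sketch: back off by 1 past each threshold, so a
    # server passing 40000 connections that never shrank at 20000
    # still ends up 2 below its configured value.  Never go below 1.
    reduction = 0
    if total_peer_conns > 20000:
        reduction = 1
    if total_peer_conns > 40000:
        reduction = 2
    return max(1, conns_per_peer - reduction)
```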

&lt;p&gt;It could print a console message when this is done, suggesting to &quot;&lt;tt&gt;set &apos;options socklnd conns_per_peer=N&apos; in /etc/modprobe.d/lustre.conf&lt;/tt&gt; to avoid this in the future&quot;, but at least the system would continue to work.&lt;/p&gt;

&lt;p&gt;I don&apos;t know if the server would need to actively &lt;b&gt;disconnect&lt;/b&gt; client connections &amp;gt; &lt;tt&gt;conns_per_peer&lt;/tt&gt;, but that might be needed if the number of connections continues to grow (e.g. &amp;gt; 50000).&lt;/p&gt;

&lt;p&gt;It would never increase &lt;tt&gt;conns_per_peer&lt;/tt&gt; until the system is restarted, unless it is explicitly set from userspace again by an admin who really thinks they know better.&lt;/p&gt;

&lt;p&gt;I&apos;ve also filed &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17514&quot; title=&quot;parameter hint for expected number of connected clients&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17514&quot;&gt;LU-17514&lt;/a&gt; for tracking an &quot;&lt;tt&gt;expected_clients&lt;/tt&gt;&quot; tunable that can be used to set a ballpark figure for the number of clients, so that various runtime parameters like &lt;tt&gt;conns_per_peer&lt;/tt&gt; could be set appropriately early in the cluster mount process.  &lt;/p&gt;</description>
                <environment></environment>
        <key id="80711">LU-17515</key>
            <summary>dynamically shrink &apos;conns_per_peer&apos; as needed</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Wed, 7 Feb 2024 22:25:01 +0000</created>
                <updated>Fri, 9 Feb 2024 21:45:14 +0000</updated>
                                            <version>Lustre 2.14.0</version>
                    <version>Lustre 2.16.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="403437" author="adilger" created="Fri, 9 Feb 2024 21:45:14 +0000"  >&lt;p&gt;I was thinking about this further, and I&apos;m wondering if the number of connections per peer should be more dynamic at runtime rather than &quot;establish N connections immediately at mount time&quot;?&lt;/p&gt;

&lt;p&gt;Essentially, &lt;tt&gt;conns_per_peer&lt;/tt&gt; would be considered the &quot;maximum number of peer connections&quot; and ksocklnd would start with only 1 connection per peer (maybe not even per peer NID) until there was a substantial amount of traffic flowing to the peer. Then the node would dynamically add new connections as long as this increased the real message transfer rate, the server did not reject the connection with &lt;tt&gt;&amp;#45;EALREADY&lt;/tt&gt;, and it did not exceed &lt;tt&gt;conns_per_peer&lt;/tt&gt;.  Once the client has finished its I/O burst it would dynamically drop connections again.&lt;/p&gt;
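&lt;p&gt;Roughly, the grow/backoff decision described above could look like this (a sketch with hypothetical names and inputs; the real ksocklnd state machine would be more involved):&lt;/p&gt;

```python
def next_conn_count(current, conns_per_peer, idle,
                    rejected_ealready, throughput_improved):
    # Hypothetical sketch: start small, grow only while an extra socket
    # actually raised the message transfer rate and the peer did not
    # reject with -EALREADY, and drop back down once the burst is over.
    if idle:
        return 1
    if rejected_ealready:
        return current
    if throughput_improved:
        return min(conns_per_peer, current + 1)
    return current
```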

&lt;p&gt;That would allow the &quot;single busy client&quot; case to get peak bandwidth, while the &quot;many clients&quot; case would immediately be handled by 1 initial connection and the server would just not allow it to escalate if it was busy or had too many connections. Depending on how long it takes to establish a new connection, we might even consider dropping idle bulk read/write connections to 0 and only keeping the control connection for pings and small messages. &lt;/p&gt;

&lt;p&gt;I think that behavior gives us the best of both worlds - peak bandwidth when a single client can drive it, without overloading the server when its network/storage is the limit. &lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="80709">LU-17513</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="78758">LU-17258</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="80710">LU-17514</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i04aqv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>