<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:55:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12815] Create multiple TCP sockets per SockLND</title>
                <link>https://jira.whamcloud.com/browse/LU-12815</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;For high-bandwidth Ethernet interfaces (e.g. 100GigE), it would be useful to create multiple TCP connections per interface for bulk transfers in order to maximize performance (i.e. &lt;tt&gt;conns_per_peer=4&lt;/tt&gt; for socklnd in addition to o2iblnd).  We already have three separate TCP connections per LND - read, write, and small message.&lt;/p&gt;

&lt;p&gt;For large clusters this may be problematic because of the number of TCP connections to a server, but for smaller configurations this could be very useful.&lt;/p&gt;</description>
                <environment></environment>
        <key id="57020">LU-12815</key>
            <summary>Create multiple TCP sockets per SockLND</summary>
                <type id="7" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/task_agile.png">Technical task</type>
                            <parent id="61315">LU-14064</parent>
                                    <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="ashehata">Amir Shehata</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>performance</label>
                    </labels>
                <created>Fri, 27 Sep 2019 13:52:46 +0000</created>
                <updated>Tue, 20 Sep 2022 12:26:39 +0000</updated>
                            <resolved>Wed, 18 Aug 2021 12:44:10 +0000</resolved>
                                                    <fixVersion>Lustre 2.15.0</fixVersion>
                    <fixVersion>Lustre 2.12.10</fixVersion>
                                        <due></due>
                            <votes>2</votes>
                                    <watches>17</watches>
                                                                            <comments>
                            <comment id="255501" author="ashehata" created="Fri, 27 Sep 2019 18:47:14 +0000"  >&lt;p&gt;Yes, the conns_per_peer would be a good parameter to use.&lt;/p&gt;

&lt;p&gt;I looked at ksocklnd, and currently there can be only one unique route between two peers. This in effect translates to one TCP connection between the peers. I don&apos;t see a reason, though, why we can&apos;t create multiple TCP connections per peer; when we select which connection to send from, we can iterate over these connections.&lt;/p&gt;

&lt;p&gt;However, come to think of it, there is a way to do it, albeit with a bit more configuration. What they can do is create multiple virtual interfaces that use the same physical interface, then use Multi-Rail to group all these connections. The result will be that socklnd creates multiple connections, one to each of the virtual interfaces.&lt;/p&gt;

&lt;p&gt;Ex:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
ifconfig eth0:0 &amp;lt;ip&amp;gt;
ifconfig eth0:1 &amp;lt;ip&amp;gt;
ifconfig eth0:2 &amp;lt;ip&amp;gt;
ifconfig eth0:3 &amp;lt;ip&amp;gt;

lnetctl net add --net tcp --if eth0:0,eth0:1,eth0:2,eth0:3&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The rest will be taken care of by the Multi-Rail algorithm.&lt;/p&gt;

&lt;p&gt;Would that be a sufficient solution?&lt;/p&gt;</comment>
                            <comment id="255536" author="sihara" created="Sat, 28 Sep 2019 04:25:53 +0000"  >&lt;p&gt;Yes, that workaround is exactly what I did, and I confirmed it bumps up performance &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; but setting up four logical interfaces on all clients was very annoying, and it is not a good idea to create logical interfaces only for Lustre. conns_per_peer would simplify the configuration and improve performance.&lt;/p&gt;</comment>
                            <comment id="281916" author="ashehata" created="Fri, 9 Oct 2020 18:13:26 +0000"  >&lt;p&gt;Serguei is currently looking at this.&lt;/p&gt;</comment>
                            <comment id="282274" author="chunteraa" created="Wed, 14 Oct 2020 22:56:35 +0000"  >&lt;p&gt;Some Ethernet drivers allow alternate hashing methods to better utilize adapter receive queues for a small number of incoming TCP streams.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.mellanox.com/display/MLNXOFEDv473290/RSS+Support&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mellanox.com/display/MLNXOFEDv473290/RSS+Support&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="287315" author="rdruon" created="Fri, 11 Dec 2020 13:46:23 +0000"  >&lt;p&gt;Any update on this issue?&lt;/p&gt;</comment>
                            <comment id="287352" author="ashehata" created="Fri, 11 Dec 2020 18:59:08 +0000"  >&lt;p&gt;It&apos;s currently under development.&lt;/p&gt;</comment>
                            <comment id="288083" author="gerrit" created="Sat, 19 Dec 2020 05:25:16 +0000"  >&lt;p&gt;Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/41056&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41056&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: add conns_per_peer parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d69314d930b84927bd96351c185b7cef42073d3b&lt;/p&gt;</comment>
                            <comment id="289098" author="gerrit" created="Fri, 8 Jan 2021 22:24:47 +0000"  >&lt;p&gt;James Simmons (jsimmons@infradead.org) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/41181&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41181&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: add conns_per_peer parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 88dd982736c6e681d395c8d5509f030aec1a3289&lt;/p&gt;</comment>
                            <comment id="289156" author="adilger" created="Mon, 11 Jan 2021 00:39:32 +0000"  >&lt;p&gt;Amir or Serguei, can you please send an email to lustre-discuss (CC lustre-devel) asking if anyone there (or their users) is using the &lt;tt&gt;use_tcp_bonding&lt;/tt&gt; option in production?&lt;/p&gt;</comment>
                            <comment id="289159" author="adilger" created="Mon, 11 Jan 2021 02:54:23 +0000"  >&lt;p&gt;Does it make sense (in a later patch) to dynamically tune the &lt;tt&gt;conns_per_peer&lt;/tt&gt; value depending on the network performance?  It is always better to avoid the need for tuning if possible.&lt;/p&gt;

&lt;p&gt;Is it possible to detect from socklnd what the underlying Ethernet device is (e.g. 100GigE) and set conns_per_peer automatically (either at startup or at runtime) unless it is otherwise specified?&lt;/p&gt;</comment>
                            <comment id="289990" author="adilger" created="Thu, 21 Jan 2021 00:04:29 +0000"  >&lt;p&gt;Running &lt;tt&gt;ethtool&lt;/tt&gt; on the Ethernet device reports the available and current interface speed, so it seems at least possible that we could get this same information in &lt;tt&gt;ksocklnd.c&lt;/tt&gt; to set the default value of &lt;tt&gt;conns_per_peer&lt;/tt&gt; based on the link speed:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# ethtool enp0s3
Settings for enp0s3:
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Speed: 1000Mb/s
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Running &lt;tt&gt;strace ethtool&lt;/tt&gt; shows it is calling &lt;tt&gt;ioctl(SIOCETHTOOL)&lt;/tt&gt;, which also is accessible internally via &lt;tt&gt;dev_ioctl()&lt;/tt&gt;:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ethtool_ioctl(struct net *net, struct compat_ifreq __user *ifr32)
{
        :
        ret = dev_ioctl(net, SIOCETHTOOL, &amp;amp;ifr, NULL);

&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; dev_ioctl(struct net *net, unsigned &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; cmd, struct ifreq *ifr, bool *need_copyout)
{
        :
        &lt;span class=&quot;code-keyword&quot;&gt;case&lt;/span&gt; SIOCETHTOOL:
                dev_load(net, ifr-&amp;gt;ifr_name);
                rtnl_lock();
                ret = dev_ethtool(net, ifr);
                rtnl_unlock();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;but there may be some more fine-grained method in the kernel to determine the current speed of the interface.&lt;/p&gt;

&lt;p&gt;I&apos;m thinking, based on the stats from Shuichi above and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14293&quot; title=&quot;Poor lnet/ksocklnd(?) performance on 2x100G bonded ethernet&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14293&quot;&gt;&lt;del&gt;LU-14293&lt;/del&gt;&lt;/a&gt;, we want about &lt;tt&gt;conns_per_peer=4-6&lt;/tt&gt; for 100GbE. The following candidate formulas would provide a reasonable default value for &lt;tt&gt;conns_per_peer&lt;/tt&gt; (calculations done manually):&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;Speed&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&lt;tt&gt;ilog2(Gbps)&lt;/tt&gt;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&lt;tt&gt;ilog2(Gbps) / 2&lt;/tt&gt;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&lt;tt&gt;(ilog2(Gbps) + 1) / 2&lt;/tt&gt;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&lt;tt&gt;ilog2(Gbps / 2)&lt;/tt&gt;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&lt;tt&gt;ilog2(Gbps) / 2 + 1&lt;/tt&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;b&gt;1Gbps&lt;/b&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;8Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;b&gt;10Gbps&lt;/b&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;16Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;32Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;b&gt;50Gbps&lt;/b&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;64Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;b&gt;100Gbps&lt;/b&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;128Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;7&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;&lt;b&gt;200Gbps&lt;/b&gt;&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;7&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;3&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;256Gbps&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;8&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;4&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;7&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;I believe &lt;tt&gt;conns_per_peer=0&lt;/tt&gt; and &lt;tt&gt;conns_per_peer=1&lt;/tt&gt; are functionally equivalent. In any case, I think either the third, the fourth (&quot;log2(speed in multiples of 2Gbps)&quot;), or the fifth (&quot;log4(Gbps)+1&quot;) formula provides a good starting point. That would give us &lt;tt&gt;conns_per_peer=4/5&lt;/tt&gt; at 100Gbps. We should probably prefer the lower value (&lt;tt&gt;(ilog2(Gbps) + 1) / 2&lt;/tt&gt; or &lt;tt&gt;ilog2(Gbps) / 2 + 1&lt;/tt&gt;), since we have to balance single-client performance against the number of sockets created to a server. Users can always specify a better value if they have a preference, but typically they will never touch it, so it is better to have &lt;em&gt;something&lt;/em&gt; useful (even if not perfect for every situation) rather than repeated complaints about 100GbE being slow.&lt;/p&gt;</comment>
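As a quick illustrative sketch of the last candidate formula above (ilog2(Gbps) / 2 + 1): the helper name here is hypothetical, and Python's int.bit_length() is just a convenient way to compute ilog2; this is only a model of the formula under discussion, not the ksocklnd implementation.

```python
def default_conns_per_peer(speed_gbps):
    """Sketch of the ilog2(Gbps) / 2 + 1 candidate formula (hypothetical helper).

    ilog2(n) is floor(log2(n)); for a positive int it equals n.bit_length() - 1.
    """
    ilog2 = max(int(speed_gbps), 1).bit_length() - 1
    return ilog2 // 2 + 1

# Matches the manually computed table column above:
# 1 Gbps -> 1, 10 Gbps -> 2, 50 Gbps -> 3, 100 Gbps -> 4, 256 Gbps -> 5
```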
                            <comment id="290035" author="chunteraa" created="Thu, 21 Jan 2021 15:09:44 +0000"  >&lt;p&gt;The Linux kernel implements features to distribute network packets/TCP fragments over multiple CPU cores. Usually the distribution is decided via a &quot;hash function&quot; based on the incoming TCP port and sender IP address.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;https://www.kernel.org/doc/Documentation/networking/scaling.txt&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.kernel.org/doc/Documentation/networking/scaling.txt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In theory a network interface should achieve line rate on a single TCP port with streams from multiple IP addresses.&lt;/p&gt;</comment>
                            <comment id="290041" author="simmonsja" created="Thu, 21 Jan 2021 16:52:46 +0000"  >&lt;p&gt;For ORNL we found conns_per_peer=8 gave the best results for 100Gbps.&lt;/p&gt;</comment>
                            <comment id="290080" author="adilger" created="Thu, 21 Jan 2021 20:53:24 +0000"  >&lt;p&gt;Chris, the problem here is that without &lt;tt&gt;conns_per_peer&lt;/tt&gt; there is only a single port for socklnd on a single IP address. Creating multiple sockets on the client avoids that issue, and is much less complex than creating multiple virtual interfaces.  With multiple clients a server is less likely to have a problem, but if only a single client is reading/writing (which some workloads do), then performance would again be limited without multiple sockets.&lt;/p&gt;</comment>
                            <comment id="290082" author="adilger" created="Thu, 21 Jan 2021 21:04:25 +0000"  >&lt;p&gt;James, looking at &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14293&quot; title=&quot;Poor lnet/ksocklnd(?) performance on 2x100G bonded ethernet&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14293&quot;&gt;&lt;del&gt;LU-14293&lt;/del&gt;&lt;/a&gt; it seems like the peak performance could be hit with 6 connections, but no results were shown between 4 and 8.  Also note that socklnd creates 3x TCP connections per peer (read, write, small message), to allow different tunings and avoid congestion.&lt;/p&gt;

&lt;p&gt;It might be worthwhile to see whether multiple small message connections are useful or not, so that the total connections would be (2x conns_per_peer + 1), but that optimization is probably only needed if we start running out of ports (2500 clients with 8x3 connections). I&apos;m hoping nobody is building a giant cluster with that many Ethernet cards and not using RoCE or similar.&lt;/p&gt;</comment>
                            <comment id="290084" author="simmonsja" created="Thu, 21 Jan 2021 21:14:31 +0000"  >&lt;p&gt;You mean like our next machine &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="290094" author="ssmirnov" created="Thu, 21 Jan 2021 23:47:49 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;To clarify, the socklnd conns_per_peer currently results in 2 x conns_per_peer + 1 TCP connections per peer (e.g. conns_per_peer=4 gives 9 TCP connections), as only the bulk_in and bulk_out connection types are multiplied.&lt;/p&gt;</comment>
                            <comment id="290132" author="chunteraa" created="Fri, 22 Jan 2021 15:44:51 +0000"  >&lt;blockquote&gt;&lt;p&gt;Chris, the problem here is that without conns_per_peer there is only a single port for socklnd on a single IP address. Creating multiple sockets on the client avoids that issue, and is much less complex than creating multiple virtual interfaces. &lt;br/&gt;
 With multiple clients a server is less likely to have a problem, but if only a single client is reading/writing (which some workloads do), then it would again be limited performance without multiple sockets.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Thanks Andreas; as stated in the description, this feature is intended for systems with a small number of clients. It does not appear to benefit systems at scale.&lt;/p&gt;

&lt;p&gt;Is it possible to know ahead of time which TCP ports &lt;tt&gt;conns_per_peer&lt;/tt&gt; will use (i.e. to adjust firewalls)?&lt;/p&gt;

&lt;p&gt;The problem described, insufficient incoming TCP streams to achieve network line rate, is not unique to Lustre.&lt;/p&gt;</comment>
                            <comment id="290208" author="adilger" created="Sat, 23 Jan 2021 11:08:27 +0000"  >&lt;p&gt;The target port for connections will always be the same, 988, as it is for all new connections, and the actually-assigned source port is mostly irrelevant.  This is not any different from multiple clients connecting separately.&lt;/p&gt;</comment>
                            <comment id="290252" author="degremoa" created="Mon, 25 Jan 2021 10:20:40 +0000"  >&lt;blockquote&gt;&lt;p&gt;which it is for all new connections, and the actually-assigned port is mostly irrelevant. This is not any different from multiple clients connecting separately.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;If I remember correctly, this is not totally true. There is still a rare behavior where a TCP socket is broken for some reason and the first node to need it is the server, trying to send an ldlm callback. When the server detects that the TCP connection for the reverse import is broken, it re-establishes it itself, creating a server-&amp;gt;client socket.&lt;/p&gt;

&lt;p&gt;(If the client needs this connection first (i.e. obd_ping), it will re-establish it normally and the server will use this connection as usual.) This likely impacts the metadata socket, not the bulk I/O sockets.&lt;/p&gt;</comment>
                            <comment id="291559" author="gerrit" created="Tue, 9 Feb 2021 22:02:00 +0000"  >&lt;p&gt;Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/41463&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41463&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: allow dynamic setting of conns_per_peer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: fe3db0979f34afc5139fdc1b6b9ab6eace5cfde4&lt;/p&gt;</comment>
                            <comment id="291688" author="adilger" created="Wed, 10 Feb 2021 21:53:17 +0000"  >&lt;blockquote&gt;
&lt;p&gt;There is still this rare behavior where a TCP socket is broken for some reasons and the first node to need it is the server, trying to send a ldlm callback. When the server detects the tcp connection for the reverse import is broken, it re-establish it itself, creating a server-&amp;gt;client socket.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;That is true, but in this case I still believe that the server will use target port 988 on the client (not sure of the source port), and the client will need to allow new connections on port 988 for the most reliable behavior.  In many cases, it is possible for the client to function properly without allowing &lt;b&gt;any&lt;/b&gt; incoming connections, but as you write there may be rare cases where the server needs to initiate a connection, and without that the client may occasionally be evicted.  For some sites that may be preferable to having an open port in the firewall.  IIRC, there may even be a parameter to disable server-&amp;gt;client connections, but I don&apos;t recall the details.&lt;/p&gt;</comment>
                            <comment id="291718" author="degremoa" created="Thu, 11 Feb 2021 10:27:50 +0000"  >&lt;p&gt;That&apos;s correct. I wanted to warn about this often-unknown use case (I was really surprised when I discovered it) and any potential impact this feature might have on it.&lt;/p&gt;

&lt;p&gt;Agreed that some sites could prefer risking evictions to changing firewall rules. However, this is surprising behavior, and sites tend to reduce evictions as much as possible; having a &quot;normal case&quot; where evictions will happen but are expected, just because of unknown traffic rules, is not the best option. This could cause additional JIRA tickets. But this problem already exists and is independent of this ticket; I just wanted to avoid making it worse. Will the multiple TCP sockets feature only apply to bulk I/O sockets?&lt;/p&gt;</comment>
                            <comment id="291719" author="adilger" created="Thu, 11 Feb 2021 10:46:56 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Will the multiple TCP sockets feature only apply to bulk I/O sockets?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Correct, this is only for bulk sockets.&lt;/p&gt;</comment>
                            <comment id="300503" author="gerrit" created="Wed, 5 May 2021 02:49:52 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/41056/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41056/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: add conns_per_peer parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 71b2476e4ddb95aa42f4a0ea3f23b1826017bfa5&lt;/p&gt;</comment>
                            <comment id="301431" author="adilger" created="Thu, 13 May 2021 02:02:25 +0000"  >&lt;p&gt;There may still be work needed to distribute RPCs from a single client to multiple CPTs on the server, in order to get the best performance for real IO workloads.  &lt;/p&gt;

&lt;p&gt;Otherwise, a client with a single interface (NID) will have all of its RPCs handled by cores in a single CPT, which is not quite the same as having multiple real interfaces on the client.  Discussion is ongoing in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14676&quot; title=&quot;Better hash distribution to different CPTs when LNET router is exist&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14676&quot;&gt;&lt;del&gt;LU-14676&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="308704" author="gerrit" created="Wed, 28 Jul 2021 22:24:10 +0000"  >&lt;p&gt;Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/44417&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/44417&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: set conns_per_peer based on link speed&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ca2f4fed6d85d2e4506958fcc2e1c6c98eb2d020&lt;/p&gt;</comment>
                            <comment id="310505" author="gerrit" created="Wed, 18 Aug 2021 01:58:07 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/41463/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41463/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: allow dynamic setting of conns_per_peer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: a5cbe7883db6d77b82fbd83ad4c662499421d229&lt;/p&gt;</comment>
                            <comment id="310517" author="gerrit" created="Wed, 18 Aug 2021 01:59:51 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/44417/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/44417/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: set conns_per_peer based on link speed&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: c44afcfb72a1c2fd8392bfab3143c3835b146be6&lt;/p&gt;</comment>
                            <comment id="310549" author="pjones" created="Wed, 18 Aug 2021 12:44:11 +0000"  >&lt;p&gt;Looks like everything has landed for 2.15.&lt;/p&gt;</comment>
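For reference, the tunable that landed is the ksocklnd conns_per_peer module parameter (per the patch subjects above). A static configuration might look like the sketch below; the lnetctl option spelling is an assumption based on the "allow dynamic setting of conns_per_peer" patch, so verify it against the lnetctl man page for your Lustre release.

```shell
# Static setting via modprobe, applied when the ksocklnd module loads
# (parameter name per the "socklnd: add conns_per_peer parameter" patch):
options ksocklnd conns_per_peer=4

# Assumed per-NI lnetctl form (verify the exact option name for your release):
lnetctl net add --net tcp --if eth0 --conns-per-peer 4
```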
                            <comment id="334109" author="gerrit" created="Mon, 9 May 2022 00:13:19 +0000"  >&lt;p&gt;&quot;Cyril Bordage &amp;lt;cbordage@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47252&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47252&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: add conns_per_peer parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b370ffba7b778f9b5fee325a8c67228ca2454137&lt;/p&gt;</comment>
                            <comment id="347147" author="gerrit" created="Tue, 20 Sep 2022 03:35:43 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/47252/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47252/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12815&quot; title=&quot;Create multiple TCP sockets per SockLND&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12815&quot;&gt;&lt;del&gt;LU-12815&lt;/del&gt;&lt;/a&gt; socklnd: add conns_per_peer parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: be2c4bb928b5cf6b428d7974e8fd89ea177fa2df&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="62202">LU-14293</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="64088">LU-14676</issuekey>
        </issuelink>
                            </outwardlinks>
                            <inwardlinks description="is related to">
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                        <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                        </customfieldvalues>
                    </customfield>
                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00nfr:</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                        <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>