<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:52:43 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12453] ko2iblnd: problem handling link failures on bonded interfaces</title>
                <link>https://jira.whamcloud.com/browse/LU-12453</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>
&lt;p&gt;We have encountered a problem when running RoCEv2 over bonded interfaces.&lt;br/&gt;
When the bond0 interface is created on top of two slave interfaces belonging to separate HCAs and the primary interface fails
after the RDMA QPs have been created, the LNet connection is not properly re-established over the backup link.&lt;/p&gt;

&lt;p&gt;In that case, the only solution is to re-enable or fix the primary interface, or to restart LNet by reloading the kernel modules.&lt;/p&gt;

&lt;p&gt;The problem has been seen on ES7990 as well as in vanilla Lustre 2.10.*&lt;/p&gt;

&lt;p&gt;Normally, when the bond is created on top of two ports belonging to the same HCA, the mlx driver handles the link failure by moving the QPs. In the case described above, the link failure must be handled in the ko2iblnd driver.&lt;/p&gt;

&lt;p&gt;The following log message, related to this bug, is logged when the problem occurs:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
e0-oss03 kernel: LNetError: 4598:0:(o2iblnd.c:831:kiblnd_create_conn()) cmid HCA(mlx5_0), kib_dev(bond0.881) need failover&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</description>
                <environment>RDMA over Ethernet:&lt;br/&gt;
&amp;nbsp;- Mellanox ConnectX-5 adapters, ES7990&lt;br/&gt;
&amp;nbsp;- o2iblnd(bond0.881)&lt;br/&gt;
&amp;nbsp;- bond0.881: mlx5_0 + mlx5_1 (interfaces from separate HCAs)&lt;br/&gt;
&amp;nbsp;- bond0 mode is active-passive&lt;br/&gt;
</environment>
        <key id="55989">LU-12453</key>
            <summary>ko2iblnd: problem handling link failures on bonded interfaces</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="lflis">Lukasz Flis</reporter>
                        <labels>
                    </labels>
                <created>Wed, 19 Jun 2019 10:26:10 +0000</created>
                <updated>Thu, 3 Oct 2019 03:13:25 +0000</updated>
                                            <version>Lustre 2.10.8</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00ig7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>