<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:13:08 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14827] Allow Lnet peer entries to be updated if peer&apos;s NIDs change</title>
                <link>https://jira.whamcloud.com/browse/LU-14827</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;My knowledge about Lustre is limited so please correct me where necessary.&lt;/p&gt;


&lt;p&gt;Imagine the following situation: You have a Lustre (2.14) file system running and the clients can access Lustre.&lt;/p&gt;

&lt;p&gt;Now you want to resolve some issues with the startup order on the clients. While doing so, you get the order wrong in which the lnet and lustre modules are loaded and configured. In my particular case the lustre module was loaded before the LNet configuration for InfiniBand was done, so the lustre module configured an LNet network on Ethernet, yet there is no Ethernet connection between the clients and the Lustre servers.&lt;/p&gt;

&lt;p&gt;This resulted in two NIs (@tcp and @o2ib) being configured per client, with @tcp as the primary NID. The Lustre servers will happily accept these peer configurations, but Lustre operation gets slower because the servers will try to reach the clients via @tcp first (and vice versa).&lt;/p&gt;

&lt;p&gt;Having spotted that mistake and corrected the order in which LNet is configured and the lustre module is loaded, the clients then get only one NI configured (@o2ib), which naturally is the primary NID. But the Lustre servers do not update the LNet peer entries they have already discovered and keep a primary NID of @tcp for the clients. Thus the servers will still try to connect to the clients using @tcp.&lt;/p&gt;


&lt;p&gt;A fool&apos;s resolution would be to just remove the peer entry on the Lustre servers and immediately add back a correct entry. But this leads to hiccups that affect the whole file system, possibly leading to reboots of the Lustre servers.&lt;/p&gt;


&lt;p&gt;So the solutions to this situation that I can think of are:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;allow lnetctl to remove primary NIDs from peer entries&lt;/li&gt;
	&lt;li&gt;dynamically update a peer entry if a peer reconnects with a different configuration&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;Is one or the other possible or is a primary NID more than just &quot;the first interface for that peer&quot;?&lt;/p&gt;

&lt;p&gt;Is there another way to remove wrong entries in a Lustre server&apos;s peer configuration (other than rebooting)?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;/p&gt;

&lt;p&gt;Uwe&lt;/p&gt;</description>
                <environment></environment>
        <key id="65016">LU-14827</key>
            <summary>Allow Lnet peer entries to be updated if peer&apos;s NIDs change</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="uwe.sauter">Uwe Sauter</reporter>
                        <labels>
                    </labels>
                <created>Wed, 7 Jul 2021 12:11:35 +0000</created>
                <updated>Wed, 7 Jul 2021 14:23:09 +0000</updated>
                            <resolved>Wed, 7 Jul 2021 14:23:09 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="306444" author="pjones" created="Wed, 7 Jul 2021 12:15:40 +0000"  >&lt;p&gt;As none of the cloud offerings that use this project use Lustre 2.14, I am guessing that this issue is intended to be in the LU project and will move it accordingly.&lt;/p&gt;</comment>
                            <comment id="306445" author="uwe.sauter" created="Wed, 7 Jul 2021 12:22:58 +0000"  >&lt;p&gt;Yes, that was my mistake while creating the ticket. Thank you.&lt;/p&gt;</comment>
                            <comment id="306459" author="uwe.sauter" created="Wed, 7 Jul 2021 13:36:44 +0000"  >&lt;p&gt;I must also correct my assumption that I was using 2.14; this was actually 2.12.6 modified by DDN.&lt;/p&gt;</comment>
                            <comment id="306464" author="pjones" created="Wed, 7 Jul 2021 14:23:09 +0000"  >&lt;p&gt;Then please open a ticket through DDN support channels. If the bug affects the community releases then the fix will be upstreamed in due course.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01yon:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>