<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:16:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15169] Regression in &quot;024f9303bc LU-14668 lnet: Lock primary NID logic&quot; breaks client mounts</title>
                <link>https://jira.whamcloud.com/browse/LU-15169</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This commit has caused a serious regression on master where clients are unable to mount a filesystem under certain LNet configurations (namely routed ones):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;commit 024f9303bc6f32a3113357c864765c4f9c93ed03
Author: Amir Shehata &amp;lt;ashehata@whamcloud.com&amp;gt;
Date:   Wed May 5 11:35:06 2021 -0700

    LU-14668 lnet: Lock primary NID logic
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I believe this should be reverted and the patches for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14668&quot; title=&quot;LNet: do discovery in the background&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14668&quot;&gt;&lt;del&gt;LU-14668&lt;/del&gt;&lt;/a&gt; (which is still open) should be re-worked.&lt;/p&gt;

&lt;p&gt;Some additional detail on the bug:&lt;/p&gt;

&lt;p&gt;The aforementioned commit will break any routed configuration where the clients mount the filesystem using non-primary NIDs. For example:&lt;/p&gt;

&lt;p&gt;MGS&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;10.16.100.52@o2ib
10.16.100.53@o2ib
10.16.100.52@o2ib10
10.16.100.53@o2ib10
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clients have routes to the o2ib10 network, so they mount using something like:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mount -t lustre 10.16.100.52@o2ib10,10.16.100.53@o2ib10:/lustre ...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;LNetPrimaryNID() on the client returns 10.16.100.52@o2ib10 as the primary NID (because of &lt;a href=&quot;https://review.whamcloud.com/43563/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43563/&lt;/a&gt;), so the client sets up its ptlrpc connection using this NID. However, incoming messages from the MGS carry the actual primary NID, 10.16.100.52@o2ib. The two NIDs do not match, so the incoming messages are dropped. This prevents the client from mounting.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;walleye-p5:~ # !grep
grep lustre /etc/fstab
10.16.100.52@o2ib10,10.16.100.53@o2ib10:10.16.100.54@o2ib11,10.16.100.55@o2ib11:/kjcf05 /lus/kjcf05 lustre rw,flock,lazystatfs,noauto 0 0
walleye-p5:~ # mount /lus/kjcf05
mount.lustre: mount 10.16.100.52@o2ib10,10.16.100.53@o2ib10:10.16.100.54@o2ib11,10.16.100.55@o2ib11:/kjcf05 at /lus/kjcf05 failed: Input/output error
Is the MGS running?
walleye-p5:~ #
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If I revert &lt;a href=&quot;https://review.whamcloud.com/43563&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43563&lt;/a&gt; then I&apos;m able to mount:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;walleye-p5:~ # mount /lus/kjcf05
walleye-p5:~ # lfs check servers
kjcf05-OST0000-osc-ffff8888361cd000 active.
kjcf05-OST0001-osc-ffff8888361cd000 active.
kjcf05-OST0002-osc-ffff8888361cd000 active.
kjcf05-OST0003-osc-ffff8888361cd000 active.
kjcf05-MDT0000-mdc-ffff8888361cd000 active.
kjcf05-MDT0001-mdc-ffff8888361cd000 active.
MGC10.16.100.52@o2ib10 active.
walleye-p5:~ #
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
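
&lt;p&gt;As a rough illustration of the mismatch described above, here is a minimal Python sketch. It is a toy model, not Lustre or LNet code: the function names (locked_primary_nid, discovered_primary_nid, message_accepted) and the MGS_NIDS list are hypothetical, and the assumption that the reverted behaviour resolves the mount NID to the peer&apos;s actual primary NID is mine, inferred from the mount succeeding after the revert.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
# Toy model of the NID mismatch; all names here are hypothetical, not LNet APIs.

# NIDs of one MGS node. The first entry stands for its actual primary NID.
MGS_NIDS = [
    "10.16.100.52@o2ib",
    "10.16.100.52@o2ib10",
]


def locked_primary_nid(mount_nid):
    """Behaviour described for change 43563: the NID used at mount time
    is kept as the peer's primary NID."""
    return mount_nid


def discovered_primary_nid(mount_nid, peer_nids):
    """Assumed behaviour with the change reverted: a NID that belongs to
    the peer resolves to the peer's actual primary NID."""
    if mount_nid in peer_nids:
        return peer_nids[0]
    return mount_nid


def message_accepted(connection_nid, incoming_source_nid):
    """An incoming message is kept only if its source NID matches the NID
    the connection was set up with; otherwise it is dropped."""
    return connection_nid == incoming_source_nid


if __name__ == "__main__":
    mount_nid = "10.16.100.52@o2ib10"   # routed clients mount via o2ib10
    source_nid = MGS_NIDS[0]            # MGS replies carry its primary NID

    print("locked:", message_accepted(locked_primary_nid(mount_nid), source_nid))
    print("discovered:", message_accepted(
        discovered_primary_nid(mount_nid, MGS_NIDS), source_nid))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this model the locked case compares 10.16.100.52@o2ib10 against 10.16.100.52@o2ib and rejects the message, which corresponds to the dropped messages and failed mount above. The discovered case compares the primary NID against itself and accepts it.&lt;/p&gt;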

&lt;p&gt;I think the regression isn&apos;t strictly limited to routed configurations; it affects any client mount where the client&apos;s initial connection attempt goes to a non-primary NID. This would be typical for routed clients. It is less likely with direct connections, but it is possible there too (for example, with multi-homed servers).&lt;/p&gt;</description>
                <environment></environment>
        <key id="66857">LU-15169</key>
            <summary>Regression in &quot;024f9303bc LU-14668 lnet: Lock primary NID logic&quot; breaks client mounts</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="hornc">Chris Horn</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                    </labels>
                <created>Wed, 27 Oct 2021 17:16:50 +0000</created>
                <updated>Wed, 22 Dec 2021 14:50:04 +0000</updated>
                            <resolved>Wed, 22 Dec 2021 14:50:04 +0000</resolved>
                                    <version>Lustre 2.15.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="316723" author="gerrit" created="Wed, 27 Oct 2021 17:24:23 +0000"  >&lt;p&gt;&quot;Chris Horn &amp;lt;chris.horn@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45386&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45386&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15169&quot; title=&quot;Regression in &amp;quot;024f9303bc LU-14668 lnet: Lock primary NID logic&amp;quot; breaks client mounts&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15169&quot;&gt;&lt;del&gt;LU-15169&lt;/del&gt;&lt;/a&gt; Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14668&quot; title=&quot;LNet: do discovery in the background&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14668&quot;&gt;&lt;del&gt;LU-14668&lt;/del&gt;&lt;/a&gt; lnet: Lock primary NID logic&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0ce961a1eaba54a90a87d751e7620d18c58b3e1c&lt;/p&gt;</comment>
                            <comment id="320704" author="gerrit" created="Mon, 13 Dec 2021 03:51:45 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/45386/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45386/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15169&quot; title=&quot;Regression in &amp;quot;024f9303bc LU-14668 lnet: Lock primary NID logic&amp;quot; breaks client mounts&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15169&quot;&gt;&lt;del&gt;LU-15169&lt;/del&gt;&lt;/a&gt; Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14668&quot; title=&quot;LNet: do discovery in the background&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14668&quot;&gt;&lt;del&gt;LU-14668&lt;/del&gt;&lt;/a&gt; lnet: Lock primary NID logic&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: f2f168e3daf12850f40f991d74e04eb283c2376f&lt;/p&gt;</comment>
                            <comment id="321362" author="pjones" created="Wed, 22 Dec 2021 14:50:04 +0000"  >&lt;p&gt;IIUC this issue will have been resolved by the revert of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14668&quot; title=&quot;LNet: do discovery in the background&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14668&quot;&gt;&lt;del&gt;LU-14668&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="64039">LU-14668</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i028in:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>