<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:08:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14338] prevent duplicate client mounts and server connections</title>
                <link>https://jira.whamcloud.com/browse/LU-14338</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The Lustre client mount process should check whether another mount has already been initiated, but not yet completed, for the specified filesystem name and mountpoint.  If so, it should exit with a clear error message.  Printing the duration of the prior attempt, in seconds, would also help system administrators debug mount problems.&lt;/p&gt;

&lt;p&gt;A simple check of /etc/mtab leaves too large a window for a race condition, but its error message is helpful and can serve as an example:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
# mount -t lustre 192.168.1.101@o2ib:192.168.1.102@o2ib:/lfstmp /mnt/lfstmp
mount.lustre: according to /etc/mtab 192.168.1.102@o2ib:192.168.1.101@o2ib:/lfstmp is already mounted on /mnt/lfstmp
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So the new error message could look something like:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
mount.lustre: mount of $SERVER:/$FS on $MNTPNT already in progress for $SEC seconds
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
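&lt;p&gt;For illustration only (this is not what mount.lustre currently does): a race-free variant of such a check could take an exclusive advisory lock keyed on the mountpoint before proceeding, so that check-and-claim is a single atomic step rather than a read of /etc/mtab followed by a mount.  A hypothetical shell sketch, where the lock path and exit code are invented for the example:&lt;/p&gt;

```shell
# Hypothetical sketch (not current mount.lustre behavior): an atomic
# "already mounting" check using an advisory flock(1) lock keyed on the
# mountpoint, avoiding the /etc/mtab check-then-mount race window.
try_mount_lock() {
    mntpnt="$1"
    lock="/tmp/mount.lustre.$(echo "$mntpnt" | tr '/' '_').lock"
    # Open the lock file on fd 9 (append mode, so the mtime from the first
    # attempt is preserved) and try a non-blocking exclusive lock; failure
    # means another mount attempt already holds it.
    exec 9>>"$lock"
    if ! flock -n 9; then
        # Report roughly how long the prior attempt has been running,
        # based on the lock file's modification time.
        sec=$(( $(date +%s) - $(stat -c %Y "$lock") ))
        echo "mount.lustre: a mount on $mntpnt is already in progress for $sec seconds" >&2
        return 114   # EALREADY
    fi
    return 0
}

try_mount_lock /mnt/lfstmp && echo "lock acquired, safe to proceed with mount"
# A second concurrent attempt (simulated here in a subshell, which opens
# its own fd 9) fails immediately instead of racing:
( try_mount_lock /mnt/lfstmp ) || echo "duplicate mount attempt rejected"
```

Because the lock is advisory and is released automatically when the holding process exits, a crashed or killed first attempt cannot permanently block later mounts.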

&lt;p&gt;The problem can be reproduced fairly consistently on a client with:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
# mount -t lustre 192.168.1.102@o2ib:/lfstmp /mnt/lfstmp &amp;amp; mount -t lustre 192.168.1.102@o2ib:/lfstmp /mnt/lfstmp
# mount -t lustre
192.168.1.102@o2ib:/lfstmp on /mnt/lfstmp type lustre (rw,flock,lazystatfs)
192.168.1.102@o2ib:/lfstmp on /mnt/lfstmp type lustre (rw,flock,lazystatfs)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I can think of at least three scenarios that exacerbate the problem by extending the window in which this race can occur:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Lustre modules are not yet loaded, so the first mount takes a second or two longer as it triggers the module load and possibly even LNet initialization.&lt;/li&gt;
	&lt;li&gt;A network problem causes the first mount to hang, an admin or a self-healing script tries again, and both mounts then proceed when network connectivity is restored.  The &quot;network problem&quot; can be as simple as a client IB port that is not yet initialized at boot.&lt;/li&gt;
	&lt;li&gt;In a failover configuration, the primary MGS NID is not available, so the first mount hangs until it fails over to the secondary MGS NID after a timeout.  This last case gives the most reliable reproducer, with a longer window for the race condition:
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
# mount -t lustre 192.168.1.101@o2ib:192.168.1.102@o2ib:/lfstmp /mnt/lfstmp &amp;amp; sleep 20; mount -t lustre 192.168.1.102@o2ib:192.168.1.101@o2ib:/lfstmp /mnt/lfstmp; wait; mount -t lustre
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In some (all?) cases, the duplicate mounts will be accompanied by multiple connections to each storage target, as shown by &apos;lctl dl&apos;.&lt;/p&gt;

&lt;p&gt;There are many reasons why mount might be run twice in the real world.  Although a clean client configuration and good sysadmin practices can avoid most of them, Lustre itself should handle the situation gracefully with a simple error message and prevent unnecessary connections to the servers.&lt;/p&gt;</description>
                <environment>CentOS-7.7, lustre-2.12.5_ddn14</environment>
        <key id="62366">LU-14338</key>
            <summary>prevent duplicate client mounts and server connections</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="dauchy">Nathan Dauchy</reporter>
                        <labels>
                    </labels>
                <created>Mon, 18 Jan 2021 12:25:23 +0000</created>
                <updated>Mon, 18 Jan 2021 12:25:23 +0000</updated>
                                            <version>Lustre 2.12.5</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01jgv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>