<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:53:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5715] Reboot hangs due to lustre modules </title>
                <link>https://jira.whamcloud.com/browse/LU-5715</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;Reboot hangs with Lustre 2.5.2 on RHEL 6.5. If I unload the Lustre modules using lustre_rmmod before rebooting, it works. I would appreciate your help here.&lt;/p&gt;

&lt;p&gt;My lustre.conf is as follows.&lt;/p&gt;

&lt;p&gt;options lnet networks=&quot;o2ib0(ib0)&quot;&lt;/p&gt;</description>
                <environment>RHEL 6.5, MLNX_OFED_LINUX-2.2-1.0.1, CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz; Memory: 64GB; Kernel: 2.6.32-431.17.1.el6_lustre.x86_64; Lustre: 2.5.2-2.6.32_431.17.1</environment>
        <key id="26898">LU-5715</key>
            <summary>Reboot hangs due to lustre modules </summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="prabhu.chakra">Chakravarthy N</reporter>
                        <labels>
                    </labels>
                <created>Tue, 7 Oct 2014 21:41:45 +0000</created>
                <updated>Sat, 29 Jan 2022 08:30:44 +0000</updated>
                            <resolved>Sat, 29 Jan 2022 08:30:44 +0000</resolved>
                                    <version>Lustre 2.5.2</version>
                                                        <due></due>
                            <votes>1</votes>
                                    <watches>9</watches>
                                                                            <comments>
                            <comment id="96136" author="green" created="Fri, 10 Oct 2014 17:16:45 +0000"  >&lt;p&gt;Are there any messages in kernel logs?&lt;/p&gt;</comment>
                            <comment id="96137" author="green" created="Fri, 10 Oct 2014 17:17:22 +0000"  >&lt;p&gt;Also, is this something you started to experience recently, and was everything fine with older versions?&lt;/p&gt;</comment>
                            <comment id="96140" author="prabhu.chakra" created="Fri, 10 Oct 2014 17:58:34 +0000"  >&lt;p&gt;No messages were found in the syslog, and everything was fine with older versions. I did not face this issue with RHEL 6.4 + Lustre 2.5 or RHEL 6.4 + Lustre 2.4.&lt;/p&gt;</comment>
                            <comment id="118231" author="lana.deere@gmail.com" created="Thu, 11 Jun 2015 17:39:43 +0000"  >&lt;p&gt;I have seen this symptom in CentOS 6.3 with Lustre 2.1.4.  (I don&apos;t have a newer configuration installed to try.)  Lustre is set up using o2ib.  The clients and all Lustre nodes have IPoIB enabled plus an Ethernet connection.  The clients are generally busy full-time, which is to say that when a client shutdown is initiated it is likely that at least some processes have a Lustre directory or file opened (current working directory of a process, if nothing else).&lt;/p&gt;

&lt;p&gt;When the client hangs, there is no overt explanation - nothing in the syslog, etc.  However, using IPMI to watch the client&apos;s virtual console showed that &quot;/etc/init.d/rdma stop&quot; was where the shutdown would hang.  It would print that it was &quot;Unloading OpenIB kernel modules&quot; but it could not succeed because one (or more? I forget) of the OpenIB modules was in use.  It would hang at that point.&lt;/p&gt;

&lt;p&gt;As a hack, changing /etc/init.d/rdma so that it calls lustre_rmmod usually prevents the hang; specifically, the original line &quot;stop()&quot; becomes&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;/etc/init.d/rdma hack&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;stop()
{
[ -x /usr/sbin/lustre_rmmod ] &amp;amp;&amp;amp; /usr/sbin/lustre_rmmod;
real_stop
}
real_stop()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This may or may not be related, but since it may be the symmetric issue at startup I&apos;ll mention it.  On these clients, mounting the filesystem inside /etc/fstab using&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;/etc/fstab&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&amp;lt;IPoIB address&amp;gt;@o2ib0:/lustre /mnt/lustre lustre defaults,_netdev 0 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;also generally fails: the system thinks the conditions for &quot;_netdev&quot; have been satisfied before ib0 is active, so the mount fails.  The mount needs to be delayed one way or another.  (Do it explicitly later in the boot, or modify /etc/init.d/netfs so the check for _netdev waits for ib0, etc.)  &lt;/p&gt;</comment>
                            <comment id="134340" author="kmoran" created="Tue, 24 Nov 2015 02:03:00 +0000"  >&lt;p&gt;I can confirm this is still an issue using RedHat kernel with Lustre client:&lt;/p&gt;

&lt;p&gt;Linux 2.6.32-573.7.1.el6.x86_64 #1 SMP Thu Sep 10 13:42:16 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux&lt;/p&gt;

&lt;p&gt;Using standard RedHat Infiniband Support package.&lt;/p&gt;

&lt;p&gt;It hangs at Unloading IB Modules even though /usr/sbin/lustre_rmmod is called beforehand.  &lt;/p&gt;

</comment>
                            <comment id="165832" author="wbaudler" created="Tue, 13 Sep 2016 14:49:50 +0000"  >&lt;p&gt;I can confirm the same issue here with RHEL 6.8 and Lustre 2.5.3, also using the RedHat Infiniband packages.&lt;/p&gt;</comment>
                            <comment id="229358" author="utopiabound" created="Fri, 8 Jun 2018 17:53:36 +0000"  >&lt;p&gt;This issue is resolved with patches for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8293&quot; title=&quot;lnet init.d script missing insserv header&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8293&quot;&gt;&lt;del&gt;LU-8293&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="37632">LU-8293</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is duplicated by">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10040" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic</customfieldname>
                        <customfieldvalues>
                                        <label>server</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwy1r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16029</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>