<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:16:09 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1384] MDS Kernel Panic when trying to mount the lustre file system</title>
                <link>https://jira.whamcloud.com/browse/LU-1384</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;After running mkfs on all the file systems, I was able to mount Lustre and run a simple &apos;dd&apos; to create a few files. Once I mounted it on 12 clients running Lustre 1.8.4 and tried to run the IOR benchmark using 2 nodes (12 cores in total), the file system immediately hung and MDS01 had a kernel panic, as follows:&lt;br/&gt;
Message from syslogd@mds01 at May  8 12:00:59 ...&lt;br/&gt;
 kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) ASSERTION( ma-&amp;gt;ma_lmm_size &amp;gt; 0 ) failed: &lt;/p&gt;

&lt;p&gt;Message from syslogd@mds01 at May  8 12:00:59 ...&lt;br/&gt;
 kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) LBUG&lt;br/&gt;
Write failed: Broken pipe&lt;/p&gt;


&lt;p&gt;The heartbeat tried to take over, but the failover node immediately had a kernel panic too:&lt;/p&gt;


&lt;p&gt;Message from syslogd@mds02 at May  8 12:04:05 ...&lt;br/&gt;
 kernel:LustreError: 3657:0:(mdd_object.c:635:mdd_big_lmm_get()) ASSERTION( ma-&amp;gt;ma_lmm_size &amp;gt; 0 ) failed: &lt;/p&gt;

&lt;p&gt;Message from syslogd@mds02 at May  8 12:04:05 ...&lt;br/&gt;
 kernel:LustreError: 3657:0:(mdd_object.c:635:mdd_big_lmm_get()) LBUG&lt;br/&gt;
Write failed: Broken pipe&lt;/p&gt;

&lt;p&gt;I created the file system using the attached script weisshorn_mkfs.sh.&lt;/p&gt;

&lt;p&gt;The SSD LUN is built on an LSI SSD controller with RAID10.&lt;/p&gt;

&lt;p&gt;Are there any suggestions I can try to fix the problem?&lt;br/&gt;
I have also attached /var/log/messages, which contains the kernel messages.&lt;/p&gt;</description>
                <environment>Linux  2.6.32-220.7.1.el6_lustre.g9c8f747.x86_64 #1 SMP Tue Apr 24 14:27:35 PDT 2012 x86_64 x86_64 x86_64 GNU/Linux&lt;br/&gt;
14 Servers Total&lt;br/&gt;
1 MDS + 1 failover (300 GB), AMD Opteron(tm) Processor 6128&lt;br/&gt;
12 OSS (with failover for each pair), Sandy Bridge&lt;br/&gt;
6 OSTs per OSS (7 TB)</environment>
        <key id="14322">LU-1384</key>
            <summary>MDS Kernel Panic when trying to mount the lustre file system</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="fverzell">Fabio Verzelloni</reporter>
                        <labels>
                    </labels>
                <created>Tue, 8 May 2012 08:05:06 +0000</created>
                <updated>Fri, 1 Jun 2012 14:45:24 +0000</updated>
                            <resolved>Fri, 1 Jun 2012 14:45:24 +0000</resolved>
                                    <version>Lustre 2.2.0</version>
                    <version>Lustre 2.3.0</version>
                                    <fixVersion>Lustre 2.3.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="38298" author="fverzell" created="Tue, 8 May 2012 08:40:25 +0000"  >&lt;p&gt;The attached kernel_panic.png shows the moment of the kernel panic, which occurred as soon as I mounted the Lustre FS on the 1.8.4 client.&lt;/p&gt;</comment>
                            <comment id="38302" author="fverzell" created="Tue, 8 May 2012 09:36:23 +0000"  >&lt;p&gt;The Lustre packages installed on the clients that are crashing the MDS are:&lt;/p&gt;

&lt;p&gt;lustre-modules-1.8.4-2.6.32.36_0.5_default_201202291115&lt;br/&gt;
lustre-client-source-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default&lt;br/&gt;
lustre-1.8.4-2.6.32.36_0.5_default_201202291115&lt;/p&gt;

&lt;p&gt;cray-liblustreconfig0-1.0-1.0400.30000.6.18.gem&lt;br/&gt;
cray-lustre-utils-2.3-1.0400.29861.8.1.gem&lt;br/&gt;
cray-lustre-cray_gem_s-1.8.4_2.6.32.45_0.3.2_1.0400.6221.1.1-1.0400.30252.1.29&lt;br/&gt;
cray-lustre-cray_gem_s-1.8.4_2.6.32.45_0.3.2_1.0400.6221.1.1-1.0400.31443.0.0&lt;/p&gt;
</comment>
                            <comment id="38569" author="pjones" created="Thu, 10 May 2012 16:55:09 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="38571" author="adilger" created="Thu, 10 May 2012 17:01:07 +0000"  >&lt;p&gt;As a starting point, the client should never be able to crash the MDS.  The MDS code needs to be updated to validate the incoming data and return an error if it is wrong.&lt;/p&gt;

&lt;p&gt;A separate case is that the 1.8.4 client will not work correctly with a 2.x server without several patches being applied.&lt;/p&gt;</comment>
                            <comment id="39363" author="pjones" created="Thu, 24 May 2012 23:18:06 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="39373" author="bobijam" created="Fri, 25 May 2012 01:24:37 +0000"  >&lt;p&gt;Patch tracking at &lt;a href=&quot;http://review.whamcloud.com/2905&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/2905&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1384&quot; title=&quot;MDS Kernel Panic when trying to mount the lustre file system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1384&quot;&gt;&lt;del&gt;LU-1384&lt;/del&gt;&lt;/a&gt; mdd: validate incoming param to avoid crashing&lt;/p&gt;

&lt;p&gt;The MDS crashes when an unsupported 1.8.x client connects to it;&lt;br/&gt;
the crash point is:&lt;/p&gt;

&lt;p&gt;kernel:LustreError: 3657:0:(mdd_object.c:635:mdd_big_lmm_get())&lt;br/&gt;
                        ASSERTION( ma-&amp;gt;ma_lmm_size &amp;gt; 0 ) failed&lt;/p&gt;

&lt;p&gt;We need to validate the incoming @ma so that an old client cannot crash the MDS.&lt;/p&gt;</comment>
                            <comment id="39839" author="pjones" created="Fri, 1 Jun 2012 14:45:24 +0000"  >&lt;p&gt;Landed for 2.3&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11310" name="error5.png" size="10841" author="fverzell" created="Tue, 8 May 2012 08:50:38 +0000"/>
                            <attachment id="11309" name="kernel_panic.png" size="11176" author="fverzell" created="Tue, 8 May 2012 08:40:25 +0000"/>
                            <attachment id="11316" name="kernel_panic2.png" size="17798" author="fverzell" created="Tue, 8 May 2012 09:05:14 +0000"/>
                            <attachment id="11317" name="kernel_panic3.png" size="29166" author="fverzell" created="Tue, 8 May 2012 09:05:14 +0000"/>
                            <attachment id="11311" name="lfs_check_servers.log" size="5877" author="fverzell" created="Tue, 8 May 2012 08:57:08 +0000"/>
                            <attachment id="11308" name="messages1" size="527728" author="fverzell" created="Tue, 8 May 2012 08:05:06 +0000"/>
                            <attachment id="11307" name="weisshorn_mkfs.sh" size="12076" author="fverzell" created="Tue, 8 May 2012 08:05:06 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv6lz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4605</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>