<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:05:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-270] LDisk-fs warning (device md30): ldisk_multi_mount_protect: fsck is running on filesystem</title>
                <link>https://jira.whamcloud.com/browse/LU-270</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>
&lt;p&gt;OST 10 /dev/md30 resident on OSS3&lt;br/&gt;
From /var/log/messages&lt;br/&gt;
LDisk-fs warning (device md30): ldisk_multi_mount_protect: fsck is running on filesystem&lt;br/&gt;
LDisk-fs warning (device md30): ldisk_multi_mount_protect: MMP failure info: &amp;lt;time in unix seconds&amp;gt;, last update node: OSS3, last update device /dev/md30&lt;/p&gt;

&lt;p&gt;This is a scenario that keeps sending the customer in circles. They know for certain that an fsck is not running. Since they know that, they can try to turn the MMP bit off via the following commands:&lt;/p&gt;

&lt;p&gt;To manually disable MMP, run:&lt;br/&gt;
tune2fs -O ^mmp &amp;lt;device&amp;gt;&lt;br/&gt;
To manually enable MMP, run:&lt;br/&gt;
tune2fs -O mmp &amp;lt;device&amp;gt;&lt;/p&gt;

&lt;p&gt;These commands fail, saying that a valid superblock does not exist, but they can see their valid superblock (with mmp set) by running the following command:&lt;/p&gt;

&lt;p&gt;tune2fs -l /dev/md30&lt;/p&gt;

&lt;p&gt;It is their understanding that a fix for this issue was released with a later version of Lustre, but aside from that, is there a way to do this?&lt;/p&gt;

&lt;p&gt;Customer contact is tyler.s.wiegers@lmco.com&lt;/p&gt;
</description>
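For context on the tune2fs commands quoted in the description: MMP is recorded as a bit in the superblock's incompatible-feature mask, which is why a damaged superblock makes the flag impossible to toggle. A minimal Python sketch (not part of the ticket; offsets taken from the published ext4 on-disk layout, which ldiskfs shares) showing where that bit lives in a device image:

```python
# Check whether an ext4/ldiskfs image has the MMP incompat feature set.
# Offsets follow the ext4 on-disk superblock layout.
import struct

SB_OFFSET = 1024        # primary superblock starts 1024 bytes into the device
MAGIC_OFFSET = 0x38     # s_magic (little-endian u16), must be 0xEF53
INCOMPAT_OFFSET = 0x60  # s_feature_incompat (little-endian u32)
EXT4_MAGIC = 0xEF53
INCOMPAT_MMP = 0x0100   # EXT4_FEATURE_INCOMPAT_MMP

def has_mmp(image):
    magic, = struct.unpack_from("<H", image, SB_OFFSET + MAGIC_OFFSET)
    if magic != EXT4_MAGIC:
        # the same condition that makes tune2fs report a bad superblock
        raise ValueError("no valid superblock at offset 1024")
    incompat, = struct.unpack_from("<I", image, SB_OFFSET + INCOMPAT_OFFSET)
    return bool(incompat & INCOMPAT_MMP)

# Demonstrate on a synthetic 4 KiB image with the magic and MMP bit set:
img = bytearray(4096)
struct.pack_into("<H", img, SB_OFFSET + MAGIC_OFFSET, EXT4_MAGIC)
struct.pack_into("<I", img, SB_OFFSET + INCOMPAT_OFFSET, INCOMPAT_MMP)
print(has_mmp(img))  # True
```

On a real OST the equivalent check is reading the feature list from "dumpe2fs -h &lt;device&gt;" and looking for mmp.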
                <environment>RHEL 5.5 and Lustre 1.8.0.1 on J4400&amp;#39;s</environment>
        <key id="10698">LU-270</key>
            <summary>LDisk-fs warning (device md30): ldisk_multi_mount_protect: fsck is running on filesystem</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="adilger">Andreas Dilger</assignee>
                                    <reporter username="dferber">Dan Ferber</reporter>
                        <labels>
                    </labels>
                <created>Tue, 3 May 2011 11:31:53 +0000</created>
                <updated>Wed, 26 Oct 2011 19:54:22 +0000</updated>
                            <resolved>Mon, 9 May 2011 07:48:21 +0000</resolved>
                                    <version>Lustre 1.8.6</version>
                                    <fixVersion>Lustre 1.8.6</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="13585" author="adilger" created="Tue, 3 May 2011 12:44:54 +0000"  >&lt;p&gt;What version of e2fsprogs is being used here?&lt;/p&gt;</comment>
                            <comment id="13587" author="tyler.s.wiegers@lmco.com" created="Tue, 3 May 2011 12:57:42 +0000"  >&lt;ol&gt;
	&lt;li&gt;rpm -qa |grep e2fsprogs&lt;br/&gt;
e2fsprogs-libs-1.39.20.el5&lt;br/&gt;
e2fsprogs-1.40.11.sun1-0redhat&lt;br/&gt;
e2fsprogs-1.39-20.el5&lt;/li&gt;
&lt;/ol&gt;
</comment>
                            <comment id="13588" author="tyler.s.wiegers@lmco.com" created="Tue, 3 May 2011 13:07:53 +0000"  >&lt;p&gt;The specific error output that we see while mounting this OST is the following:&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;mount -t lustre /dev/md30 /mnt/lustre_ost10&lt;br/&gt;
mount.lustre: mount /dev/md30 at /mnt/lustre_ost10 failed: Invalid argument&lt;br/&gt;
This may have multiple causes.&lt;br/&gt;
Are the mount options correct?&lt;br/&gt;
Check the syslog for more info.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;From messages:&lt;/p&gt;

&lt;p&gt;LDISKFS-fs warning (device md30): ldiskfs_multi_mount_protect: fsck is running on the filesystem&lt;br/&gt;
LDISKFS-fs warning (device md30): ldiskfs_multi_mount_protect: MMP failure info: last update time: 1304099783, last update node: oss3, last update device: /dev/md30&lt;br/&gt;
LustreError: 14496:0:(obd_mount.c:1278:server_kernel_mount()) premount /dev/md30:0x0 ldiskfs failed: -22, ldiskfs2 failed: -19. Is the ldiskfs module available?&lt;br/&gt;
LustreError: 14496:0:(obd_mount.c:1278:server_kernel_mount()) Skipped 3 previous similar messages&lt;br/&gt;
LustreError: 14496:0:(obd_mount.c:1590:server_fill_super()) Unable to mount device /dev/md30: -22&lt;br/&gt;
LustreError: 14496:0:(obd_mount.c:1993:lustre_fill_super()) Unable to mount (-22)&lt;/p&gt;


&lt;p&gt;We believe that this OST may have been put in this state because we attempted to run e2fsck (messages output recommended running e2fsck against the ost).  The fsck crashed out when we tried running it which we think caused the OST to enter this state.&lt;/p&gt;

&lt;p&gt;We tried turning off the MMP feature in order to mount the OST; however, when attempting to turn off MMP we got the same &quot;fsck is running&quot; error.  We understand that there is a command in a newer version of the e2fsprogs which would clear the MMP flag which &lt;b&gt;might&lt;/b&gt; help, but we don&apos;t have the luxury of blindly updating our systems hoping that it may help.&lt;/p&gt;

&lt;p&gt;We appreciate your support!&lt;/p&gt;</comment>
                            <comment id="13589" author="dferber" created="Tue, 3 May 2011 13:14:40 +0000"  >&lt;p&gt;The customer currently is not undertaking any rebuild activity other than last night&#8217;s CAM and firmware upgrades for the HW, hoping to first have a recommendation on the way forward from Whamcloud from a Lustre perspective. It has definitely been discussed internally on whether they should run with 1.8.0.1 with any identified/recommended patches, or upgrade completely to 1.8.5, or something else.  What are Whamcloud&apos;s recommendations there?&lt;/p&gt;

&lt;p&gt;From the customer&apos;s perspective, the HW has come back clean (CAM and firmware upgrades succeeded), so now they need some help in looking at their configuration, implementation, or Lustre itself.&lt;/p&gt;</comment>
                            <comment id="13593" author="johann" created="Tue, 3 May 2011 13:31:46 +0000"  >&lt;p&gt;&amp;gt; We understand that there is a command in a newer version of the e2fsprogs&lt;br/&gt;
&amp;gt; which would clear the MMP flag which might help&lt;/p&gt;

&lt;p&gt;Right, that&apos;s &quot;tune2fs -f -E clear-mmp $dev&quot;. However 1.40.11-sun1 does not support this option it seems:&lt;br/&gt;
&lt;a href=&quot;http://lists.lustre.org/pipermail/lustre-discuss/2010-August/013818.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://lists.lustre.org/pipermail/lustre-discuss/2010-August/013818.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&amp;gt; but we don&apos;t have the luxury of blindly updating our systems hoping that it may help.&lt;/p&gt;

&lt;p&gt;I really think you should upgrade e2fsprogs since many MMP bugs have been fixed since.&lt;/p&gt;</comment>
                            <comment id="13600" author="tyler.s.wiegers@lmco.com" created="Tue, 3 May 2011 13:50:33 +0000"  >&lt;p&gt;Is there any way to manually recover an OST when the MMP flag is set other than using the tune2fs command?&lt;/p&gt;</comment>
                            <comment id="13606" author="cliffw" created="Tue, 3 May 2011 14:07:59 +0000"  >&lt;p&gt;The tune2fs command is the only way to reset that flag. E2fsprogs is very safe to upgrade; there&lt;br/&gt;
is always complete backward compatibility. This is not a &apos;blind upgrade which might help&apos; - it&apos;s a necessary&lt;br/&gt;
upgrade of the utility that exists to fix exactly your issues. This relates to both problems&lt;br/&gt;
you have reported.&lt;/p&gt;</comment>
                            <comment id="13622" author="tyler.s.wiegers@lmco.com" created="Tue, 3 May 2011 15:51:07 +0000"  >&lt;p&gt;We&apos;re doing an emergency review board to approve installation of this package.  We will have that installed tonight.&lt;/p&gt;


&lt;p&gt;Pending that, we started having a third issue mounting an OST this morning (after updating disk firmware and CAM software).  The error logs are below, if this needs a new bug report then that&apos;s fine, otherwise any comments would be appreciated.&lt;/p&gt;


&lt;p&gt;oss4# mount -t lustre /dev/md11 /mnt/lustre_ost03&lt;br/&gt;
mount.lustre: mount /dev/md11 at /mnt/lustre_ost03 failed: Invalid argument&lt;br/&gt;
This may have multiple causes.&lt;br/&gt;
Are the mount options correct?&lt;br/&gt;
Check the syslog for more info. &lt;/p&gt;

&lt;p&gt;From messages:&lt;/p&gt;

&lt;p&gt;Kjournald starting.  Commit interval 5 seconds&lt;br/&gt;
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended&lt;br/&gt;
LDISKFS FS on md11, external journal on md13&lt;br/&gt;
LDISKFS-fs: mounted filesystem with ordered data mode.&lt;br/&gt;
LustreError: 25721:0:(obd_mount.c:272:ldd_parse()) disk data size does not match: see 0 expect 12288&lt;br/&gt;
LustreError: 25721:0:(obd_mount.c:1292:server_kernel_mount()) premount parse options failed: rc = -22&lt;br/&gt;
LustreError: 25721:0:(obd_mount.c:1590:server_fill_super()) Unable to mount device -22&lt;br/&gt;
LustreError: 25721:0:(obd_mount.c:1993:lustre_fill_super()) Unable to mount (-22)&lt;/p&gt;


&lt;p&gt;We&apos;ve tried running e2fsck with no success, e2fsck doesn&apos;t report any errors.&lt;/p&gt;</comment>
                            <comment id="13625" author="tyler.s.wiegers@lmco.com" created="Tue, 3 May 2011 17:03:29 +0000"  >&lt;p&gt;We upgraded the e2fsprogs package, ran tune2fs with the clear-mmp option, ran e2fsck on that device, and were able to mount the OST.  Good news there.&lt;/p&gt;

&lt;p&gt;For the previous comment, we are running an e2fsck and checking out what new tune2fs options there are. I&apos;ll post back when we have some new information, but indications at the moment are that ost03 still won&apos;t mount.&lt;/p&gt;</comment>
                            <comment id="13627" author="cliffw" created="Tue, 3 May 2011 19:14:56 +0000"  >&lt;p&gt;It would be best to open up a new bug; it is not good that you are having all these errors after your firmware upgrade.&lt;br/&gt;
It would be a good idea to run fsck -fn on all your disks, see if you have any other issues. &lt;/p&gt;</comment>
                            <comment id="13628" author="samb" created="Tue, 3 May 2011 21:02:14 +0000"  >&lt;p&gt;Regarding the CAMs and drive upgrades, we have seen corrupted OSTs before on the Riverwalks (J4400&apos;s) when disk firmware was upgraded without both Lustre and the md software RAID being shut down cleanly first. Is there any chance that this particular OST10 was not cleanly shut down? We saw many cases of software RAID corruption on the J4400&apos;s a couple of years ago, which was about the time early versions of 1.8 started to be used. There were several software RAID corruption bugs that have since been fixed. Also, we have fixed many problems since the early 1.8 releases in Lustre, so we would encourage an upgrade to 1.8.5 at your earliest convenience.&lt;/p&gt;

&lt;p&gt;If both Lustre and the MD device were shut down cleanly, then there should have been no problems like this. So, in that case, this would likely be a new bug that potentially still exists in the latest releases of Lustre.&lt;/p&gt;
</comment>
                            <comment id="13630" author="adilger" created="Tue, 3 May 2011 22:03:56 +0000"  >&lt;p&gt;&amp;gt; LustreError: 25721:0:(obd_mount.c:272:ldd_parse()) disk data size does not match: see 0 expect 12288&lt;/p&gt;

&lt;p&gt;This indicates that the CONFIGS/mountdata file is also corrupted (zero length file).  It is possible to reconstruct this file by copying it from another OST and (unfortunately) binary editing the file.  There are two fields that are unique to each OST that need to be modified.&lt;/p&gt;

&lt;p&gt;First, on an OSS node make a copy of this file from a working OST, say OST0001:&lt;/p&gt;

&lt;p&gt;OSS# debugfs -c -R &quot;dump CONFIGS/mountdata /tmp/mountdata.ost01&quot; {OST0001_dev}&lt;/p&gt;

&lt;p&gt;Now the mountdata.ost01 file needs to be edited to reflect that it is being used for OST0003.  If you have a favorite binary editor that could be used.  I use &quot;xxd&quot; from the &quot;vim-common&quot; package to convert it into ASCII to be edited, and then convert it back to binary.&lt;/p&gt;

&lt;p&gt;The important parts of the file are all at the beginning, the rest of the file is common to all OSTs:&lt;/p&gt;

&lt;p&gt;OSS# xxd /tmp/mountdata.ost01 /tmp/mountdata.ost01.asc&lt;br/&gt;
OSS# vi /tmp/mountdata.ost01.asc&lt;/p&gt;

&lt;p&gt;0000000: 0100 d01d 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000010: 0200 0000 0200 0000 0100 0000 0100 0000  ................&lt;br/&gt;
0000020: 6c75 7374 7265 0000 0000 0000 0000 0000  lustre..........&lt;br/&gt;
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000060: 6c75 7374 7265 2d4f 5354 3030 3031 0000  lustre-OST0001..&lt;br/&gt;
0000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
&amp;#91;snip&amp;#93;&lt;/p&gt;

&lt;p&gt;This is the &quot;xxd&quot; output showing a struct lustre_disk_data.  The two fields that need to be edited are 0x0018 (ldd_svindex) and 0x0060 (ldd_svname).&lt;/p&gt;

&lt;p&gt;Edit the &quot;0100&quot; in the second row, fifth column to be &quot;0300&quot;.&lt;br/&gt;
Edit the &quot;OST0001&quot; line to be &quot;OST0003&quot;:&lt;/p&gt;

&lt;p&gt;0000000: 0100 d01d 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000010: 0200 0000 0200 0000 0300 0000 0100 0000  ................&lt;br/&gt;
0000020: 6c75 7374 7265 0000 0000 0000 0000 0000  lustre..........&lt;br/&gt;
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000060: 6c75 7374 7265 2d4f 5354 3030 3033 0000  lustre-OST0003..&lt;br/&gt;
0000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;br/&gt;
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................&lt;/p&gt;

&lt;p&gt;Save the file, and convert it back to binary:&lt;/p&gt;

&lt;p&gt;OSS# xxd -r /tmp/mountdata.ost01.asc /tmp/mountdata.ost03&lt;/p&gt;

&lt;p&gt;Mount the OST0003 filesystem locally and copy this new file in place:&lt;/p&gt;

&lt;p&gt;OSS# mount -t ldiskfs {OST0003_dev} /mnt/lustre_ost03&lt;br/&gt;
OSS# mv /mnt/lustre_ost03/CONFIGS/mountdata /mnt/lustre_ost03/CONFIGS/mountdata.broken&lt;br/&gt;
OSS# cp /tmp/mountdata.ost03 /mnt/lustre_ost03/CONFIGS/mountdata&lt;br/&gt;
OSS# umount /mnt/lustre_ost03&lt;/p&gt;

&lt;p&gt;The OST should now mount normally and identify itself as OST0003.&lt;/p&gt;</comment>
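The hand edit described in the comment above can also be done in a few lines of script. This is a hedged sketch, not a tested recovery tool: the field offsets (0x18 for ldd_svindex, 0x60 for ldd_svname) and the 12288-byte file size are taken from the comments in this ticket rather than verified against the Lustre 1.8 headers, and retarget_mountdata is an illustrative name:

```python
# Rewrite the per-OST fields of a dumped CONFIGS/mountdata copy
# (struct lustre_disk_data) so it can be used on a different OST.
import struct

LDD_SVINDEX_OFFSET = 0x18  # __u32 ldd_svindex, little-endian
LDD_SVNAME_OFFSET = 0x60   # char ldd_svname[64], NUL-padded
LDD_SVNAME_LEN = 64

def retarget_mountdata(data, new_index, new_svname):
    buf = bytearray(data)
    struct.pack_into("<I", buf, LDD_SVINDEX_OFFSET, new_index)
    name = new_svname.encode("ascii")
    if len(name) >= LDD_SVNAME_LEN:
        raise ValueError("service name too long")
    buf[LDD_SVNAME_OFFSET:LDD_SVNAME_OFFSET + LDD_SVNAME_LEN] = name.ljust(LDD_SVNAME_LEN, b"\x00")
    return bytes(buf)

# Demonstrate on a synthetic 12288-byte file laid out like the dump above:
src = bytearray(12288)
struct.pack_into("<I", src, LDD_SVINDEX_OFFSET, 0x0001)
src[LDD_SVNAME_OFFSET:LDD_SVNAME_OFFSET + 14] = b"lustre-OST0001"
dst = retarget_mountdata(src, 0x0003, "lustre-OST0003")
print(struct.unpack_from("<I", dst, LDD_SVINDEX_OFFSET)[0])    # 3
print(dst[LDD_SVNAME_OFFSET:LDD_SVNAME_OFFSET + 14].decode())  # lustre-OST0003
```

The xxd round trip in the comment achieves the same result; either way, keep the mountdata.broken copy until the OST mounts cleanly.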
                            <comment id="13631" author="pjones" created="Tue, 3 May 2011 22:07:32 +0000"  >&lt;p&gt;Thanks Sam. It is interesting to hear a PS perspective. I know that you were involved in a number of similar deployments. It will be interesting to hear the assessment from engineering about whether a Lustre issue is indeed involved here. Andreas, what do you think?&lt;/p&gt;</comment>
                            <comment id="13656" author="johann" created="Wed, 4 May 2011 06:48:44 +0000"  >&lt;p&gt;&amp;gt; Regarding the CAMs and drive upgrades, we have seen the corrupted OSTs before on the Riverwalks&lt;br/&gt;
&amp;gt; (J4400&apos;s) when disk firmware was upgraded without both Lustre and the md software raid shutdown&lt;br/&gt;
&amp;gt; cleanly first. Is there any chance that this particular OST10 was not cleanly shutdown? We saw&lt;br/&gt;
&amp;gt; many cases of software RAID corruption on the J4400&apos;s a couple of years ago,&lt;/p&gt;

&lt;p&gt;Beyond the HW/firmware issues, there was also a corruption problem due to the mptsas driver&lt;br/&gt;
which could redirect I/Os to the wrong drive &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;The following comment from Sven explains how this bug was discovered:&lt;br/&gt;
&lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=21819#c27&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=21819#c27&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the problem was fixed in the following bugzilla ticket:&lt;br/&gt;
&lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=22632&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=22632&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, it requires installing an extra package containing the mptsas driver.&lt;/p&gt;

&lt;p&gt;Are you sure you are using an mptsas driver which does not suffer from the same issue?&lt;/p&gt;

&lt;p&gt;&amp;gt; which was about the time early versions of 1.8 started to be used. There were several software&lt;br/&gt;
&amp;gt; RAID corruption bugs that have since been fixed. Also, we have fixed many problems since the&lt;br/&gt;
&amp;gt; early 1.8 releases in Lustre, so would encourage an upgrade to 1.8.5 at your earliest convenience.&lt;/p&gt;

&lt;p&gt;We indeed integrated several software raid fixes in 1.8 (e.g. bugzilla 19990, 22509 &amp;amp; 20533).&lt;br/&gt;
Although I don&apos;t think any of them fixed real software RAID corruptions, it would still make&lt;br/&gt;
sense to upgrade to 1.8.5 to benefit from those bug fixes which address real deadlocks and oops.&lt;/p&gt;</comment>
                            <comment id="13660" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 07:46:36 +0000"  >&lt;p&gt;Sam,&lt;/p&gt;

&lt;p&gt;When we did the firmware upgrades we had taken down lustre and rebooted every box to make sure it was all in a clean/unmounted state.  We had 2 OSTs not mounting at that point, with this most current problem popping up after the firmware upgrades.  I&apos;m not entirely convinced that the firmware upgrades actually caused this particular problem; we&apos;ve been doing a lot to try to recover these OSTs.&lt;/p&gt;

&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;I will get our guys looking at the mountdata file right now.  Hopefully we&apos;ll have an indication of whether this action helps in an hour or so.&lt;/p&gt;

&lt;p&gt;Thank you all so much for your support!&lt;/p&gt;</comment>
                            <comment id="13680" author="pjones" created="Wed, 4 May 2011 13:20:23 +0000"  >&lt;p&gt;Update from site - e2fsck completed on all OSTs and now running a full e2fsck before bringing filesystem back online&lt;/p&gt;</comment>
                            <comment id="13681" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 13:30:26 +0000"  >&lt;p&gt;Thanks Peter, I was actually in the process of updating the bugs with our most up to date status and actions taken (the site was down earlier this morning when I tried).&lt;/p&gt;


&lt;p&gt;Again, we appreciate your support with all this!&lt;/p&gt;


</comment>
                            <comment id="13682" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 13:33:11 +0000"  >&lt;p&gt;Andreas, your procedure worked flawlessly and our OST is back up and running.  We verified that the mountdata file was indeed zero length.&lt;/p&gt;

&lt;p&gt;One clarification that I would like to make, though: we copied from ost7, and the following line to edit was different from what you had provided (for the entry to edit):&lt;/p&gt;

&lt;p&gt;0000010: 0200 0000 0200 0000 0700 0000 0100 0000&lt;/p&gt;

&lt;p&gt;In this line you had indicated to modify the 7th entry; when we copied from ost07 it looked like the 5th entry should be modified instead.&lt;/p&gt;</comment>
                            <comment id="13695" author="adilger" created="Wed, 4 May 2011 14:36:37 +0000"  >&lt;p&gt;You are correct - my sincere apologies.  I was counting 2-byte fields starting in the second row instead of 4-byte fields starting in the first row.  I&apos;ve corrected the instructions in this bug in case they are re-used for similar problems in the future.  We&apos;ve discussed in the past to have a tool to repair this file automatically in case of corruption, and that is underscored by this issue.&lt;/p&gt;

&lt;p&gt;It looks like you (correctly) modified the 5th column, so all is well and no further action is needed.&lt;/p&gt;

&lt;p&gt;It looks like you couldn&apos;t have modified the 7th column, or the OST would have failed to mount.  I did an audit of the code to see what is using these fields (the correct ldd_svindex field and the incorrect ldd_mount_type field).  I found that the ldd_svindex field is only used in case the configuration database on the MGS is rewritten (due to --writeconf) and the OST is reconnecting to the MGS to recreate the configuration record.  The ldd_mount_type field is used to determine the backing filesystem type (usually &quot;ldiskfs&quot; for type = 0x0001, but would have been &quot;reiserfs&quot; with type = 0x0003).&lt;/p&gt;

&lt;p&gt;If you want to be a bit safer in the future, you could use the &quot;debugfs&quot; command posted earlier to dump this file from all of the OSTs (it can safely be done while the OST is mounted) and save them to a safe location.&lt;/p&gt;

&lt;p&gt;Again, apologies for the mixup.&lt;/p&gt;</comment>
                            <comment id="13704" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 16:59:16 +0000"  >&lt;p&gt;Thanks Andreas&lt;/p&gt;

&lt;p&gt;Where we are right now is that all the OSTs can be mounted; however, Lustre cannot be successfully mounted.&lt;/p&gt;

&lt;p&gt;After having issues initially, we shut down all of our lustre clients, and cleanly rebooted all of our OSSs and MDSs.  After bringing all the OSTs up, we had 2 OSTs (11 and 15) be in a &quot;recovering&quot; state that never finished (about 15 minutes after bringing up the client).  We used lctl to abort recovery, and attempted mounting, which appeared to be successful.  Running df on /lustre after that causes a segmentation fault.&lt;/p&gt;

&lt;p&gt;Additionally, running lfs df throws the following error when it gets to ost11:&lt;br/&gt;
error: llapi_obd_statfs failed: Bad address (-14)&lt;/p&gt;

&lt;p&gt;Doing an lctl dl on a client shows all the OSTs as &quot;UP&quot;, but the last number on each line is different for OST11 and OST15 (it&apos;s 5 for all other OSTs, 4 for OST11/15).&lt;/p&gt;

&lt;p&gt;The MDSs were showing that all the OSTs were &quot;UP&quot; as well, but the last numbers show all OSTs as 5.&lt;/p&gt;</comment>
                            <comment id="13707" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 18:13:45 +0000"  >&lt;p&gt;Some additional data points.&lt;/p&gt;

&lt;p&gt;After unmounting and resetting, ost11 and 15 complete recovery ok, but we still aren&apos;t able to mount lustre on a client.&lt;/p&gt;

&lt;p&gt;OST 11 and 15 are showing very different % used values than all of our other OSTs (they should all be even because of the stripes we use).&lt;/p&gt;

&lt;p&gt;In messages on our MDT server (mds2) we get messages stating that ost11 is &quot;INACTIVE&quot; by administrator request.&lt;/p&gt;

&lt;p&gt;We also see eviction messages when trying to mount a client for ost11 and 15:&lt;br/&gt;
This client was evicted by lustre-OST000b; in progress operations using this service will fail&lt;/p&gt;</comment>
                            <comment id="13709" author="adilger" created="Wed, 4 May 2011 19:05:17 +0000"  >&lt;p&gt;Did ost11 and ost15 have any filesystem corruption when you ran e2fsck on them?&lt;/p&gt;

&lt;p&gt;When you report that the %used is different, is that from &quot;lfs df&quot; or &quot;lfs df -i&quot;, or from &quot;df&quot; on the OSS node for the local OST mountpoints?&lt;/p&gt;

&lt;p&gt;You can check the recovery state of all OSTs on an OSS via &quot;lctl get_param obdfilter.*.recovery_status&quot;.  They should all report &quot;status: COMPLETE&quot; (or &quot;INACTIVE&quot; if recovery was never done since the OST was mounted).&lt;/p&gt;

&lt;p&gt;As for the OSTs being marked inactive, you can check the status of the connections on the MDS and clients via &quot;lctl get_param osc.*.state&quot;.  All of the connections should report &quot;current_state: FULL&quot; meaning that the OSCs are connected to the OSTs.  Even so, if the OSTs are not started for some reason, it shouldn&apos;t prevent the clients from mounting.&lt;/p&gt;

&lt;p&gt;Can you please attach an excerpt from the syslog for a client trying to mount, and also from OST11 and OST15.&lt;/p&gt;</comment>
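The connection-state check above is easy to script when many OSCs are involved. A sketch: the exact dump layout shown here is an assumption (only the current_state: line and the osc.* parameter naming come from this ticket), so treat it as illustrative:

```python
# Scan `lctl get_param osc.*.state` output and report imports not in FULL.
import re

def stuck_imports(dump):
    stuck, target = [], None
    for line in dump.splitlines():
        line = line.strip()
        m = re.match(r"osc\.(\S+)\.state=", line)
        if m:
            target = m.group(1)
            continue
        m = re.match(r"current_state:\s*(\S+)", line)
        if m and target is not None and m.group(1) != "FULL":
            stuck.append((target, m.group(1)))
    return stuck

# Hypothetical dump for two OSC devices:
sample = """\
osc.lustre-OST000a-osc-ffff8103.state=
current_state: FULL
osc.lustre-OST000b-osc-ffff8103.state=
current_state: NEW
"""
print(stuck_imports(sample))  # [('lustre-OST000b-osc-ffff8103', 'NEW')]
```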
                            <comment id="13710" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 19:15:25 +0000"  >&lt;p&gt;We&apos;re getting those logs for you now; we have to re-type them since they are on a segregated system.  We are strapped for time, so the sooner you can respond the better - if we don&apos;t have this back up tomorrow morning we will have to rebuild lustre to get the system up.&lt;/p&gt;

&lt;p&gt;If you are available for a phone call that would be great as well, we are available all night if necessary.&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="13711" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 19:25:16 +0000"  >&lt;p&gt;ost15 had a fairly large amount of filesystem corruption when running the e2fsck.  We used a lustre restore-from-lost+found command to attempt to restore that data.  I don&apos;t believe ost11 had corruption.&lt;/p&gt;


&lt;p&gt;The recovery status using lctl get_param obdfilter.*.recovery_status on the oss shows everything as COMPLETE, which is good.&lt;/p&gt;


&lt;p&gt;Using lctl get_param osc.*.import (not state):&lt;/p&gt;

&lt;p&gt;The MDS shows state as FULL for all OSTs, which is good.&lt;/p&gt;

&lt;p&gt;The client shows state as NEW for OST 11 and 15, but FULL for all others.  There are also 3 entries for OST11 and 15 in this listing.&lt;/p&gt;

&lt;p&gt;We&apos;re working on the log output for attempting to mount.&lt;/p&gt;</comment>
                            <comment id="13712" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 19:33:53 +0000"  >&lt;p&gt;There were no logs on the OSS while attempting to mount.&lt;/p&gt;

&lt;p&gt;The client messages file has the following (minus date stamps to save typing time):&lt;/p&gt;

&lt;p&gt;lustre-clilov-ffff81036703fc00.lov: set parameter stripesize=1048576&lt;br/&gt;
Skipped 4 previous similar messages&lt;br/&gt;
setting import lustre-OST000b_UUID INACTIVE by administrator request&lt;br/&gt;
Skipped 1 previous similar message&lt;br/&gt;
LustreError: 7116:0:(lov_obd.c:325:lov_connect_obd()) not connecting OSC lustre-OST000b_UUID; administratively disabled&lt;br/&gt;
Skipped 1 previous similar message&lt;br/&gt;
Client lustre-client has started&lt;br/&gt;
general protection fault: 0000 &amp;#91;4&amp;#93; SMP&lt;br/&gt;
last sysfs file: /class/infiniband/mlx4_1/node_desc&lt;br/&gt;
CPU 0&lt;br/&gt;
Modules linked in: ~~~~ lots of modules&lt;/p&gt;

&lt;p&gt;After this we did a df command and it caused a segmentation fault.&lt;/p&gt;

&lt;p&gt;Also, we see different sizes for the OSTs using a normal df command on the OSS.  Doing an lfs df on the clients shows different percentages for the good OSTs, but it comes back with the Bad address (-14) error when it gets to ost11, so I can&apos;t tell what that would say.  lfs df -i shows 0%, but still fails at ost11.&lt;/p&gt;</comment>
                            <comment id="13713" author="tyler.s.wiegers@lmco.com" created="Wed, 4 May 2011 19:44:58 +0000"  >&lt;p&gt;Also, there is no data on this system that we absolutely need to recover; it is purely a high-speed data store for temporary data.  Do you believe there is any value in continuing this troubleshooting, or is rebuilding the lustre filesystems at this point a good idea?&lt;/p&gt;

&lt;p&gt;We will be delivering this system into operations as a new technology within the next couple of weeks, so our concern is that we have an opportunity to learn something that &lt;b&gt;may&lt;/b&gt; help in future operations.  Is this situation something that can happen often and that we need to plan for, or is this a huge fluke that we shouldn&apos;t ever expect?&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="13714" author="adilger" created="Wed, 4 May 2011 20:29:46 +0000"  >&lt;p&gt;Tyler, I left a VM for you on the number you provided in email.&lt;/p&gt;

&lt;p&gt;For the OOPS message, the easiest way to handle that would be to take a photo of the screen and attach it.  Otherwise, having the actual error message (e.g. NULL pointer dereference at ...), the process name, and the list of function names from the top of the stack (i.e. those functions most recently called) would help debug that problem.&lt;/p&gt;

&lt;p&gt;Normally, if e2fsck is successful for the OST, then Lustre should generally be able to mount the filesystem and run with it, regardless of what corruptions there were in the past, but of course I can&apos;t know what other kinds of corruptions there might be that are causing strange problems.&lt;/p&gt;

&lt;p&gt;I definitely would not classify such problems as something that happens often, so while understanding what is going wrong and fixing it is useful to us, you need to make a decision on the value of the data in the filesystem to the users vs. the downtime it is taking to debug this problem.  Of course it would be easier and faster to debug with direct access to the logs, but there are many such sites disconnected from the internet that are running Lustre, so this is nothing new.&lt;/p&gt;

&lt;p&gt;Depending on the site&apos;s tolerance for letting data out, there are a number of ways we&apos;ve worked with such sites in the past.  One way is to print the logs and then scan them on an internet-connected system and attach them to the bug.  This maintains an &quot;air gap&quot; for the system while still being relatively high bandwidth, if there is nothing sensitive in the log files themselves.&lt;/p&gt;

&lt;p&gt;If you are not already in a production situation, I would strongly recommend upgrading to Lustre 1.8.5.  It is running stably on many systems, and given the difficulty of diagnosing some of the problems you have already seen, it would be unfortunate to have to diagnose, under more difficult circumstances, problems that were already fixed.  In any case, I know of very few 1.8.x sites that are still running 1.8.0.1.&lt;/p&gt;</comment>
                            <comment id="13756" author="adilger" created="Thu, 5 May 2011 11:32:25 +0000"  >&lt;p&gt;Just as an update to the bug, Tyler and I spoke at length on the phone this morning.  After a restart of the OSTs and clients, the filesystem was able to mount without problems and at least &quot;lfs df&quot; worked for all OSTs while we were on the phone.&lt;/p&gt;

&lt;p&gt;However, the corruption on some of the OSTs, and the fact that all files are striped over all OSTs, mean that some fraction of all files in the filesystem will have missing data.  Since the filesystem is used only as a staging area, it is recommended that the filesystem simply be reformatted to get it back into a known state, instead of spending more time isolating which files were corrupted and then having to restore them into the filesystem anyway.  This will also avoid any latent bugs or data corruption that may not be evident with limited testing.&lt;/p&gt;


&lt;p&gt;We also discussed the current default configuration of striping all files across all 16 OSTs.  I recommended to Tyler to use the &quot;lfs setstripe -c {stripes} {new file}&quot; command to create some test files with different numbers of stripes and measure the performance to determine the minimum stripe count that will hit the peak single-client performance, since the clients are largely doing independent IO to different files.  At that point, running multiple parallel read/write jobs on files with the smaller stripe count should be compared with running the same workload on all wide-striped files.&lt;/p&gt;
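&lt;p&gt;As a rough illustration of the comparison described above (these are not commands run on this system; the mount point and IO sizes are placeholder assumptions), a dry-run sketch that only prints the commands for a stripe-count sweep might look like:&lt;/p&gt;

```shell
# Dry-run sketch: prints, rather than executes, an lfs/dd stripe-count sweep.
# MNT and the dd sizes are illustrative assumptions, not values from this ticket.
MNT=/mnt/lustre
for c in 1 2 4 8 16; do
    f=$MNT/stripe_test.$c
    echo "lfs setstripe -c $c $f"                               # pre-create the file with c stripes
    echo "dd if=/dev/zero of=$f bs=1M count=4096 oflag=direct"  # the large write to time
done
```

&lt;p&gt;Printing the commands first keeps the sweep auditable before it is run on an air-gapped production system.&lt;/p&gt;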

&lt;p&gt;Based on our discussion of the workload, it seems likely that the IO performance of a small number of OSTs (2-4) would be as fast as the current peak performance seen by the clients, while reducing contention on the OSTs when multiple clients are doing IO.  Reducing the stripe count may potentially increase the aggregate performance seen by multiple clients doing concurrent IO, because there is less chance of contention (seeking) on the OSTs being used by multiple clients.&lt;/p&gt;

&lt;p&gt;Reducing the stripe count would also help isolate the clients from any problems or slowdowns caused by individual OSTs.  If an OST is unavailable, then any file that is striped over that OST will also be unavailable.  &lt;/p&gt;

&lt;p&gt;If an OST is slow for some reason (e.g. RAID rebuild, marginal disk hardware, etc.) then the IO to that file will be limited by the slowest OST, so the more OSTs a file is striped over, the more likely such a problem is to hit a particular file.  That said, if there is a minimum bandwidth requirement for a single file, instead of a desire to maximize the aggregate performance of multiple clients doing independent IO, then there needs to be enough stripes on the file so that N * {slow OST} is still fast enough to meet that minimum bandwidth.&lt;/p&gt;</comment>
                            <comment id="13910" author="johann" created="Fri, 6 May 2011 13:29:43 +0000"  >&lt;p&gt;Tyler, BTW, I think it still makes sense to check that you are not using an mptsas driver affected by bugzilla ticket 22632.&lt;/p&gt;</comment>
                            <comment id="14005" author="pjones" created="Mon, 9 May 2011 07:48:21 +0000"  >&lt;p&gt;Rob Baker of LMCO has confirmed that the critical situation is over and production is stable. Residual issues will be tracked under a new ticket in the future.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw153:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10266</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                <customfields>
    </item>
</channel>
</rss>