<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:17:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1547] MDT remounted read-only, MDS hung, MDT corrupted</title>
                <link>https://jira.whamcloud.com/browse/LU-1547</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Our customer&apos;s MDT was remounted read-only after the MDS was relocated from cluster node sklusp01b to sklusp01a.&lt;br/&gt;
They also relocated the OSS services during the same window.&lt;br/&gt;
When they noticed the read-only status they tried to stop the MDS. The attempt was unsuccessful: the server became unresponsive, and the other cluster node (sklusp01b) fenced the sklusp01a MDS server and took over the MDT. sklusp01b was stopped after the take-over, and&lt;br/&gt;
they then ran fsck, which ended with a huge number of errors. The repair was unsuccessful, and it ended with recreation of the whole Lustre FS and a restore from backup.&lt;br/&gt;
Is it possible to determine the root cause from the logs?&lt;/p&gt;</description>
                <environment>OS RHEL 5.5 cluster, MDT, OST on LVM volumes, SAN, storage HP XP24k</environment>
        <key id="14993">LU-1547</key>
            <summary>MDT remounted read-only, MDS hung, MDT corrupted</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="hpsk">HP Slovakia team</reporter>
                        <labels>
                            <label>ldiskfs</label>
                    </labels>
                <created>Thu, 21 Jun 2012 07:56:35 +0000</created>
                <updated>Tue, 11 Mar 2014 12:32:28 +0000</updated>
                            <resolved>Tue, 11 Mar 2014 12:32:28 +0000</resolved>
                                    <version>Lustre 1.8.x (1.8.0 - 1.8.5)</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="40986" author="pjones" created="Thu, 21 Jun 2012 08:57:32 +0000"  >&lt;p&gt;Niu is investigating this one&lt;/p&gt;</comment>
                            <comment id="40987" author="niu" created="Thu, 21 Jun 2012 10:03:20 +0000"  >&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;Jun 17 23:03:08 sklusp01a kernel: LDISKFS-fs error (device dm-11): ldiskfs_lookup: unlinked inode 27720411 in dir #29287441
Jun 17 23:03:08 sklusp01a kernel: Remounting filesystem read-only
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It looks like ldiskfs failed to find an inode, which is a filesystem inconsistency error and caused the read-only remount. I&apos;m not sure whether it&apos;s an ext4 problem; I&apos;ll investigate further. Andreas, Johann, any comments? Thanks.&lt;/p&gt;</comment>
                            <comment id="40994" author="hpsk" created="Thu, 21 Jun 2012 12:00:33 +0000"  >&lt;p&gt;New, possibly important information:&lt;br/&gt;
The MDT resides on a clustered LVM volume. Before the issue happened, the MDS was running on node sklusp01b and the customer&lt;br/&gt;
ran lvconvert -m1 on the MDT volume from the other node, sklusp01a, to create a cross-site mirror. When the resync finished they relocated the MDS to sklusp01a, and the issue happened about 10 minutes after relocation. Is it OK to run the MDS on one node and lvconvert on another node for the same clustered LVM volume?&lt;/p&gt;</comment>
                            <comment id="40995" author="johann" created="Thu, 21 Jun 2012 12:07:37 +0000"  >&lt;p&gt;Could you please clarify what you intended to do with the lvconvert command?&lt;br/&gt;
I don&apos;t think it is safe to run such a command on one node while the volume is being accessed on another node (unless you use CLVM?).&lt;/p&gt;

&lt;p&gt;It is likely the root cause of your problem.&lt;/p&gt;</comment>
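The mirror conversion sequence under discussion can be sketched as follows. This is a hedged sketch only: the volume-group and LV names (vg_mdt/lv_mdt) are hypothetical, and running lvconvert on a volume that is in use on another node is only safe when the volume group is coordinated by CLVM/lvmlockd, which is exactly the point in question here.

```shell
# Hypothetical names: vg_mdt/lv_mdt is the clustered LV backing the MDT.

# Drop the mirror leg before a planned outage (as the site did):
lvconvert -m0 vg_mdt/lv_mdt

# Re-add a mirror leg afterwards and let it resync:
lvconvert -m1 vg_mdt/lv_mdt

# Watch resync progress via the copy_percent field:
lvs -a -o name,copy_percent vg_mdt
```

The resync must be allowed to complete, and the volume group must be under cluster-wide locking, before the mirrored volume can safely be accessed from the other node.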
                            <comment id="41001" author="adilger" created="Thu, 21 Jun 2012 13:57:42 +0000"  >&lt;p&gt;The initial recovery appears to find a valid Lustre filesystem to mount:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jun 17 22:52:52 sklusp01a kernel: Lustre: 11216:0:(mds_fs.c:677:mds_init_server_data()) RECOVERY: service l1-MDT0000, 56 recoverable clients, 0 delayed clients, last_transno 133173826553
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Later on, it finds a single error in the filesystem when it is cleaning up the orphan inodes:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jun 17 23:03:08 sklusp01a kernel: LDISKFS-fs error (device dm-11): ldiskfs_lookup: unlinked inode 27720411 in dir #29287441
Jun 17 23:03:08 sklusp01a kernel: Remounting filesystem read-only
Jun 17 23:03:08 sklusp01a kernel: LDISKFS-fs warning (device dm-11): kmmpd: kmmpd being stopped since filesystem has been remounted as readonly.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After failover to sklusp01b (which was quickly shut down), the MDS service is again started on sklusp01a and sees the same error:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jun 18 00:25:03 sklusp01a kernel: LDISKFS-fs error (device dm-14): ldiskfs_lookup: unlinked inode 27720411 in dir #29287441
Jun 18 00:25:03 sklusp01a kernel: Remounting filesystem read-only
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;At least during these times, the filesystem was intact enough to be able to mount and read basic Lustre configuration files.  I can&apos;t comment on the severity of the corruption seen by e2fsck, but the kernel only saw a relatively minor problem (directory entry for an open-unlinked inode was actually deleted, which may possibly relate to nlink problems previously fixed in &lt;a href=&quot;https://bugzilla.lustre.org/show_bug.cgi?id=22177&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugzilla.lustre.org/show_bug.cgi?id=22177&lt;/a&gt; for 1.8.3).&lt;/p&gt;

&lt;p&gt;It also appears you have MMP enabled on this filesystem, which would normally prevent it from being mounted on two nodes at the same time.  From the timestamps in the logs, it does not appear that the two MDS services were active at the same time on the two nodes.&lt;/p&gt;

&lt;p&gt;Unfortunately, I&apos;m not familiar enough with the details of CLVM and what lvconvert does in this case to comment on whether this is safe to do on a running system or not.  It is possible that &quot;lvconvert&quot; and/or the mirror resync process incorrectly mirrored the LV between the nodes, possibly leaving some part of the device inconsistent between the two MDS nodes.  It is also possible (depending on how IO was being done by LVM to keep the mirrors in sync) that data was still in cache on sklusp01b, and not flushed to disk on sklusp01a at the time of failover.&lt;/p&gt;

&lt;p&gt;The MDS performs operations asynchronously in memory, and only flushes them to disk every few seconds at transaction commit time.  Conversely, the OSS writes synchronously to disk, because this avoids excessive memory pressure at high IO rates, so it may be that the same inconsistency would not be visible on the OSS due to the frequent syncing of data to disk.&lt;/p&gt;

&lt;p&gt;Having the output from e2fsck would allow a guess at what type of corruption was seen, and how it might be introduced.&lt;/p&gt;</comment>
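The two checks mentioned in the comment above, confirming that MMP is enabled and capturing a read-only e2fsck pass of the kind attached later in this ticket, might look like this. A sketch only: the device path is hypothetical, and the filesystem must be unmounted (or the check run with care) for the results to be meaningful.

```shell
# Hypothetical device path for the MDT volume.
DEV=/dev/vg_mdt/lv_mdt

# Confirm the multiple-mount-protection (mmp) feature is set:
dumpe2fs -h "$DEV" | grep -i mmp

# Forced (-f) read-only check (-n answers "no" to all repair
# prompts), with the output captured for later analysis:
e2fsck -fn "$DEV" 2>&1 | tee fsck.out
```

Capturing the -fn output before attempting any repair preserves the evidence of what kind of corruption was present, which is what would be needed to guess how it was introduced.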
                            <comment id="41008" author="hpsk" created="Thu, 21 Jun 2012 15:38:53 +0000"  >&lt;p&gt;I have attached the fsck -fn ... output. It was run after the failover to sklusp01b, once the MDS had been stopped.&lt;/p&gt;</comment>
                            <comment id="41010" author="hpsk" created="Thu, 21 Jun 2012 16:01:14 +0000"  >&lt;p&gt;To Johann&apos;s question:&lt;br/&gt;
The customer has a disaster-tolerant design with two datacenters (A and B). The MDT and all 6 OSTs are on CLVM volumes mirrored across the sites, and the MDS and OSSs are configured as cluster services. The MDS usually runs on site A (sklusp01a), with 3 OSSs on site A and the other 3 on B. During the weekend they had a planned power outage on site A. Before the outage they relocated the MDS and 3 OSSs from A to B and ran lvconvert -m0 before the SAN, storage and servers were shut down. When the power outage was over and the storage, SAN and servers had been started again, the CLVM volumes were converted back to mirrors.&lt;br/&gt;
Akos&lt;/p&gt;</comment>
                            <comment id="78779" author="jfc" created="Fri, 7 Mar 2014 23:47:07 +0000"  >&lt;p&gt;Akos and HP Slovakia team,&lt;br/&gt;
Is there any further action required on this ticket, or can I mark it as resolved?&lt;br/&gt;
Many thanks,&lt;br/&gt;
~ jfc.&lt;/p&gt;</comment>
                            <comment id="78971" author="hpsk" created="Tue, 11 Mar 2014 06:42:26 +0000"  >&lt;p&gt;John,&lt;/p&gt;

&lt;p&gt;ticket can be closed. The issue was caused by a bug in CLVM.&lt;br/&gt;
Best regards,&lt;br/&gt;
Akos&lt;/p&gt;</comment>
                            <comment id="78984" author="pjones" created="Tue, 11 Mar 2014 12:32:28 +0000"  >&lt;p&gt;Thanks Akos!&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11645" name="fsck.out" size="1223267" author="hpsk" created="Thu, 21 Jun 2012 15:38:53 +0000"/>
                            <attachment id="11640" name="sklusp01a-messages" size="195457" author="hpsk" created="Thu, 21 Jun 2012 07:56:35 +0000"/>
                            <attachment id="11641" name="sklusp01b-messages" size="1380509" author="hpsk" created="Thu, 21 Jun 2012 07:56:36 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv33b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4000</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>