<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:10:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7611] OSTs become &quot;not healthy&quot;</title>
                <link>https://jira.whamcloud.com/browse/LU-7611</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We had a hard power outage early this morning.  After hardware fixes, we were able to mount one of our file systems and it appears to be healthy.&lt;/p&gt;

&lt;p&gt;On the other file system, we have 5 OSTs reporting as unhealthy.  The file system mounted fine and clients connected, but soon after, I started receiving alerts. I&apos;m in the process of rebooting the nodes so that I can e2fsck them. Each node that has an unhealthy OST shows messages similar to the following:&lt;/p&gt;

&lt;p&gt;[ 2962.001970] LustreError: 24264:0:(tgt_lastrcvd.c:583:tgt_client_new()) atlas2-OST00ca: Failed to write client lcd at idx 19, rc -30&lt;br/&gt;
[ 2962.029196] LustreError: 24264:0:(tgt_lastrcvd.c:583:tgt_client_new()) Skipped 140 previous similar messages&lt;br/&gt;
[ 3263.951324] LustreError: 24172:0:(ofd_obd.c:1365:ofd_create()) atlas2-OST00ca: unable to precreate: rc = -30&lt;br/&gt;
[ 3263.981018] LustreError: 24172:0:(ofd_obd.c:1365:ofd_create()) Skipped 61 previous similar messages&lt;br/&gt;
[ 3562.220443] LustreError: 24251:0:(tgt_lastrcvd.c:583:tgt_client_new()) atlas2-OST00ca: Failed to write client lcd at idx 19, rc -30&lt;br/&gt;
[ 3562.252970] LustreError: 24251:0:(tgt_lastrcvd.c:583:tgt_client_new()) Skipped 233 previous similar messages&lt;br/&gt;
[ 3874.190928] LustreError: 24281:0:(ofd_obd.c:1365:ofd_create()) atlas2-OST00ca: unable to precreate: rc = -30&lt;br/&gt;
[ 3874.219280] LustreError: 24281:0:(ofd_obd.c:1365:ofd_create()) Skipped 62 previous similar messages&lt;/p&gt;

&lt;p&gt;Is an e2fsck the proper response in this type of situation?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
&amp;#8211;&lt;br/&gt;
Jesse&lt;/p&gt;</description>
                <environment>RHEL 6.6&lt;br/&gt;
2.6.32-504.30.3.el6.atlas.x86_64</environment>
        <key id="33855">LU-7611</key>
            <summary>OSTs become &quot;not healthy&quot;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="10000">Done</resolution>
                                        <assignee username="yujian">Jian Yu</assignee>
                                    <reporter username="hanleyja">Jesse Hanley</reporter>
                        <labels>
                    </labels>
                <created>Thu, 24 Dec 2015 21:45:16 +0000</created>
                <updated>Mon, 28 Mar 2016 15:08:02 +0000</updated>
                            <resolved>Mon, 28 Mar 2016 15:08:02 +0000</resolved>
                                    <version>Lustre 2.5.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="137437" author="yujian" created="Thu, 24 Dec 2015 21:49:25 +0000"  >&lt;p&gt;Yes, Jesse. Please refer to &lt;a href=&quot;https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#dbdoclet.50438225_71141&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#dbdoclet.50438225_71141&lt;/a&gt; .&lt;/p&gt;</comment>
                            <comment id="137439" author="hanleyja" created="Thu, 24 Dec 2015 22:09:49 +0000"  >&lt;p&gt;Thanks Jian!  About to start the e2fsck runs now.&lt;/p&gt;</comment>
                            <comment id="137442" author="jfc" created="Thu, 24 Dec 2015 23:02:50 +0000"  >&lt;p&gt;Thanks Jesse. I&apos;m asking Jian to watch this ticket, for the time being.&lt;/p&gt;

&lt;p&gt;Please let us know how the e2fsck runs proceed for you.&lt;/p&gt;

&lt;p&gt;~ jfc.&lt;/p&gt;</comment>
                            <comment id="139622" author="hanleyja" created="Thu, 21 Jan 2016 18:32:14 +0000"  >&lt;p&gt;Sorry for the delay on this - we managed to get the file systems back up.  We ran e2fscks on all target OSTs during an outage and got the following output (I removed some of the superfluous and &lt;tt&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;QUOTA WARNING&amp;#93;&lt;/span&gt;&lt;/tt&gt; lines):&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;atlas1-OST02e8: 1471186/29343744 files (7.2% non-contiguous), 2281519450/3755999232 blocks
Inode 5712358, i_size is 2097152, should be 2138112.  Fix? no


atlas1-OST00aa: 1418001/29343744 files (7.1% non-contiguous), 2267986225/3755999232 blocks
Deleted inode 1184018 has zero dtime.  Fix? no

Inode bitmap differences:  -1184018
Fix? no


atlas1-OST0212: 1530414/29343744 files (7.2% non-contiguous), 2274401960/3755999232 blocks
Deleted inode 94387 has zero dtime.  Fix? no

Inode bitmap differences:  -94387
Fix? no


atlas1-OST0088: 1246831/29343744 files (7.4% non-contiguous), 2518723619/3755999232 blocks
[ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (1527054912 &amp;gt;= 2690) in user quota file. Quota file is probably corrupted.
Please run e2fsck (8) to fix it.
[ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (20480 &amp;gt;= 2690) in user quota file. Quota file is probably corrupted.
Please run e2fsck (8) to fix it.
[ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (897567104 &amp;gt;= 2690) in user quota file. Quota file is probably corrupted.
Please run e2fsck (8) to fix it.
[ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (4096 &amp;gt;= 2690) in user quota file. Quota file is probably corrupted.
Please run e2fsck (8) to fix it.


atlas2-OST033f: 1181549/29343744 files (1.5% non-contiguous), 2164774412/3755999232 blocks
Deleted inode 1791 has zero dtime.  Fix? no

Block bitmap differences:  -(77247232--77247743)
Fix? no

Inode bitmap differences:  -1791
Fix? no


atlas2-OST0228: 1277960/29343744 files (1.6% non-contiguous), 1921932726/3755999232 blocks
Deleted inode 4916172 has zero dtime.  Fix? no

Inode bitmap differences:  -4916172
Fix? no
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Do you see anything alarming about this output, or should we be fine to run e2fsck (with -y or -p)?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
&amp;#8211;&lt;br/&gt;
Jesse&lt;/p&gt;</comment>
                            <comment id="139627" author="yujian" created="Thu, 21 Jan 2016 18:47:19 +0000"  >&lt;p&gt;Hi Niu,&lt;/p&gt;

&lt;p&gt;Could you please take a look at the e2fsck output above and advise? Thank you.&lt;/p&gt;</comment>
                            <comment id="139703" author="niu" created="Fri, 22 Jan 2016 03:24:01 +0000"  >&lt;p&gt;I think it should be fine to run e2fsck -y to repair the system. Please make sure to use the latest recommended e2fsprogs (1.42.13.wc4, I believe), which includes several recent defect fixes.&lt;/p&gt;</comment>
                            <comment id="146970" author="jfc" created="Fri, 25 Mar 2016 20:38:14 +0000"  >&lt;p&gt;Hello Jesse,&lt;/p&gt;

&lt;p&gt;Do you need any more work done on this ticket?&lt;/p&gt;

&lt;p&gt;Many thanks,&lt;br/&gt;
~ jfc.&lt;/p&gt;</comment>
                            <comment id="147018" author="hanleyja" created="Mon, 28 Mar 2016 11:49:16 +0000"  >&lt;p&gt;Hey John,&lt;/p&gt;

&lt;p&gt;We were able to complete the e2fsck runs without an issue.  Everything appeared to be in a healthy state afterwards.  Please feel free to resolve this, and thanks for the help and advice.&lt;/p&gt;

&lt;p&gt;&amp;#8211;&lt;br/&gt;
Jesse&lt;/p&gt;</comment>
                            <comment id="147038" author="jfc" created="Mon, 28 Mar 2016 15:08:02 +0000"  >&lt;p&gt;Thank you Jesse &amp;#8211; glad everything worked out well.&lt;/p&gt;

&lt;p&gt;~ jfc.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxwtj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>