<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:54:50 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5822] health_check file not updating properly</title>
                <link>https://jira.whamcloud.com/browse/LU-5822</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Over the weekend we had an OST abort and get marked read-only:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;[  726.076561] LDISKFS-fs error (device dm-25): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; group 111692corrupted: 32768 blocks free in bitmap, 1024 - in gd
[  726.116663] 
[  726.125085] Aborting journal on device dm-25-8.
[  726.133359] LustreError: 17032:0:(ofd_obd.c:1095:ofd_destroy()) f1-OST00ff: error destroying object [0x100000000:0x16546e7:0x0]: 0
[  726.176268] LDISKFS-fs (dm-25): 
[  726.179457] LDISKFS-fs error (device dm-25): ldiskfs_journal_start_sb: Detected aborted journal
[  726.179459] LDISKFS-fs (dm-25): Remounting filesystem read-only&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We rely on the /proc/fs/lustre/health_check file to notify us of these situations.  Unfortunately, we never got a notification.  I found a bug in the b2_5 implementation of the osd-ldiskfs osd_statfs() function.  Code inspection leads me to believe it does not affect master, but I haven&apos;t tried it there.  I will upload a patch momentarily.&lt;/p&gt;</description>
                <environment></environment>
        <key id="27356">LU-5822</key>
            <summary>health_check file not updating properly</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="jamesanunez">James Nunez</assignee>
                                    <reporter username="ezell">Matt Ezell</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Tue, 28 Oct 2014 20:43:38 +0000</created>
                <updated>Fri, 24 Apr 2015 23:55:02 +0000</updated>
                            <resolved>Thu, 4 Dec 2014 22:45:38 +0000</resolved>
                                    <version>Lustre 2.5.3</version>
                                    <fixVersion>Lustre 2.5.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="97759" author="ezell" created="Tue, 28 Oct 2014 20:48:58 +0000"  >&lt;p&gt;I&apos;d also like to add a test to make sure this doesn&apos;t break in the future, but I&apos;m not sure of the best way to make the device go read-only.  I started with the read-only infrastructure that the test system uses, but that appears to set the underlying device to read-only, not the ldiskfs file system.&lt;/p&gt;

&lt;p&gt;First, I tried to implement a server_remount_fs() function so you could do &apos;mount -o remount,ro&apos;, but that only gets a Lustre superblock, so you would then need to call down into the osd-api to make it do anything to the underlying file system.&lt;br/&gt;
I then looked at adding an IOCTL that lctl could call, but that also appears to require support from the osd-api.&lt;br/&gt;
I have a prototype patch that adds a new osd-api method, dt_abort_device(), that could be called from either remount or lctl, but I&apos;m not sure if it makes sense to add a new method just for testing this.  Thoughts?&lt;/p&gt;

&lt;p&gt;Is there an easier way to cause the underlying filesystem to abort or go read-only?&lt;/p&gt;</comment>
                            <comment id="97761" author="ezell" created="Tue, 28 Oct 2014 21:09:58 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/12463&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/12463&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="97764" author="jamesanunez" created="Tue, 28 Oct 2014 21:42:13 +0000"  >&lt;p&gt;Matt, &lt;/p&gt;

&lt;p&gt;Thanks for the patch. I&apos;ll look into a test to check that health_check is being updated properly.&lt;/p&gt;

&lt;p&gt;James&lt;/p&gt;</comment>
                            <comment id="97786" author="adilger" created="Wed, 29 Oct 2014 00:57:31 +0000"  >&lt;p&gt;There was some work done in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-137&quot; title=&quot;ioctl passthrough mechanism for Lustre OST/MDT mountpoints&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-137&quot;&gt;&lt;del&gt;LU-137&lt;/del&gt;&lt;/a&gt; to allow ioctl() pass-through to the underlying filesystem, but this was complicated in the 2.4+ releases by the OSD API and the addition of ZFS. While I would be happy to see that work move forward, it is probably overkill for this. &lt;/p&gt;

&lt;p&gt;It would be easier to use the existing fault injection method for Lustre and add a new FAIL_LOC for this case, like:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (sb-&amp;gt;s_flags &amp;amp; MS_RDONLY ||
            (OBD_FAIL_CHECK(OBD_FAIL_OSD_READONLY) &amp;amp;&amp;amp;
             osd-&amp;gt;od_jndex == libcfs_fail_val))
                osd-&amp;gt;od_statfs.os_state = OS_STATE_READONLY;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;or similar (this is just off the top of my head, so the syntax might not be quite correct).&lt;/p&gt;</comment>
                            <comment id="97787" author="adilger" created="Wed, 29 Oct 2014 00:58:45 +0000"  >&lt;p&gt;PS: I verified that this patch is not needed for master.&lt;/p&gt;</comment>
                            <comment id="100737" author="gerrit" created="Thu, 4 Dec 2014 20:22:09 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/12463/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/12463/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5822&quot; title=&quot;health_check file not updating properly&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5822&quot;&gt;&lt;del&gt;LU-5822&lt;/del&gt;&lt;/a&gt; osd-ldiskfs: Correctly return OS_STATE_READONLY&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_5&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1ff49a78e443f935670daf0c84b5b989c02dca04&lt;/p&gt;</comment>
                            <comment id="100772" author="pjones" created="Thu, 4 Dec 2014 22:45:38 +0000"  >&lt;p&gt;Landed for 2.5.4&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="10459">LU-137</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwzp3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16321</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>