<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:41:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4277] Integrate ZFS zpool resilver status with OFD OS_STATE_DEGRADED flag</title>
                <link>https://jira.whamcloud.com/browse/LU-4277</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The OFD statfs() handler can optionally add an OS_STATE_DEGRADED flag to the statfs reply, which the MDS uses to help decide which OSTs to allocate new file objects from. Unless all other OSTs are also degraded, offline, or full, the DEGRADED OSTs will be skipped for newly created files.&lt;/p&gt;

&lt;p&gt;This avoids applications waiting on slow writes to a rebuilding OST long after the same writes would have completed on other healthy OSTs. It also keeps new writes from interfering with the OST rebuild process, so it is a double win.&lt;/p&gt;

&lt;p&gt;This was previously implemented as a /proc tunable suitable for mdadm or a hardware-RAID utility to set from userspace, but since ZFS RAID is in the kernel it should be possible to query this status directly from the kernel when the MDS statfs() arrives. &lt;/p&gt;</description>
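For reference, the userspace path the description mentions is driven through lctl. The sketch below builds (but does not execute) the command a RAID monitoring tool would issue; the target name testfs-OST0000 and the degraded_cmd wrapper are illustrative, so the example can run without a Lustre server present.

```shell
# Build the lctl command a mdadm/RAID utility would issue to flag an
# OST as degraded; the wrapper only echoes the command (dry run).
degraded_cmd() {
    # $1 = OST target name, $2 = 1 to mark degraded, 0 to clear
    echo "lctl set_param obdfilter.$1.degraded=$2"
}

degraded_cmd testfs-OST0000 1   # mark degraded while rebuilding
degraded_cmd testfs-OST0000 0   # clear once the rebuild completes
```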
                <environment></environment>
        <key id="22177">LU-4277</key>
            <summary>Integrate ZFS zpool resilver status with OFD OS_STATE_DEGRADED flag</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="utopiabound">Nathaniel Clark</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>zfs</label>
                    </labels>
                <created>Wed, 20 Nov 2013 06:28:21 +0000</created>
                <updated>Tue, 20 Mar 2018 21:41:29 +0000</updated>
                            <resolved>Tue, 6 Feb 2018 05:15:19 +0000</resolved>
                                    <version>Lustre 2.11.0</version>
                                    <fixVersion>Lustre 2.11.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>12</watches>
                                                                            <comments>
                            <comment id="72177" author="adilger" created="Sat, 23 Nov 2013 00:59:25 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/8378&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8378&lt;/a&gt; is a basic patch to fix handling in the LOD code for DEGRADED and READONLY flags.  It doesn&apos;t yet fix the osd-zfs code in udmu_objset_statfs() that should be setting the flags.&lt;/p&gt;</comment>
                            <comment id="166291" author="adilger" created="Fri, 16 Sep 2016 23:34:21 +0000"  >&lt;p&gt;Don or Brian,&lt;br/&gt;
is there some straightforward way for osd-zfs at the DMU level to determine if ZFS is currently degraded and/or doing a drive resilver operation, or would this need some new API to access this info from the VDEV?  If we had some mechanism to determine this easily, I think it would be straightforward for someone to add this functionality to Lustre.  The alternative would be for ZED to set &lt;tt&gt;lctl set_param ofd.&amp;lt;ost&amp;gt;.degraded=1&lt;/tt&gt; from userspace when it detects a degraded device and/or when the device is undergoing resilvering, and to clear it afterward.&lt;/p&gt;</comment>
                            <comment id="169244" author="dbrady" created="Wed, 12 Oct 2016 05:44:16 +0000"  >&lt;p&gt;The degraded state is part of the vdev. Getting this info strictly through the spa interface would yield a ton of data (i.e. the entire config) and require nvlist parsing.  A new API, like a spa_get_vdev_state(), to pull out the vdev state of the root vdev would be required to get at this state in a simple manner.&lt;/p&gt;

&lt;p&gt;We can easily set the state as it changes using a zedlet.  We now have a state change event for all healthy&amp;lt;--&amp;gt;degraded vdev states that could be used to initiate a check of the pool state and post that state via lctl as you suggest above.&lt;/p&gt;</comment>
                            <comment id="183638" author="dbrady" created="Mon, 6 Feb 2017 21:59:35 +0000"  >&lt;p&gt;Attached a zedlet, &lt;b&gt;statechange-lustre.sh&lt;/b&gt;, that will propagate degraded state changes from zfs to Lustre.&lt;/p&gt;</comment>
                            <comment id="183842" author="adilger" created="Tue, 7 Feb 2017 22:44:26 +0000"  >&lt;p&gt;It would be good to get the script in the form of a patch against the fs/lustre-release repo so that it can be reviewed properly.  Some general comments first, however:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;the license needs to be dual CDDL/GPL or possibly dual GPL/BSD so that there isn&apos;t any problem to package it with other GPL Lustre code (though technically GPL only affects distribution of binaries and not sources, it is better to avoid any uncertainty).&lt;/li&gt;
	&lt;li&gt;there are some pathnames that hard-code Don&apos;s home directory, which are not suitable for use in a script that is deployed in production.  The location of the &lt;tt&gt;zfs&lt;/tt&gt;, &lt;tt&gt;zpool&lt;/tt&gt;, &lt;tt&gt;lctl&lt;/tt&gt; and &lt;tt&gt;grep&lt;/tt&gt; commands should be found in &lt;tt&gt;$PATH&lt;/tt&gt;.&lt;/li&gt;
	&lt;li&gt;the ZFS pool GUID is hard-coded, or is that some sort of event GUID for the state change?&lt;/li&gt;
	&lt;li&gt;the echoes are fine for a demo, but not suitable for production use if they are too noisy.  Maybe a non-issue if this script is only run rarely.&lt;/li&gt;
	&lt;li&gt;should it actually be an error if the state change is not &lt;tt&gt;DEGRADED&lt;/tt&gt; or &lt;tt&gt;ONLINE&lt;/tt&gt;?  I don&apos;t know what the impact of an error return from zedlet is, so maybe a non-issue?&lt;/li&gt;
	&lt;li&gt;I don&apos;t know if it makes sense for set_degraded_state() to check the current state before setting the new state.  This could introduce races, and I don&apos;t think it reduces them.  I don&apos;t think there is much more overhead to always (re)set the state rather than to check it first and then set it.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;My thought is that the script would be installed as part of the &lt;tt&gt;osd-zfs-mount&lt;/tt&gt; RPM in some directory (something like &lt;tt&gt;/etc/zfs/zed/zedlets.d/&lt;/tt&gt;, akin to &lt;tt&gt;/etc/modprobe.d&lt;/tt&gt; or &lt;tt&gt;/etc/logrotate.d&lt;/tt&gt;) that is a place to drop zedlets that will be run (at least the next time zed is started) and that do not need any editing by the user to specify the Lustre targets.  Then, it would get events from the kernel when a zpool becomes degraded, update &lt;tt&gt;obdfilter.$target.degraded&lt;/tt&gt; via &lt;tt&gt;lctl&lt;/tt&gt; for the targets in that zpool, and do nothing for non-Lustre pools (e.g. the root pool for the OS).&lt;/p&gt;</comment>
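A minimal shape for such a zedlet, folding in the review points above (commands resolved from PATH, no hard-coded pool GUID, unconditional set): the helper names and the treat-anything-non-ONLINE-as-degraded rule are illustrative assumptions, not the landed statechange-lustre.sh.

```shell
#!/bin/sh
# Sketch of a statechange zedlet.  ZED passes event details in
# ZEVENT_* environment variables (see zed(8)); everything else in
# this sketch is illustrative.

# Map a vdev state string to the obdfilter degraded value; any state
# other than ONLINE is treated as degraded here.
state_to_degraded() {
    case "$1" in
        ONLINE) echo 0 ;;
        *)      echo 1 ;;
    esac
}

# Always (re)set the state rather than read-then-set, per the review
# comment above about avoiding races.  lctl is found via PATH.
set_degraded_state() {
    target="$1" value="$2"
    lctl set_param obdfilter."$target".degraded="$value"
}
```

A real zedlet would also have to map the affected zpool to the Lustre targets mounted on it and exit quietly for non-Lustre pools.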
                            <comment id="184821" author="dbrady" created="Tue, 14 Feb 2017 18:24:15 +0000"  >&lt;p&gt;Thanks, Andreas, for the feedback.  I inadvertently attached my local copy used for testing, but I can provide the generic one.  I&apos;ll also address the issues and repost an update.  Is there an example license block I can refer to?&lt;/p&gt;</comment>
                            <comment id="192559" author="adilger" created="Tue, 18 Apr 2017 18:42:07 +0000"  >&lt;p&gt;Don,&lt;br/&gt;
 is there a directory for zedlet scripts to be installed (e.g. &lt;tt&gt;/etc/zfs/zed.d/&lt;/tt&gt;) where they will be run automatically when installed?&lt;/p&gt;

&lt;p&gt;Then, it should be straightforward to submit a patch (per the above process) to add your script as &lt;tt&gt;lustre/scripts/statechange-lustre.sh&lt;/tt&gt; and install it into the above directory via &lt;tt&gt;lustre/scripts/Makefile.am&lt;/tt&gt; if ZFS is enabled:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; if ZFS_ENABLED
 sbin_SCRIPTS += zfsobj2fid

+zeddir = $(sysconfdir)/zfs/zed.d
+zed_SCRIPTS = statechange-lustre.sh
 endif
 :
 :
+EXTRA_DIST += statechange-lustre.sh

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and then package it in &lt;tt&gt;lustre.spec.in&lt;/tt&gt; and &lt;tt&gt;lustre-dkms.spec.in&lt;/tt&gt; as part of the &lt;tt&gt;osd-zfs-mount&lt;/tt&gt; RPM (Lustre userspace tools for ZFS-backed targets):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; %files osd-zfs-mount
 %defattr(-,root,root)
 %{_libdir}/@PACKAGE@/mount_osd_zfs.so
+%{_sysconfdir}/zfs/zed.d/statechange-lustre.sh

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, when ZFS server support is installed, your zedlet will also be installed on all the servers, and should start to handle the degraded/offline events automatically.&lt;/p&gt;</comment>
                            <comment id="193878" author="jsalians_intel" created="Fri, 28 Apr 2017 12:28:15 +0000"  >&lt;p&gt;The &lt;tt&gt;/etc/zfs/zed.d&lt;/tt&gt; entries are symlinks into &lt;tt&gt;/usr/libexec/zfs/zed.d/&lt;/tt&gt;, for example:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ ls -lart /etc/zfs/zed.d/*notify*.sh
lrwxrwxrwx. 1 root root 45 Apr 27 15:59 /etc/zfs/zed.d/scrub_finish-notify.sh -&amp;gt; /usr/libexec/zfs/zed.d/scrub_finish-notify.sh
lrwxrwxrwx. 1 root root 48 Apr 27 15:59 /etc/zfs/zed.d/resilver_finish-notify.sh -&amp;gt; /usr/libexec/zfs/zed.d/resilver_finish-notify.sh
lrwxrwxrwx. 1 root root 37 Apr 27 15:59 /etc/zfs/zed.d/data-notify.sh -&amp;gt; /usr/libexec/zfs/zed.d/data-notify.sh
lrwxrwxrwx. 1 root root 44 Apr 27 15:59 /etc/zfs/zed.d/statechange-notify.sh -&amp;gt; /usr/libexec/zfs/zed.d/statechange-notify.sh&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</comment>
                            <comment id="216420" author="pjones" created="Fri, 15 Dec 2017 18:32:04 +0000"  >&lt;p&gt;Nathaniel&lt;/p&gt;

&lt;p&gt;Can you please see what is required to move this forward?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="218462" author="gerrit" created="Wed, 17 Jan 2018 22:24:49 +0000"  >&lt;p&gt;Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/30907&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30907&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4277&quot; title=&quot;Integrate ZFS zpool resilver status with OFD OS_STATE_DEGRADED flag&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4277&quot;&gt;&lt;del&gt;LU-4277&lt;/del&gt;&lt;/a&gt; scripts: ofd status integrated with zpool status&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 2f58dcb71a6246ce29a3a548f9ccafd006b97d44&lt;/p&gt;</comment>
                            <comment id="220067" author="gerrit" created="Tue, 6 Feb 2018 04:28:44 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/30907/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30907/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4277&quot; title=&quot;Integrate ZFS zpool resilver status with OFD OS_STATE_DEGRADED flag&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4277&quot;&gt;&lt;del&gt;LU-4277&lt;/del&gt;&lt;/a&gt; scripts: ofd status integrated with zpool status&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8ef3ddd2f2798d04b495c8223673a38452ac5c99&lt;/p&gt;</comment>
                            <comment id="220093" author="pjones" created="Tue, 6 Feb 2018 05:15:19 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                    <issuelinktype id="10011">
                        <name>Related</name>
                        <outwardlinks description="is related to "/>
                        <inwardlinks description="is related to">
                            <issuelink>
                                <issuekey id="30833">LU-6767</issuekey>
                            </issuelink>
                        </inwardlinks>
                    </issuelinktype>
                </issuelinks>
                <attachments>
                            <attachment id="25221" name="statechange-lustre.sh" size="2806" author="dbrady" created="Mon, 6 Feb 2017 21:57:13 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw9rz:</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11749</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>