<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:39:50 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4119] recovery time hard doesn&apos;t limit recovery duration</title>
                <link>https://jira.whamcloud.com/browse/LU-4119</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Firstly, I think there is a bug in extend_recovery_timer:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        if (to &amp;gt; obd-&amp;gt;obd_recovery_time_hard)
                to = obd-&amp;gt;obd_recovery_time_hard;
        if (obd-&amp;gt;obd_recovery_timeout &amp;lt; to ||
            obd-&amp;gt;obd_recovery_timeout == obd-&amp;gt;obd_recovery_time_hard) {
                obd-&amp;gt;obd_recovery_timeout = to;
                cfs_timer_arm(&amp;amp;obd-&amp;gt;obd_recovery_timer,
                              cfs_time_shift(drt));
        }     
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;When &quot;to&quot; (the recovery timeout) is clamped to obd_recovery_time_hard, the timer is still armed for (now + duration), whereas it should be armed for (recovery_start + to). I suggest the following:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        if (obd-&amp;gt;obd_recovery_timeout &amp;lt; to ||
            obd-&amp;gt;obd_recovery_timeout == obd-&amp;gt;obd_recovery_time_hard) {
                obd-&amp;gt;obd_recovery_timeout = to;
                end = obd-&amp;gt;obd_recovery_start + to;
                cfs_timer_arm(&amp;amp;obd-&amp;gt;obd_recovery_timer, end);
        }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
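
&lt;p&gt;To illustrate the difference, here is a minimal standalone sketch (not the Lustre code itself). The function names and the plain long timestamps are assumptions made for the example; all times are in seconds.&lt;/p&gt;

```c
/*
 * Expiry computed relative to the current time. If recovery began at
 * `start` and the hard limit is `time_hard`, this value can land past
 * start + time_hard.
 */
long expiry_from_now(long now, long drt)
{
        return now + drt;
}

/*
 * Expiry computed relative to the recovery start, with the timeout
 * clamped to the hard limit first. The result never exceeds
 * start + time_hard.
 */
long expiry_from_start(long start, long to, long time_hard)
{
        if (to > time_hard)
                to = time_hard;
        return start + to;
}
```

&lt;p&gt;For example, with start = 1000, time_hard = 900, now = 1800 and drt = 300, the first form gives 2100, which is 200 seconds past the hard deadline of 1900, while the second form gives 1900 for any value of to that is 900 or more.&lt;/p&gt;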

&lt;p&gt;But even if the problem above is fixed, recovery will not be aborted when recovery_timeout &amp;gt;= time_hard.&lt;br/&gt;
Perhaps we should set obd_abort_recovery to 1 when recovery_time_hard is reached.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;--- a/lustre/ldlm/ldlm_lib.c
+++ b/lustre/ldlm/ldlm_lib.c
@@ -1793,6 +1793,12 @@ static int target_recovery_overseer(struct obd_device *obd,
                                    int (*health_check)(struct obd_export *))
 {
 repeat:
+       if (cfs_time_current_sec() &amp;gt;=
+           (obd-&amp;gt;obd_recovery_start + obd-&amp;gt;obd_recovery_time_hard)) {
+               CWARN(&quot;recovery is aborted by hard timeout\n&quot;);
+               obd-&amp;gt;obd_abort_recovery = 1;
+       }
+
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
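
&lt;p&gt;A minimal standalone sketch of that check, assuming that a start time of 0 means recovery has not begun yet. The struct and function names are hypothetical stand-ins for the obd_device fields, not Lustre definitions.&lt;/p&gt;

```c
/* Hypothetical stand-in for the obd_device fields used in the patch. */
struct recovery_state {
        long recovery_start;     /* seconds; 0 is assumed to mean not started */
        long recovery_time_hard; /* hard limit on recovery, in seconds */
        int abort_recovery;      /* set to 1 once the hard limit is hit */
};

/*
 * Set the abort flag when the hard limit has been reached.
 * Returns 1 if the flag was set by this call, 0 otherwise.
 */
int check_hard_timeout(struct recovery_state *rs, long now)
{
        /* Assumption: skip the check until recovery has a start time. */
        if (rs->recovery_start == 0)
                return 0;

        if (now >= rs->recovery_start + rs->recovery_time_hard) {
                rs->abort_recovery = 1;
                return 1;
        }
        return 0;
}
```

&lt;p&gt;With recovery_start = 1000 and recovery_time_hard = 900, calling it at 1899 leaves the flag clear, and calling it at 1900 sets it.&lt;/p&gt;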

&lt;p&gt;Another problem is that server_calc_timeout overwrites obd_recovery_time_hard, so we can&apos;t use the proc interface to set recovery_time_hard.&lt;/p&gt;</description>
                <environment></environment>
        <key id="21496">LU-4119</key>
            <summary>recovery time hard doesn&apos;t limit recovery duration</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bogl">Bob Glossman</assignee>
                                    <reporter username="scherementsev">Sergey Cheremencev</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 18 Oct 2013 09:00:20 +0000</created>
                <updated>Wed, 11 Mar 2015 16:41:15 +0000</updated>
                            <resolved>Tue, 6 Jan 2015 14:59:18 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                    <version>Lustre 2.5.4</version>
                                    <fixVersion>Lustre 2.7.0</fixVersion>
                    <fixVersion>Lustre 2.5.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>9</watches>
                                                                            <comments>
                            <comment id="69324" author="sergey" created="Fri, 18 Oct 2013 19:19:39 +0000"  >&lt;p&gt;The fix above is wrong: we also need to check that recovery has started:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;--- a/lustre/ldlm/ldlm_lib.c
+++ b/lustre/ldlm/ldlm_lib.c
@@ -1793,6 +1793,11 @@ static int target_recovery_overseer(struct obd_device *obd,
                                    int (*health_check)(struct obd_export *))
 {
 repeat:
+       if ((obd-&amp;gt;obd_recovery_start != 0) &amp;amp;&amp;amp; (cfs_time_current_sec() &amp;gt;=
+             (obd-&amp;gt;obd_recovery_start + obd-&amp;gt;obd_recovery_time_hard))) {
+               CWARN(&quot;recovery is aborted by hard timeout\n&quot;);
+               obd-&amp;gt;obd_abort_recovery = 1;
+       }&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="75985" author="sergey" created="Fri, 31 Jan 2014 14:18:10 +0000"  >&lt;p&gt;Patch with fix&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/9078/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/9078/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="87282" author="cliffw" created="Mon, 23 Jun 2014 16:01:55 +0000"  >&lt;p&gt;The patch is currently failing in our testing, and there are some review comments. Is an updated patch possible?&lt;/p&gt;</comment>
                            <comment id="87305" author="sergey" created="Mon, 23 Jun 2014 19:28:13 +0000"  >&lt;p&gt;I saw that Mike Pershin gave +1 and left comments. I replied to him and also asked a question. I thought we also needed a review from Andreas Dilger to continue the process.&lt;/p&gt;

&lt;p&gt;Sorry, but I can&apos;t find the logs for the failed test. When I look at this link &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/811df926-da54-11e3-a2f8-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/811df926-da54-11e3-a2f8-52540035b04c&lt;/a&gt;, the last test name is &quot;test_53a&quot;.&lt;br/&gt;
Could you please provide a link to the failed test?&lt;br/&gt;
Sure, I will update the patch once I get the failed test logs. But I am on PTO this week and can only do it next week.&lt;/p&gt;</comment>
                            <comment id="88859" author="cliffw" created="Fri, 11 Jul 2014 19:05:10 +0000"  >&lt;p&gt;There may be some issues with getting old test logs due to the change in test systems. If you can submit an updated patch, we can retest and go from there. &lt;/p&gt;</comment>
                            <comment id="94515" author="cliffw" created="Fri, 19 Sep 2014 17:10:30 +0000"  >&lt;p&gt;Updated patch &lt;a href=&quot;http://review.whamcloud.com/#/c/9078/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/9078/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="100624" author="gerrit" created="Thu, 4 Dec 2014 02:27:27 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/9078/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9078/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4119&quot; title=&quot;recovery time hard doesn&amp;#39;t limit recovery duration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4119&quot;&gt;&lt;del&gt;LU-4119&lt;/del&gt;&lt;/a&gt; ldlm: abort recovery by time_hard&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: df89c74a320278acac7466a83393af6abd99932b&lt;/p&gt;</comment>
                            <comment id="102628" author="pjones" created="Tue, 6 Jan 2015 14:59:18 +0000"  >&lt;p&gt;Landed for 2.7&lt;/p&gt;</comment>
                            <comment id="103354" author="gerrit" created="Tue, 13 Jan 2015 18:22:47 +0000"  >&lt;p&gt;James Simmons (uja.ornl@gmail.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13381&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13381&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4119&quot; title=&quot;recovery time hard doesn&amp;#39;t limit recovery duration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4119&quot;&gt;&lt;del&gt;LU-4119&lt;/del&gt;&lt;/a&gt; ldlm: abort recovery by time_hard&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_5&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ade5834f5e1a6419cea47edcede01ca2f5ea79a0&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="26957">LU-5724</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="27800">LU-5986</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="28058">LU-6084</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw633:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11111</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>