<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:57:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12949] extend_recovery_timer assertion</title>
                <link>https://jira.whamcloud.com/browse/LU-12949</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The sequence of events was as follows:&lt;br/&gt;
1. MDT0 started recovery and was waiting for the first client to connect&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;18123.755404&amp;#93;&lt;/span&gt;&#160;Lustre: testfs-MDT0000: in recovery but waiting for the first client to connect&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;2. It was also trying to communicate with MDT1 to get logs&lt;br/&gt;
3. Failover of MDT0 was started&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;18291.574217&amp;#93;&lt;/span&gt;&#160;Lustre: Failing over testfs-MDT0000&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;4. The lod thread (which communicates with MDT1) saw obd_stopping and stopped with EIO (-5)&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;18291.594477&amp;#93;&lt;/span&gt;&#160;LustreError: 3215:0:(lod_dev.c:434:lod_sub_recovery_thread()) testfs-MDT0001-osp-MDT0000 get update log failed: rc = -5&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;5. The recovery thread called check_for_recovery_ready() and hit an assertion because it believed a client was connected but found no recovery start timestamp (obd_recovery_start) for the obd.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 985.709865] Lustre: Failing over lustre-MDT0000
[ 990.467906] LustreError: 9090:0:(ldlm_lib.c:1754:extend_recovery_timer()) ASSERTION( obd-&amp;gt;obd_recovery_start != 0 ) failed:
[ 990.469985] LustreError: 9090:0:(ldlm_lib.c:1754:extend_recovery_timer()) LBUG
[ 990.471056] Pid: 9090, comm: tgt_recover_0 3.10.0-693.21.1.x3.1.9.x86_64 #1 SMP Tue Jun 26 09:38:31 PDT 2018
[ 990.471059] Call Trace:
[ 990.471105] [&amp;lt;ffffffff8103a212&amp;gt;] save_stack_trace_tsk+0x22/0x40
[ 990.471115] [&amp;lt;ffffffffc062c7cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 990.471130] [&amp;lt;ffffffffc062c87c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 990.471143] [&amp;lt;ffffffffc0aa70d9&amp;gt;] extend_recovery_timer+0x2a9/0x2c0 [ptlrpc]
[ 990.471205] [&amp;lt;ffffffffc0aa7c54&amp;gt;] check_for_recovery_ready+0xa4/0x1f0 [ptlrpc]
[ 990.471266] [&amp;lt;ffffffffc0aa969b&amp;gt;] target_recovery_overseer+0x26b/0x6f0 [ptlrpc]
[ 990.471326] [&amp;lt;ffffffffc0ab1f8c&amp;gt;] target_recovery_thread+0x68c/0x11d0 [ptlrpc]
[ 990.471385] [&amp;lt;ffffffff810b4031&amp;gt;] kthread+0xd1/0xe0
[ 990.471391] [&amp;lt;ffffffff816c1577&amp;gt;] ret_from_fork+0x77/0xb0
[ 990.471398] [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
 &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;While working on a reproducer, I found the conditions required to trigger this bug:&lt;/p&gt;

&lt;p&gt;1) The MDT must have only LWP clients during the recovery phase&lt;/p&gt;

&lt;p&gt;2) The MDT must get an error from the MDT-MDT log update&lt;/p&gt;

&lt;p&gt;3) The 60-second timer must wake up the recovery thread to call check_for_recovery_ready(), and just before that, umount must invalidate the import to produce stale clients&lt;/p&gt;</description>
                <environment></environment>
        <key id="57332">LU-12949</key>
            <summary>extend_recovery_timer assertion</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="aboyko">Alexander Boyko</assignee>
                                    <reporter username="aboyko">Alexander Boyko</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Thu, 7 Nov 2019 08:57:49 +0000</created>
                <updated>Mon, 22 Aug 2022 14:57:11 +0000</updated>
                            <resolved>Sat, 14 Dec 2019 13:43:24 +0000</resolved>
                                                    <fixVersion>Lustre 2.14.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="257909" author="aboyko" created="Thu, 7 Nov 2019 11:35:03 +0000"  >&lt;p&gt;I&apos;ve pushed a fix &lt;a href=&quot;https://review.whamcloud.com/#/c/36703&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/36703&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="259843" author="gerrit" created="Sat, 14 Dec 2019 05:57:12 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36703/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36703/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12949&quot; title=&quot;extend_recovery_timer assertion&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12949&quot;&gt;&lt;del&gt;LU-12949&lt;/del&gt;&lt;/a&gt; obdclass: don&apos;t extend timer if obd stops&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: bc871f8ff53068bfe69ad7653479b42e6a6d2d93&lt;/p&gt;</comment>
                            <comment id="259880" author="pjones" created="Sat, 14 Dec 2019 13:43:24 +0000"  >&lt;p&gt;Landed for 2.14&lt;/p&gt;</comment>
                            <comment id="293829" author="gerrit" created="Wed, 3 Mar 2021 21:20:45 +0000"  >&lt;p&gt;Gian-Carlo DeFazio (defazio1@llnl.gov) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/41870&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41870&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12949&quot; title=&quot;extend_recovery_timer assertion&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12949&quot;&gt;&lt;del&gt;LU-12949&lt;/del&gt;&lt;/a&gt; obdclass: run recovery-small test_138 on branch 2_12&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b64c63c248e7c518ff9d49cd0b458dd0584e523b&lt;/p&gt;</comment>
                            <comment id="344240" author="gerrit" created="Mon, 22 Aug 2022 14:57:11 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/48283&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/48283&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12949&quot; title=&quot;extend_recovery_timer assertion&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12949&quot;&gt;&lt;del&gt;LU-12949&lt;/del&gt;&lt;/a&gt; obdclass: don&apos;t extend timer if obd stops&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d7619bd1049c5891632f1f532fd21869efb2f39b&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00p4v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>