<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:55:29 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12769] replay-dual test 0b hangs in client mount</title>
                <link>https://jira.whamcloud.com/browse/LU-12769</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;replay-dual test_0b hangs when the client tries to mount the file system. Looking at the logs at &lt;a href=&quot;https://testing.whamcloud.com/test_sets/2165cd40-d70d-11e9-9fc9-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/2165cd40-d70d-11e9-9fc9-52540065bddc&lt;/a&gt;, the last lines seen in the suite_log are:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Started lustre-MDT0000
Starting client: trevis-16vm11.trevis.whamcloud.com:  -o user_xattr,flock trevis-20vm9@tcp:/lustre /mnt/lustre
CMD: trevis-16vm11.trevis.whamcloud.com mkdir -p /mnt/lustre
CMD: trevis-16vm11.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-20vm9@tcp:/lustre /mnt/lustre
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;On the MDS (vm11), we see the following in the console log:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 1298.607167] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2&amp;gt;/dev/null
[ 1309.346832] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:43
[ 1314.381684] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:38
[ 1314.383960] Lustre: Skipped 1 previous similar message
[ 1319.501601] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:33
[ 1324.621716] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:28
[ 1329.741784] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:23
[ 1339.982308] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 0 evicted) to recover in 0:12
[ 1339.984388] Lustre: Skipped 1 previous similar message
[ 1352.877227] Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
[ 1352.878085] Lustre: lustre-MDT0000: disconnecting 1 stale clients
[ 1360.462321] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) to recover in 0:27
[ 1360.464241] Lustre: Skipped 3 previous similar messages
[ 1396.303505] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 0:07
[ 1396.305383] Lustre: Skipped 6 previous similar messages
[ 1462.863612] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 1:14
[ 1462.865659] Lustre: Skipped 12 previous similar messages
[ 1590.865592] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 3:22
[ 1590.867668] Lustre: Skipped 24 previous similar messages
[ 1846.869597] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 7:38
[ 1846.871619] Lustre: Skipped 49 previous similar messages
[ 2358.877590] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 16:10
[ 2358.879608] Lustre: Skipped 99 previous similar messages
[ 2963.046705] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 26:14
[ 2963.048899] Lustre: Skipped 117 previous similar messages
[ 3567.216150] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 36:18
[ 3567.218114] Lustre: Skipped 117 previous similar messages
[ 4171.385464] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 46:23
[ 4171.387400] Lustre: Skipped 117 previous similar messages
[ 4775.554195] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 56:27
[ 4775.556078] Lustre: Skipped 117 previous similar messages
[ 5379.723493] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 66:31
[ 5379.725448] Lustre: Skipped 117 previous similar messages
[ 5983.892872] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 76:35
[ 5983.894832] Lustre: Skipped 117 previous similar messages
[ 6588.062279] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 86:39
[ 6588.064310] Lustre: Skipped 117 previous similar messages
[ 7192.231245] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 96:43
[ 7192.233232] Lustre: Skipped 117 previous similar messages
[ 7796.400758] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 106:48
[ 7796.403030] Lustre: Skipped 117 previous similar messages
[ 8400.570060] Lustre: lustre-MDT0000: Denying connection for new client 26b5e8fc-98f4-4 (at 10.9.4.195@tcp), waiting for 4 known clients (0 recovered, 3 in progress, and 1 evicted) already passed deadline 116:52
[ 8400.572101] Lustre: Skipped 117 previous similar messages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On the OSS (vm4) console log, we see:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[16003.344264] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-20vm9.trevis.whamcloud.com: executing set_default_debug -1 all 4
[16003.724818] Lustre: DEBUG MARKER: trevis-20vm9.trevis.whamcloud.com: executing set_default_debug -1 all 4
[16009.720340] Lustre: Evicted from MGS (at 10.9.4.245@tcp) after server handle changed from 0xb142ad53c4c0bc64 to 0xb142ad53c4c0d363
[16014.732310] LustreError: 10096:0:(ldlm_resource.c:1147:ldlm_resource_complain()) lustre-MDT0000-lwp-OST0001: namespace resource [0x200000006:0x20000:0x0].0x0 (ffff8f95b862c0c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
[16014.735727] LustreError: 10096:0:(ldlm_resource.c:1147:ldlm_resource_complain()) Skipped 1 previous similar message
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This may be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11038&quot; title=&quot;replay-dual test_26: MDS crash with BUG: unable to handle kernel NULL pointer dereference &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11038&quot;&gt;LU-11038&lt;/a&gt; and/or &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12340&quot; title=&quot;replay-dual test 0b timeouts&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12340&quot;&gt;&lt;del&gt;LU-12340&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="56919">LU-12769</key>
            <summary>replay-dual test 0b hangs in client mount</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                    </labels>
                <created>Mon, 16 Sep 2019 19:11:43 +0000</created>
                <updated>Fri, 3 Jan 2020 23:52:20 +0000</updated>
                            <resolved>Wed, 9 Oct 2019 22:43:45 +0000</resolved>
                                    <version>Lustre 2.13.0</version>
                    <version>Lustre 2.12.4</version>
                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="255032" author="bzzz" created="Thu, 19 Sep 2019 08:44:35 +0000"  >&lt;p&gt;In my local testing (single VM), replay-dual 0a times out very frequently.&lt;/p&gt;</comment>
                            <comment id="255253" author="gerrit" created="Mon, 23 Sep 2019 12:21:44 +0000"  >&lt;p&gt;Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36274&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36274&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12769&quot; title=&quot;replay-dual test 0b hangs in client mount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12769&quot;&gt;&lt;del&gt;LU-12769&lt;/del&gt;&lt;/a&gt; recovery: restart recovery timer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: dd856535f4b98f4edee42a504c62a2766c1557c6&lt;/p&gt;</comment>
                            <comment id="255266" author="adilger" created="Mon, 23 Sep 2019 15:34:46 +0000"  >&lt;p&gt;The broken code fixed by Alex&apos;s patch was introduced by commit v2_12_53-53-g9334f1d512 (patch &lt;a href=&quot;https://review.whamcloud.com/34710&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34710&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11771&quot; title=&quot;bad output in target_handle_reconnect: Recovery already passed deadline 71578:57&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11771&quot;&gt;&lt;del&gt;LU-11771&lt;/del&gt;&lt;/a&gt; ldlm: use hrtimer for recovery to fix timeout messages&lt;/tt&gt;&quot;), so it needs to be fixed for 2.13.&lt;/p&gt;</comment>
                            <comment id="255268" author="simmonsja" created="Mon, 23 Sep 2019 16:26:38 +0000"  >&lt;p&gt;Actually this was broken before &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11771&quot; title=&quot;bad output in target_handle_reconnect: Recovery already passed deadline 71578:57&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11771&quot;&gt;&lt;del&gt;LU-11771&lt;/del&gt;&lt;/a&gt;. This is a duplicate of&#160;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11762&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;https://jira.whamcloud.com/browse/LU-11762&lt;/a&gt;. The landing of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11771&quot; title=&quot;bad output in target_handle_reconnect: Recovery already passed deadline 71578:57&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11771&quot;&gt;&lt;del&gt;LU-11771&lt;/del&gt;&lt;/a&gt; reduced how often this bug showed up to the point I couldn&apos;t reproduce it to fix it. It might be good to land&#160;&lt;a href=&quot;https://review.whamcloud.com/#/c/35627/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/35627&lt;/a&gt;&#160;as well. We will need this for 2.12 as well.&lt;/p&gt;</comment>
                            <comment id="255690" author="simmonsja" created="Tue, 1 Oct 2019 00:54:00 +0000"  >&lt;p&gt;After talking with Alex, this appears to be a clock drift problem, which would explain why older tickets similar to this one exist: jiffies experiences clock drift as well. With the hrtimer, the clock drift appears to happen only when using the high resolution wall clock on VMs. Using the high resolution monotonic clock appears to avoid this problem.&lt;/p&gt;</comment>
                            <comment id="256152" author="gerrit" created="Wed, 9 Oct 2019 22:35:31 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36274/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36274/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12769&quot; title=&quot;replay-dual test 0b hangs in client mount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12769&quot;&gt;&lt;del&gt;LU-12769&lt;/del&gt;&lt;/a&gt; recovery: use monotonic timer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 06408a4ef381121fa58783026a0cf0a6b0fa479c&lt;/p&gt;</comment>
                            <comment id="256159" author="pjones" created="Wed, 9 Oct 2019 22:43:45 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="259237" author="gerrit" created="Thu, 5 Dec 2019 19:06:57 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36937&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36937&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12769&quot; title=&quot;replay-dual test 0b hangs in client mount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12769&quot;&gt;&lt;del&gt;LU-12769&lt;/del&gt;&lt;/a&gt; recovery: use monotonic timer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e46386159b7afe349bd90e845d69adafb1ab62fc&lt;/p&gt;</comment>
                            <comment id="260581" author="gerrit" created="Fri, 3 Jan 2020 23:41:43 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36937/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36937/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12769&quot; title=&quot;replay-dual test 0b hangs in client mount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12769&quot;&gt;&lt;del&gt;LU-12769&lt;/del&gt;&lt;/a&gt; recovery: use monotonic timer&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: cf4cdefec8145b2b7a2eb4c2de4e1e08ebc862b3&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="54262">LU-11762</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="54282">LU-11771</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="51976">LU-10950</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="55753">LU-12340</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00mtb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>