<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:29:32 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16733] recovery-small: cannot remove &apos;/mnt/lustre/d110h.recovery-small&apos;</title>
                <link>https://jira.whamcloud.com/browse/LU-16733</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for Andreas Dilger &amp;lt;adilger@whamcloud.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/d38d511e-7fdf-4ab8-bca4-a3f9d540464f&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/d38d511e-7fdf-4ab8-bca4-a3f9d540464f&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The test session reports &quot;&lt;tt&gt;No sub tests failed in this test set.&lt;/tt&gt;&quot;&lt;/p&gt;

&lt;p&gt;Test session details:&lt;br/&gt;
clients: &lt;a href=&quot;https://build.whamcloud.com/job/lustre-reviews/93288&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.whamcloud.com/job/lustre-reviews/93288&lt;/a&gt; - 4.18.0-348.7.1.el8_5.x86_64&lt;br/&gt;
servers: &lt;a href=&quot;https://build.whamcloud.com/job/lustre-reviews/93288&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.whamcloud.com/job/lustre-reviews/93288&lt;/a&gt; - 4.18.0-348.23.1.el8_lustre.x86_64&lt;/p&gt;

&lt;p&gt;This failure has been seen on a few different patches; the test is unable to clean up at the end:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== recovery-small test complete, duration 7270 sec ======= 11:52:10 (1679917930)
rm: cannot remove &apos;/mnt/lustre/d110h.recovery-small&apos;: Input/output error
 recovery-small : @@@@@@ FAIL: remove sub-test dirs failed 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I also saw it with &lt;tt&gt;d110i.recovery-small&lt;/tt&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="75540">LU-16733</key>
            <summary>recovery-small: cannot remove &apos;/mnt/lustre/d110h.recovery-small&apos;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="flei">Feng Lei</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Wed, 12 Apr 2023 15:56:24 +0000</created>
                <updated>Tue, 20 Jun 2023 17:22:42 +0000</updated>
                            <resolved>Tue, 20 Jun 2023 17:22:42 +0000</resolved>
                                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="369458" author="adilger" created="Fri, 14 Apr 2023 02:33:42 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=flei&quot; class=&quot;user-hover&quot; rel=&quot;flei&quot;&gt;flei&lt;/a&gt;&#160;can you please check if there is some patch that landed recently that is causing this to be hit (or hit more frequently)?&lt;/p&gt;

&lt;p&gt;It looks like the first (recent) hit was&#160;2023-03-27 (ver 2.15.54.114) but on a patch that hasn&apos;t landed yet. There was also a single hit on 2023-01-19 (ver 2.15.53.56 full testing, so no patch), but it complained about d110j. The problem has definitely been hit much more frequently since 2023-04-04. &lt;a href=&quot;https://testing.whamcloud.com/search?client_branch_type_id=24a6947e-04a9-11e1-bb5f-52540025f9af&amp;amp;server_branch_type_id=24a6947e-04a9-11e1-bb5f-52540025f9af&amp;amp;query_bugs=Lu-16733&amp;amp;status%5B%5D=FAIL&amp;amp;test_set_script_id=f36cabd0-32c3-11e0-a61c-52540025f9ae&amp;amp;start_date=2023-03-27&amp;amp;end_date=2023-04-13&amp;amp;source=test_sets#redirect&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;This Maloo search shows all of the failures tagged with LU-16733&lt;/a&gt;, since it isn&apos;t otherwise possible to search for &quot;no failure&quot;, at least until patch &lt;a href=&quot;https://review.whamcloud.com/49582&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/49582&lt;/a&gt; lands.&lt;/p&gt;

&lt;p&gt;The patches landed after 2023-03-26 and before 2023-03-30 are:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# git log --after 2023-03-25 --before 2023-03-30 --oneline
7c52cbf65218 LU-16515 tests: disable sanity test_118c/118d
a7222127c7a6 LU-16642 tests: improve sanity-sec test_61
8f40a3d7110d LU-16639 misc: cleanup concole messages
e998d21caf99 LU-16589 tests: add sanity/31l to test ln command
17bbf5bdd6f9 LU-930 docs: fix whatis output
36cbba150bce LU-16632 tests: more margin of error for sanity/56xh
91a3726f313d LU-16633 obdclass: fix rpc slot leakage
12c34651994b LU-14291 batch: don&apos;t include lustre_update.h for client only builds
d5b26443a3d3 LU-16615 utils: add messages in l_getidentity
b30f825232cb LU-16601 kernel: update SLES15 SP4 [5.14.21-150400.24.46.1]
8f004bc53b1a LU-16599 obdclass: job_stats can parse escaped jobid string
fc7a0d6013b4 LU-14668 lnet: add &apos;lock_prim_nid&quot; lnet module parameter
f5293fb66e79 LU-16598 osp: cleanup comment in osp_sync.c
5e24b374f7bd LU-16595 test: save one second in wait_destroy_complete()
da230373bd14 LU-16563 lnet: use discovered ni status to set initial health
0366422cfd1e LU-16221 kernel: update RHEL 9.1 [5.14.0-162.18.1.el9_1]
2d40d96b4ec8 LU-15053 tests: reset quota if ENABLE_QUOTA=1
7e893c70955d LU-16382 build: udev files in /usr/lib
b33808d3aebb LU-16338 readahead: clip readahead with kms
ccee6b92ec4d LU-13107 utils: remove duplicate lctl erase/fork_lcfg
2471d35c0e0e LU-16217 iokit: Add lst.sh wrapper and lst-survey
bdbc7f9f42b9 LU-12805 tests: disable replay-single/36
73ee638813a8 LU-16604 kfilnd: kfilnd_peer ref leak on send
6fab1fe4a5c5 LU-9680 lnet: handle multi-rail setups
0ecb2a167c56 LU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH
c97d4cdf4dc7 LU-16629 osd: refill the existing env
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I think there are a few approaches that could be used to debug this:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;check MDS, OSS, client, test logs around test_110h/i/j to see if something unusual is happening vs. non-failing runs. This might be difficult since there will already be errors due to the test itself&lt;/li&gt;
	&lt;li&gt;review debug logs from the test failure to see why the directory could not be removed&lt;/li&gt;
	&lt;li&gt;submit &quot;bisect&quot; patches at different points in the above patch list with &lt;tt&gt;Test-Parameters:&lt;/tt&gt; lines to run recovery-small enough times to be confident whether the bug is hit or not. It failed 11/256 runs in the past week (about one failure per 23 runs), so each bisect point would need to run about twice that many times, about 46x, to be confident in the results. Each session takes about 2h to finish, so they should be run in parallel (one line of &quot;&lt;tt&gt;Test-Parameters: testlist=recovery-small mdscount=2 mtscount=4&lt;/tt&gt;&quot; per session). Since this will consume 46 test nodes per patch, it is better to do this one patch at a time, maybe more over the weekend.&lt;/li&gt;
&lt;/ul&gt;
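
&lt;p&gt;The run-count estimate in the last item can be sanity-checked with a short calculation. This is only a sketch: it assumes every recovery-small session fails independently with the observed probability of 11/256, which may not hold if failures cluster on particular nodes or patches. The helper names detection_probability and runs_for_confidence are hypothetical and are not part of any Lustre or Maloo tooling.&lt;/p&gt;

```python
import math


def detection_probability(failure_rate, runs):
    """Chance that at least one of `runs` independent sessions fails."""
    return 1.0 - (1.0 - failure_rate) ** runs


def runs_for_confidence(failure_rate, confidence):
    """Smallest run count whose detection probability reaches `confidence`."""
    if failure_rate >= 1.0 or 0.0 >= failure_rate:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if confidence >= 1.0 or 0.0 >= confidence:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= c for n, then round up to a whole run.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Observed rate quoted above: 11 failures in 256 runs (assumed stable).
observed_rate = 11 / 256

# Mean number of runs per failure, the basis of the "twice as many" estimate.
print(round(1 / observed_rate, 1))

# Probability of seeing at least one failure in 46 runs if the bug is present.
print(round(detection_probability(observed_rate, 46), 3))

# Runs needed at one bisect point for 95 percent confidence.
print(runs_for_confidence(observed_rate, 0.95))
```

&lt;p&gt;Under these assumptions, 46 passing runs give roughly 87% confidence that a bisect point does not contain the bug, and about 69 runs would be needed to reach 95%.&lt;/p&gt;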
</comment>
                            <comment id="370603" author="adilger" created="Tue, 25 Apr 2023 20:59:41 +0000"  >&lt;p&gt;&quot;Feng Lei &amp;lt;flei@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50683&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50683&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16733&quot; title=&quot;recovery-small: cannot remove &amp;#39;/mnt/lustre/d110h.recovery-small&amp;#39;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16733&quot;&gt;&lt;del&gt;LU-16733&lt;/del&gt;&lt;/a&gt; tests: wait for recovery done&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 2&lt;br/&gt;
Commit: 6b5c19493cbc8a186035f687d618003d06da0ef2&lt;/p&gt;</comment>
                            <comment id="375903" author="gerrit" created="Tue, 20 Jun 2023 03:40:57 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50683/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50683/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16733&quot; title=&quot;recovery-small: cannot remove &amp;#39;/mnt/lustre/d110h.recovery-small&amp;#39;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16733&quot;&gt;&lt;del&gt;LU-16733&lt;/del&gt;&lt;/a&gt; tests: wait for recovery done&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1512b6572e78442760e0caff50957061b2ca6617&lt;/p&gt;</comment>
                            <comment id="375994" author="pjones" created="Tue, 20 Jun 2023 17:22:42 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="75569">LU-16737</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03iq7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>