<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:36:07 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10553] d23b.replay-dual: Directory not empty, FAIL: remove sub-test dirs failed</title>
                <link>https://jira.whamcloud.com/browse/LU-10553</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for Cliff White &amp;lt;cliff.white@intel.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;/p&gt;

&lt;p&gt;On multiple runs, we see permission errors when cleaning up after the test. The files shown in the error report appear to be artifacts from previous (replay-*) tests. &lt;br/&gt;
Examples:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;== sanity-pfl test complete, duration 777 sec ======================================================== 02:29:35 (1516357775)
rm: cannot remove &lt;span class=&quot;code-quote&quot;&gt;&apos;/mnt/lustre/d23b.replay-dual&apos;&lt;/span&gt;: Directory not empty
....
== sanity-pfl test complete, duration 773 sec ======================================================== 00:38:53 (1516437533)
rm: cannot remove &lt;span class=&quot;code-quote&quot;&gt;&apos;/mnt/lustre/f4h.replay-vbr&apos;&lt;/span&gt;: Operation not permitted
 sanity-pfl : @@@@@@ FAIL: remove sub-test dirs failed 
...
== sanity-pfl test complete, duration 773 sec ======================================================== 12:42:04 (1516567324)
rm: cannot remove &lt;span class=&quot;code-quote&quot;&gt;&apos;/mnt/lustre/f4h.replay-vbr&apos;&lt;/span&gt;: Operation not permitted
 sanity-pfl : @@@@@@ FAIL: remove sub-test dirs failed 
...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="50363">LU-10553</key>
            <summary>d23b.replay-dual: Directory not empty, FAIL: remove sub-test dirs failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                            <label>easy</label>
                            <label>tests</label>
                    </labels>
                <created>Tue, 23 Jan 2018 20:54:02 +0000</created>
                <updated>Mon, 27 Mar 2023 04:12:55 +0000</updated>
                                            <version>Lustre 2.11.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="218969" author="adilger" created="Wed, 24 Jan 2018 03:57:36 +0000"  >&lt;p&gt;Typically, scripts like sanity.sh will clean up test files at the start to avoid issues like this. Also, most test scripts should only be accessing files that they created, so there may be some cleanup work needed in these sanity-pfl tests. &lt;/p&gt;</comment>
                            <comment id="219394" author="jamesanunez" created="Mon, 29 Jan 2018 22:12:52 +0000"  >&lt;p&gt;I started to open a new ticket until I saw this ticket. Here is just a little more detail on what we see in Maloo for these failed test sessions. &lt;/p&gt;

&lt;p&gt;Lustre test suites fail because &#8220;rm: cannot remove &apos;/mnt/lustre/f4h.replay-vbr&apos;: Operation not permitted&#8221;&lt;/p&gt;

&lt;p&gt;We have many cases where a Lustre test suite has a FAIL status but, when you look at the subtests, all of them PASS. At the end of the suite_log for the failed test suite, you will see an error from when the suite tries to clean up the file system. For example (&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/14461086-0359-11e8-bd00-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/14461086-0359-11e8-bd00-52540065bddc&lt;/a&gt;), a recent insanity test suite failure:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== insanity test complete, duration 1255 sec ========================================================= 19:55:20 (1517025320)
rm: cannot remove &apos;/mnt/lustre/f4h.replay-vbr&apos;: Operation not permitted
 insanity : @@@@@@ FAIL: remove sub-test dirs failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5336:error()
  = /usr/lib64/lustre/tests/test-framework.sh:4830:check_and_cleanup_lustre()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the example above, we can&#8217;t clean up (rm) the files in the file system because a file remains. Yet, I don&#8217;t know why we would get an &#8220;Operation not permitted&#8221; when trying to delete a file. When one test suite completes and another starts, there should not be any tasks running from previous test suites. The solution may be related/similar to the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6609&quot; title=&quot;recovery-small test_26a : FAIL: remove sub-test dirs failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6609&quot;&gt;&lt;del&gt;LU-6609&lt;/del&gt;&lt;/a&gt;; &lt;a href=&quot;https://review.whamcloud.com/#/c/14843&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/14843&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;In the same test session referenced above, sanity-quota cannot clean up the file system, and sanity-pfl, lustre-rsync-test, metadata-updates, ost-pools, mds-survey, performance-sanity, parallel-scale, large-scale, and obdfilter-survey fail due to the f4h.replay-vbr file. &lt;/p&gt;

&lt;p&gt;Looking at the replay-vbr results, we see that replay-vbr test 4h did fail and, in the replay-vbr suite_log, that replay-vbr couldn&#8217;t remove that file:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== replay-vbr test complete, duration 890 sec ======================================================== 19:34:23 (1517024063)
replay-vbr: FAIL: test_1b trevis-7vm9 not evicted
replay-vbr: FAIL: test_2b trevis-7vm9 not evicted
replay-vbr: FAIL: test_3b trevis-7vm9 not evicted
replay-vbr: FAIL: test_4c trevis-7vm9 not evicted
replay-vbr: FAIL: test_4d trevis-7vm9 not evicted
replay-vbr: FAIL: test_4f trevis-7vm9 not evicted
replay-vbr: FAIL: test_4h trevis-7vm9 not evicted
replay-vbr: FAIL: test_5b trevis-7vm9 not evicted
replay-vbr: FAIL: test_5c trevis-7vm9 not evicted
replay-vbr: FAIL: test_6c trevis-7vm9 not evicted
replay-vbr: FAIL: test_6d trevis-7vm9 not evicted
replay-vbr: FAIL: test_7a Test 7a.1 failed
replay-vbr: FAIL: test_7b Test 7b.1 failed
replay-vbr: FAIL: test_7c Test 7c.1 failed
replay-vbr: FAIL: test_7e Test 7e.1 failed
replay-vbr: FAIL: test_7f Test 7f.1 failed
replay-vbr: FAIL: test_7h Test 7h.1 failed
replay-vbr: FAIL: test_7i Test 7i.1 failed
replay-vbr: FAIL: test_10b trevis-7vm9:/mnt/lustre not evicted
replay-vbr: FAIL: test_12a test_12a failed with 4
rm: cannot remove &apos;/mnt/lustre/f4h.replay-vbr&apos;: Operation not permitted
 replay-vbr : @@@@@@ FAIL: remove sub-test dirs failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5336:error()
  = /usr/lib64/lustre/tests/test-framework.sh:4830:check_and_cleanup_lustre()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Lustre test suites fail because &#8220;rm: cannot remove &apos;/mnt/lustre/d81d.replay-single&apos;: Directory not empty&#8221;&lt;/p&gt;

&lt;p&gt;We have many cases where a Lustre test suite has a FAIL status but, when you look at the subtests, all of them PASS. At the end of the suite_log for the failed test suite, you will see an error from when the suite tries to clean up the file system. For example (&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/294da78c-0363-11e8-a10a-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/294da78c-0363-11e8-a10a-52540065bddc&lt;/a&gt;), a recent insanity test suite failure:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== insanity test complete, duration 2289 sec ========================================================= 20:59:08 (1517029148)
rm: cannot remove &apos;/mnt/lustre/d81d.replay-single&apos;: Directory not empty
 insanity : @@@@@@ FAIL: remove sub-test dirs failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5336:error()
  = /usr/lib64/lustre/tests/test-framework.sh:4830:check_and_cleanup_lustre()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Looking at replay-single, test 81d does fail and we get the same error message when trying to clean up the file system at the end of the test suite:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== replay-single test complete, duration 7320 sec ==================================================== 20:20:35 (1517026835)
replay-single: FAIL: test_0c File exists and it shouldn&apos;t
replay-single: FAIL: test_44c unliked after fail abort
replay-single: FAIL: test_80d /usr/bin/lfs getstripe -M /mnt/lustre/d80d.replay-single/remote_dir failed
replay-single: FAIL: test_81d rmdir failed
replay-single: FAIL: test_120 dir-0 still exists
rm: cannot remove &apos;/mnt/lustre/d81d.replay-single&apos;: Directory not empty
 replay-single : @@@@@@ FAIL: remove sub-test dirs failed 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the same test session referenced above, recovery-small, replay-ost-single, replay-dual, replay-vbr, sanity-quota, sanity-pfl, lustre-rsync-test, metadata-updates, ost-pools, mds-survey, performance-sanity, parallel-scale, large-scale, and obdfilter-survey are all unable to remove the f4h.replay-vbr file, and some of those tests fail solely because of this. &lt;/p&gt;

</comment>
                            <comment id="220106" author="yujian" created="Tue, 6 Feb 2018 06:47:36 +0000"  >&lt;p&gt;After replay-dual test 23b failed as follows:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;CMD: onyx-40vm2 mkdir /mnt/lustre2/d23b.replay-dual/remote_dir
onyx-40vm2: mkdir: cannot create directory `/mnt/lustre2/d23b.replay-dual/remote_dir&apos;: File exists
 replay-dual test_23b: @@@@@@ FAIL: Remote creation failed 1 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The following test suites failed with:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;rm: cannot remove `/mnt/lustre/d23b.replay-dual&apos;: Directory not empty
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://testing.hpdd.intel.com/test_sessions/b7b66042-d0a6-4a0a-8c96-d3dc4110333f&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sessions/b7b66042-d0a6-4a0a-8c96-d3dc4110333f&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="233565" author="adilger" created="Sat, 15 Sep 2018 00:51:13 +0000"  >&lt;p&gt;I guess we need to update the test suite to reboot (reformat?) in such cases, so at least we only have one test script failing instead of a whole series.&lt;/p&gt;</comment>
                            <comment id="258251" author="gerrit" created="Wed, 13 Nov 2019 18:41:54 +0000"  >&lt;p&gt;James Nunez (jnunez@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36747&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36747&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10553&quot; title=&quot;d23b.replay-dual: Directory not empty, FAIL: remove sub-test dirs failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10553&quot;&gt;LU-10553&lt;/a&gt; tests: create and cleanup test specific working dir&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b3a63eb54d40d414be2d03303ce63ab4832440cc&lt;/p&gt;</comment>
                            <comment id="258330" author="adilger" created="Thu, 14 Nov 2019 21:57:37 +0000"  >&lt;p&gt;I don&apos;t think this is a problem of test-framework.sh not trying to delete the test directories, but rather a defect in Lustre/DNE where the directory simply &lt;b&gt;cannot&lt;/b&gt; be deleted because it contains a file that is not visible on the client for some reason.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="69514">LU-15710</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="47648">LU-9827</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzrkn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>