<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:42:23 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11265] recovery-mds-scale test failover_ost fails with &apos;test_failover_ost returned 1&apos; due to mkdir failure</title>
                <link>https://jira.whamcloud.com/browse/LU-11265</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;recovery-mds-scale test_failover_ost fails, without any OSTs having been failed over, because mkdir fails. From the failover test session at &lt;a href=&quot;https://testing.whamcloud.com/test_sets/46c80c12-a1f1-11e8-a5f2-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/46c80c12-a1f1-11e8-a5f2-52540065bddc&lt;/a&gt;, we see the following at the end of the test_log:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2018-08-15 23:39:25 Terminating clients loads ...
Duration:               86400
Server failover period: 1200 seconds
Exited after:           0 seconds
Number of failovers before exit:
mds1: 0 times
ost1: 0 times
ost2: 0 times
ost3: 0 times
ost4: 0 times
ost5: 0 times
ost6: 0 times
ost7: 0 times
Status: FAIL: rc=1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;From the suite log, we see that the client load failed during the first OSS failover:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Started lustre-OST0006
==== Checking the clients loads AFTER failover -- failure NOT OK
Client load failed on node trevis-3vm7, rc=1
Client load failed during failover. Exiting...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On trevis-3vm7, the run_dd log shows that the dd load fails because its test directory path already exists (as a non-directory entry):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2018-08-15 23:37:37: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
mkdir: cannot create directory &#8216;/mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com&#8217;: File exists
+ /usr/bin/lfs setstripe -c -1 /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
+ cd /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
/usr/lib64/lustre/tests/run_dd.sh: line 34: cd: /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com: Not a directory
+ sync
++ df -P /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
++ awk &apos;/:/ { print $4 }&apos;
+ FREE_SPACE=13349248
+ BLKS=1501790
+ echoerr &apos;Total free disk space is 13349248, 4k blocks to dd is 1501790&apos;
+ echo &apos;Total free disk space is 13349248, 4k blocks to dd is 1501790&apos;
Total free disk space is 13349248, 4k blocks to dd is 1501790
+ df /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
+ dd bs=4k count=1501790 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com/dd-file
dd: failed to open &#8216;/mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com/dd-file&#8217;: Not a directory
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Looking at the run_dd log from the previous test, failover_mds, we see that the d0.dd-trevis-* directory was created and that the client load is signaled to terminate right after the call to remove the directory is issued:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;+ echo &apos;2018-08-15 23:37:07: dd succeeded&apos;
2018-08-15 23:37:07: dd succeeded
+ cd /tmp
+ rm -rf /mnt/lustre/d0.dd-trevis-3vm7.trevis.whamcloud.com
++ signaled
+++ date &apos;+%F %H:%M:%S&apos;
++ echoerr &apos;2018-08-15 23:37:41: client load was signaled to terminate&apos;
++ echo &apos;2018-08-15 23:37:41: client load was signaled to terminate&apos;
2018-08-15 23:37:41: client load was signaled to terminate
+++ ps -eo &apos;%c %p %r&apos;
+++ awk &apos;/ 18302 / {print $3}&apos;
++ local PGID=18241
++ kill -TERM -18241
++ sleep 5
++ kill -KILL -18241
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
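A minimal shell sketch of that failure mode (hypothetical, not part of the Lustre test framework; the demo path is invented): if the interrupted cleanup leaves the load directory path behind as a non-directory entry, the next run's setup fails with the same errors seen in the run_dd log above.

```shell
#!/bin/sh
# Hypothetical reproduction of the run_dd failure mode: the load
# directory path exists but is not a directory, so "mkdir -p" and
# "cd" both fail the way they do in the log above.
dir=$(mktemp -d)/d0.dd-demo
touch "$dir"            # stand-in for a leftover non-directory entry
if ! mkdir -p "$dir" 2>/dev/null; then
    echo "mkdir failed: path exists and is not a directory"
fi
cd "$dir" 2>/dev/null || echo "cd failed: not a directory"
```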

&lt;p&gt;Maybe the job was killed before the directory could be completely removed?&lt;/p&gt;</description>
                <environment></environment>
        <key id="53010">LU-11265</key>
            <summary>recovery-mds-scale test failover_ost fails with &apos;test_failover_ost returned 1&apos; due to mkdir failure</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                    </labels>
                <created>Fri, 17 Aug 2018 18:51:44 +0000</created>
                <updated>Fri, 19 Nov 2021 10:16:49 +0000</updated>
                                            <version>Lustre 2.10.5</version>
                    <version>Lustre 2.12.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="255446" author="jamesanunez" created="Thu, 26 Sep 2019 17:06:36 +0000"  >&lt;p&gt;We see this issue for recent versions of 2.12; &lt;a href=&quot;https://testing.whamcloud.com/test_sets/4d26e47c-dfec-11e9-a0ba-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/4d26e47c-dfec-11e9-a0ba-52540065bddc&lt;/a&gt;. When this happens, the test suite following recovery-mds-scale will fail, even if all of its tests pass, because it cannot remove the contents of the test directory.&lt;/p&gt;

&lt;p&gt;For example (see test session &lt;a href=&quot;https://testing.whamcloud.com/test_sessions/40ec2f5f-9e8e-45a7-97c8-a84baf2e4b55&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sessions/40ec2f5f-9e8e-45a7-97c8-a84baf2e4b55&lt;/a&gt;), we see recovery-random-scale and recovery-double-scale fail due to:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== recovery-random-scale test complete, duration 85619 sec =========================================== 11:09:20 (1569409760)
rm: cannot remove &apos;/mnt/lustre/d0.tar-trevis-40vm9.trevis.whamcloud.com/etc&apos;: Directory not empty
 recovery-random-scale test_fail_client_mds: @@@@@@ FAIL: remove sub-test dirs failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5829:error()
  = /usr/lib64/lustre/tests/test-framework.sh:5316:check_and_cleanup_lustre()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="318620" author="egryaznova" created="Fri, 19 Nov 2021 10:16:49 +0000"  >&lt;p&gt;one more:&lt;br/&gt;
&lt;a href=&quot;https://testing.whamcloud.com/test_sets/03b6d0ca-04e6-4426-a4fc-a701404fc4f7&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/03b6d0ca-04e6-4426-a4fc-a701404fc4f7&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://testing.whamcloud.com/test_logs/ae021887-19cd-45a7-818a-07cfc12c0644/show_text&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_logs/ae021887-19cd-45a7-818a-07cfc12c0644/show_text&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
+ /usr/bin/lfs mkdir -i1 -c1 /mnt/lustre/d0.tar-onyx-72vm14.onyx.whamcloud.com
lfs mkdir: dirstripe error on &lt;span class=&quot;code-quote&quot;&gt;&apos;/mnt/lustre/d0.tar-onyx-72vm14.onyx.whamcloud.com&apos;&lt;/span&gt;: stripe already set
lfs setdirstripe: cannot create dir &lt;span class=&quot;code-quote&quot;&gt;&apos;/mnt/lustre/d0.tar-onyx-72vm14.onyx.whamcloud.com&apos;&lt;/span&gt;: File exists
+ &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; 1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i000yn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>