<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:05:28 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-277] Test failure on test suite replay-single</title>
                <link>https://jira.whamcloud.com/browse/LU-277</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for sarah &amp;lt;sarah@whamcloud.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/7a33c9b8-71c7-11e0-80b5-52540025f9af&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/7a33c9b8-71c7-11e0-80b5-52540025f9af&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This one looks like LU-184, which has been fixed for a while. I found a similar issue on replay-single and commented on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-184&quot; title=&quot;Test failure on test suite insanity, subtest test_0&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-184&quot;&gt;&lt;del&gt;LU-184&lt;/del&gt;&lt;/a&gt; to check whether it was the same problem. Here is the link to the comment:&lt;br/&gt;
&lt;a href=&quot;http://jira.whamcloud.com/browse/LU-184?focusedCommentId=12227&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12227&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;http://jira.whamcloud.com/browse/LU-184?focusedCommentId=12227&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12227&lt;/a&gt; &lt;/p&gt;</description>
                <environment></environment>
        <key id="10741">LU-277</key>
            <summary>Test failure on test suite replay-single</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Wed, 4 May 2011 14:30:08 +0000</created>
                <updated>Mon, 13 Jun 2011 14:21:57 +0000</updated>
                            <resolved>Mon, 13 Jun 2011 14:21:57 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="13700" author="pjones" created="Wed, 4 May 2011 14:52:39 +0000"  >&lt;p&gt;Niu&lt;/p&gt;

&lt;p&gt;As you worked previously on LU-184, could you please comment?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="13723" author="niu" created="Thu, 5 May 2011 00:54:20 +0000"  >&lt;p&gt;The failure is caused by the open replay from client-6-ib and client-21-ib. These two clients should not be involved in this test (our intention is to test client-23-ib); however, many open replay requests were kept on the other two clients. These open replays participated in the recovery and resulted in the test failure at the end.&lt;/p&gt;

&lt;p&gt;I guess these open replays come from previous tests, maybe runracer. (runracer is before replay_single? Sarah, please correct me if I&apos;m wrong) Will look into the runracer test to see if there is anything wrong in the script.&lt;/p&gt;

&lt;p&gt;How can the open replay fail? One possible reason occurs to me: we often set the MDS as read-only in replay_single tests, so the open_create is sometimes not committed to disk. In test_20b, client-23-ib was evicted by the MDS, so some open_create replays from this client will be lost, and the open replays to the same file from the other two clients will fail with ENOENT.&lt;/p&gt;</comment>
                            <comment id="13972" author="niu" created="Sun, 8 May 2011 20:50:23 +0000"  >&lt;p&gt;As shown on the maloo system, runracer ran just before this replay-single test: this replay-single test started on 2011-04-28 11:25:09 UTC, and a runracer started on 2011-04-28 11:21:52 UTC and lasted 197 seconds.&lt;/p&gt;

&lt;p&gt;So I strongly suspect that there is something wrong in the runracer script, which caused some racer test threads not to be terminated properly. (As I mentioned in my previous comment, since the replay-single tests often run replay barrier, these unexpected racer test threads could result in recovery failure.)&lt;/p&gt;

&lt;p&gt;The following line of runracer confused me:&lt;br/&gt;
running=$(do_nodes $clients &quot;ps uax | grep $RDIR &quot; | egrep -v &quot;(acceptance|grep|pdsh|bash)&quot; || true)&lt;br/&gt;
I don&apos;t see why &apos;acceptance&apos; should be matched; it might be the culprit.&lt;/p&gt;

&lt;p&gt;Hi, Sarah&lt;br/&gt;
Since I don&apos;t have any reserved nodes, could you help me run the following test?&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Run runracer with auster. (on two or three clients, client-23-ib, client-21-ib and client-6-ib for instance)&lt;/li&gt;
	&lt;li&gt;While test is running, get the output of &quot;do_nodes $clients &quot;ps uax | grep $RDIR &quot;&lt;/li&gt;
	&lt;li&gt;While test is running, get the output of &quot;do_nodes $clients &quot;ps uax | grep $RDIR &quot; | egrep -v &quot;(acceptance|grep|pdsh|bash)&quot;&lt;br/&gt;
I want to see if the criteria for finding racer test threads are still valid for auster. Thanks.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="14077" author="niu" created="Tue, 10 May 2011 00:35:46 +0000"  >&lt;p&gt;Well, I finally got 3 nodes on Toro today, and after several runs of runracer, I found that pdsh often returns errors like:&lt;/p&gt;

&lt;p&gt;pdsh@client-16-ib: client-17: read: protocol failure: Connection reset by peer&lt;br/&gt;
or&lt;br/&gt;
pdsh@client-16-ib: client-17: rcmd: xpoll (setting up stderr): Interrupted system call&lt;/p&gt;

&lt;p&gt;So when such an error happens, the script incorrectly concludes from the following check that no racer threads are running:&lt;br/&gt;
running=$(do_nodes $clients &quot;ps uax | grep $RDIR &quot; | egrep -v &quot;(acceptance|grep|pdsh|bash)&quot; || true)&lt;/p&gt;

&lt;p&gt;Will try to come up with a patch to deal with the pdsh errors in the script.&lt;/p&gt;</comment>
                            <comment id="14145" author="niu" created="Wed, 11 May 2011 02:38:43 +0000"  >&lt;p&gt;I tried to reproduce this bug by running &quot;runracer + replay-single 20&quot; over three clients many times, but never hit it, so I can&apos;t confirm whether the open replays come from runracer.&lt;/p&gt;

&lt;p&gt;Hi Sarah, how often did you encounter this bug? If there isn&apos;t any reproducer and it has only been seen a few times, I suggest we put the investigation on hold until it becomes a real issue. Thanks.&lt;/p&gt;</comment>
                            <comment id="16095" author="pjones" created="Mon, 13 Jun 2011 14:21:57 +0000"  >&lt;p&gt;Reopen if reoccurs&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw17j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10277</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>