<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:47:00 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11795] replay-vbr test 8b fails with &apos;Restart of mds1 failed!&apos;</title>
                <link>https://jira.whamcloud.com/browse/LU-11795</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;replay-vbr test_8b fails with &apos;Restart of mds1 failed!&apos;. So far, this test has only failed once; &lt;a href=&quot;https://testing.whamcloud.com/test_sets/4fca0808-fd1b-11e8-8512-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/4fca0808-fd1b-11e8-8512-52540065bddc&lt;/a&gt; . &lt;/p&gt;

&lt;p&gt;Looking at the client test_log, we see the MDS has problems&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mount facets: mds1
CMD: trevis-16vm8 dmsetup status /dev/mapper/mds1_flakey &amp;gt;/dev/null 2&amp;gt;&amp;amp;1
CMD: trevis-16vm8 test -b /dev/lvm-Role_MDS/P1
CMD: trevis-16vm8 loop_dev=\$(losetup -j /dev/lvm-Role_MDS/P1 | cut -d : -f 1);
			 if [[ -z \$loop_dev ]]; then
				loop_dev=\$(losetup -f);
				losetup \$loop_dev /dev/lvm-Role_MDS/P1 || loop_dev=;
			 fi;
			 echo -n \$loop_dev
trevis-16vm8: losetup: /dev/lvm-Role_MDS/P1: failed to set up loop device: No such file or directory
CMD: trevis-16vm8 test -b /dev/lvm-Role_MDS/P1
CMD: trevis-16vm8 e2label /dev/lvm-Role_MDS/P1
trevis-16vm8: e2label: No such file or directory while trying to open /dev/lvm-Role_MDS/P1
trevis-16vm8: Couldn&apos;t find valid filesystem superblock.
Starting mds1:   -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
CMD: trevis-16vm8 mkdir -p /mnt/lustre-mds1; mount -t lustre   -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
trevis-16vm8: mount: /dev/lvm-Role_MDS/P1: failed to setup loop device: No such file or directory
Start of /dev/lvm-Role_MDS/P1 on mds1 failed 32
 replay-vbr test_8b: @@@@@@ FAIL: Restart of mds1 failed! 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Looking at the MDS1 (vm8) console log, we see replay-vbr test 8a start, MDS1 disconnect, and some stack traces (possibly from the replay-vbr test_8c hang); the next Lustre test script output is for replay-single test 0a. The MDS2 (vm7) console log shows similar content.&lt;/p&gt;

&lt;p&gt;In the dmesg log for the OSS (vm5), we see some errors for test 8b&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 8057.763283] Lustre: DEBUG MARKER: == replay-vbr test 8b: create | unlink, create shouldn&apos;t fail ======================================== 17:11:00 (1544490660)
[ 8058.291385] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-16vm3: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 8058.489932] Lustre: DEBUG MARKER: trevis-16vm3: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 8067.329126] LNetError: 6975:0:(socklnd.c:1679:ksocknal_destroy_conn()) Completing partial receive from 12345-10.9.4.191@tcp[1], ip 10.9.4.191:7988, with error, wanted: 152, left: 152, last alive is 5 secs ago
[ 8067.330996] LustreError: 6975:0:(events.c:305:request_in_callback()) event type 2, status -5, service ost
[ 8067.331917] LustreError: 26344:0:(pack_generic.c:590:__lustre_unpack_msg()) message length 0 too small for magic/version check
[ 8067.332999] LustreError: 26344:0:(sec.c:2068:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.9.4.191@tcp x1619515714571600
[ 8077.892741] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-vbr test_8b: @@@@@@ FAIL: Restart of mds1 failed! 
[ 8078.081333] Lustre: DEBUG MARKER: replay-vbr test_8b: @@@@@@ FAIL: Restart of mds1 failed!
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="54322">LU-11795</key>
            <summary>replay-vbr test 8b fails with &apos;Restart of mds1 failed!&apos;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                            <label>failover</label>
                    </labels>
                <created>Mon, 17 Dec 2018 18:00:01 +0000</created>
                <updated>Thu, 8 Oct 2020 03:03:29 +0000</updated>
                    <version>Lustre 2.12.0</version>
                    <version>Lustre 2.13.0</version>
                    <version>Lustre 2.12.1</version>
                    <version>Lustre 2.12.4</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                <comments>
                            <comment id="246376" author="jamesanunez" created="Thu, 25 Apr 2019 20:13:08 +0000"  >&lt;p&gt;We&apos;re seeing a very similar issue with not being able to start an OSS during recovery-mds-scale test_failover_ost; &lt;a href=&quot;https://testing.whamcloud.com/test_sets/a2fd473a-6632-11e9-bd0e-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/a2fd473a-6632-11e9-bd0e-52540065bddc&lt;/a&gt; . &lt;/p&gt;

&lt;p&gt;From the suite_log, we see&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;trevis-34vm6: losetup: /dev/lvm-Role_OSS/P1: failed to set up loop device: No such file or directory
CMD: trevis-34vm6 test -b /dev/lvm-Role_OSS/P1
CMD: trevis-34vm6 e2label /dev/lvm-Role_OSS/P1
trevis-34vm6: e2label: No such file or directory while trying to open /dev/lvm-Role_OSS/P1
trevis-34vm6: Couldn&apos;t find valid filesystem superblock.
Starting ost1:   -o loop /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
CMD: trevis-34vm6 mkdir -p /mnt/lustre-ost1; mount -t lustre   -o loop /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
trevis-34vm6: mount: /dev/lvm-Role_OSS/P1: failed to setup loop device: No such file or directory
Start of /dev/lvm-Role_OSS/P1 on ost1 failed 32
 recovery-mds-scale test_failover_ost: @@@@@@ FAIL: Restart of ost1 failed! 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="256581" author="jamesanunez" created="Thu, 17 Oct 2019 19:20:33 +0000"  >&lt;p&gt;We&apos;re seeing the same errors in non-failover testing as in all tests failing for performance-sanity; &lt;a href=&quot;https://testing.whamcloud.com/test_sets/c9195b96-e591-11e9-9874-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/c9195b96-e591-11e9-9874-52540065bddc&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="46849">LU-9707</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0087b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>