<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:00:17 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6444] Hard Failover recovery-mds-scale test_failover_ost: test_failover_ost returned 3</title>
                <link>https://jira.whamcloud.com/browse/LU-6444</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for sarah &amp;lt;sarah@whamcloud.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/83ac0620-d67e-11e4-8a24-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/83ac0620-d67e-11e4-8a24-5254006e85c2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The sub-test test_failover_ost failed with the following error:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;test_failover_ost returned 3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;MDS console&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: lustre-MDT0000: Recovery over after 0:09, of 2 clients 2 recovered and 0 were evicted.
Lustre: 2817:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1427510311/real 1427510311]  req@ffff88006b6859c0 x1496853024604548/t0(0) o8-&amp;gt;lustre-OST0002-osc-MDT0000@10.2.4.226@tcp:28/4 lens 400/544 e 0 to 1 dl 1427510316 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 2817:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1427510321/real 1427510321]  req@ffff8800722326c0 x1496853024604620/t0(0) o8-&amp;gt;lustre-OST0006-osc-MDT0000@10.2.4.226@tcp:28/4 lens 400/544 e 0 to 1 dl 1427510331 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 2817:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
LNet: Service thread pid 3181 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 3181, comm: mdt00_003

Call Trace:
 [&amp;lt;ffffffff8152b162&amp;gt;] schedule_timeout+0x192/0x2e0
 [&amp;lt;ffffffff810874f0&amp;gt;] ? process_timeout+0x0/0x10
 [&amp;lt;ffffffffa1325951&amp;gt;] osp_precreate_reserve+0x3d1/0x810 [osp]
 [&amp;lt;ffffffffa131b740&amp;gt;] ? osp_object_free+0x2d0/0x4a0 [osp]
 [&amp;lt;ffffffff81064b90&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa131fbb6&amp;gt;] osp_declare_object_create+0x1a6/0x650 [osp]
 [&amp;lt;ffffffffa125d81a&amp;gt;] lod_qos_declare_object_on+0x12a/0x4f0 [lod]
 [&amp;lt;ffffffffa12603f3&amp;gt;] lod_alloc_rr.clone.2+0xbb3/0x1020 [lod]
 [&amp;lt;ffffffffa079dfb1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [&amp;lt;ffffffffa1261442&amp;gt;] lod_qos_prep_create+0xbe2/0x19e0 [lod]
 [&amp;lt;ffffffffa1254d42&amp;gt;] lod_declare_striped_object+0x162/0x9b0 [lod]
 [&amp;lt;ffffffffa125c547&amp;gt;] lod_declare_object_create+0x2c7/0x460 [lod]
 [&amp;lt;ffffffffa12c4506&amp;gt;] mdd_declare_object_create_internal+0x116/0x340 [mdd]
 [&amp;lt;ffffffffa12bd79e&amp;gt;] mdd_create+0x68e/0x1730 [mdd]
 [&amp;lt;ffffffffa118f7b8&amp;gt;] mdo_create+0x18/0x50 [mdt]
 [&amp;lt;ffffffffa1196bbf&amp;gt;] mdt_reint_open+0x1f8f/0x2c70 [mdt]
 [&amp;lt;ffffffffa0906cbc&amp;gt;] ? upcall_cache_get_entry+0x29c/0x880 [obdclass]
 [&amp;lt;ffffffffa1180cad&amp;gt;] mdt_reint_rec+0x5d/0x200 [mdt]
 [&amp;lt;ffffffffa116511b&amp;gt;] mdt_reint_internal+0x4cb/0x7a0 [mdt]
 [&amp;lt;ffffffffa11655e6&amp;gt;] mdt_intent_reint+0x1f6/0x430 [mdt]
 [&amp;lt;ffffffffa1163bd4&amp;gt;] mdt_intent_policy+0x494/0xce0 [mdt]
 [&amp;lt;ffffffffa0abf4f9&amp;gt;] ldlm_lock_enqueue+0x129/0x9d0 [ptlrpc]
 [&amp;lt;ffffffffa0aeb52b&amp;gt;] ldlm_handle_enqueue0+0x51b/0x13f0 [ptlrpc]
 [&amp;lt;ffffffffa0b6bbb1&amp;gt;] tgt_enqueue+0x61/0x230 [ptlrpc]
 [&amp;lt;ffffffffa0b6c7fe&amp;gt;] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [&amp;lt;ffffffffa0b1c661&amp;gt;] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [&amp;lt;ffffffffa0b1b820&amp;gt;] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [&amp;lt;ffffffff8109e66e&amp;gt;] kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff8109e5d0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20

LustreError: dumping log to /tmp/lustre-log.1427510345.3181
LNet: Service thread pid 3181 completed after 80.04s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Duration:               86400
Server failover period: 1200 seconds
Exited after:           37 seconds
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>client and server: lustre-master build #2967 zfs</environment>
        <key id="29442">LU-6444</key>
            <summary>Hard Failover recovery-mds-scale test_failover_ost: test_failover_ost returned 3</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Wed, 8 Apr 2015 18:42:37 +0000</created>
                <updated>Thu, 2 Jul 2015 22:54:40 +0000</updated>
                            <resolved>Fri, 12 Jun 2015 00:15:39 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                <comments>
                            <comment id="111851" author="green" created="Thu, 9 Apr 2015 17:08:30 +0000"  >&lt;p&gt;So I see that the OST also has timeouts waiting for a transaction to commit.&lt;br/&gt;
Unfortunately there are no MDS console logs to correlate the two records, but could a slow ZFS OST be causing havoc here?&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: Skipped 2 previous similar messages
LNet: Service thread pid 11335 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 11335, comm: ll_ost00_003

Call Trace:
 [&amp;lt;ffffffffa0fe3860&amp;gt;] ? osd_write+0x220/0x390 [osd_zfs]
 [&amp;lt;ffffffff8109ed4e&amp;gt;] ? prepare_to_wait_exclusive+0x4e/0x80
 [&amp;lt;ffffffffa01455f5&amp;gt;] cv_wait_common+0x105/0x120 [spl]
 [&amp;lt;ffffffff8109eb00&amp;gt;] ? autoremove_wake_function+0x0/0x40
 [&amp;lt;ffffffffa0145665&amp;gt;] __cv_wait+0x15/0x20 [spl]
 [&amp;lt;ffffffffa024824b&amp;gt;] txg_wait_synced+0x7b/0xb0 [zfs]
 [&amp;lt;ffffffffa0fd4f35&amp;gt;] osd_trans_stop+0x415/0x4d0 [osd_zfs]
 [&amp;lt;ffffffffa1120f5f&amp;gt;] ofd_trans_stop+0x1f/0x60 [ofd]
 [&amp;lt;ffffffffa11256c4&amp;gt;] ofd_attr_set+0x304/0x7f0 [ofd]
 [&amp;lt;ffffffffa1112944&amp;gt;] ofd_setattr_hdl+0x1b4/0x9d0 [ofd]
 [&amp;lt;ffffffffa0b6c7fe&amp;gt;] tgt_request_handle+0x8be/0x1000 [ptlrpc]
 [&amp;lt;ffffffffa0b1c661&amp;gt;] ptlrpc_main+0xe41/0x1960 [ptlrpc]
 [&amp;lt;ffffffffa0b1b820&amp;gt;] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
 [&amp;lt;ffffffff8109e66e&amp;gt;] kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff8109e5d0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="117106" author="sarah" created="Mon, 1 Jun 2015 20:55:58 +0000"  >&lt;p&gt;Also seen with ldiskfs:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/f8515fe4-007b-11e5-9650-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/f8515fe4-007b-11e5-9650-5254006e85c2&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="118309" author="sarah" created="Fri, 12 Jun 2015 00:15:39 +0000"  >&lt;p&gt;dup of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6669&quot; title=&quot;Hard Failover recovery-mds-scale test_failover_mds: test_failover_mds returned 3&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6669&quot;&gt;&lt;del&gt;LU-6669&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxaev:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>