<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:05:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
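For instance, assuming the standard JIRA issue-xml view URL (the exact path is not stated in this file), a request for just those two fields might look like:
https://jira.whamcloud.com/si/jira.issueviews:issue-xml/LU-315/LU-315.xml?field=key&field=summary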
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-315] system hang when running replay-single test_70b</title>
                <link>https://jira.whamcloud.com/browse/LU-315</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The system hangs when running replay-single test_70b with quota enabled.&lt;/p&gt;

&lt;p&gt;----------client-5 syslog---------&lt;br/&gt;
Lustre: DEBUG MARKER: == replay-single test 70b: mds recovery; 2 clients == 18:43:25 (1305251005)&lt;br/&gt;
Lustre: 31242:0:(debug.c:320:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.&lt;br/&gt;
Lustre: 31242:0:(debug.c:320:libcfs_debug_str2mask()) Skipped 1 previous similar message&lt;br/&gt;
Lustre: DEBUG MARKER: Started rundbench load pid=31244 ...&lt;br/&gt;
Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 1 times&lt;br/&gt;
Lustre: 22450:0:(import.c:529:import_select_connection()) lustre-MDT0000-mdc-ffff88022c570000: tried all connections, increasing latency to 21s&lt;br/&gt;
Lustre: 22450:0:(import.c:529:import_select_connection()) Skipped 15 previous similar messages&lt;br/&gt;
LustreError: 166-1: MGC192.168.4.128@o2ib: Connection to service MGS via nid 192.168.4.128@o2ib was lost; in progress operations using this service will fail.&lt;br/&gt;
LustreError: Skipped 4 previous similar messages&lt;br/&gt;
Lustre: 22449:0:(import.c:885:ptlrpc_connect_interpret()) MGS@192.168.4.128@o2ib changed server handle from 0xe7668c940984dcd4 to 0xe7668c9409873140&lt;br/&gt;
Lustre: MGC192.168.4.128@o2ib: Reactivating import&lt;br/&gt;
Lustre: Skipped 4 previous similar messages&lt;br/&gt;
Lustre: MGC192.168.4.128@o2ib: Connection restored to service MGS using nid 192.168.4.128@o2ib.&lt;br/&gt;
Lustre: Skipped 15 previous similar messages&lt;br/&gt;
LustreError: 22449:0:(client.c:2570:ptlrpc_replay_interpret()) @@@ status 301, old was 0  req@ffff8801de49d400 x1368643764028074/t519691043276(519691043276) o-1-&amp;gt;lustre-MDT0000_UUID@192.168.4.128@o2ib:12/10 lens 552/544 e 0 to 0 dl 1305251097 ref 2 fl Interpret:RP/ffffffff/ffffffff rc 301/-1&lt;br/&gt;
Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 2 times&lt;br/&gt;
Lustre: 22449:0:(import.c:885:ptlrpc_connect_interpret()) MGS@192.168.4.128@o2ib changed server handle from 0xe7668c9409873140 to 0xe7668c940987f1ea&lt;br/&gt;
Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 3 times&lt;br/&gt;
Lustre: 22449:0:(import.c:885:ptlrpc_connect_interpret()) MGS@192.168.4.128@o2ib changed server handle from 0xe7668c940987f1ea to 0xe7668c940988fe75&lt;br/&gt;
Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 4 times&lt;br/&gt;
LustreError: 11-0: an error occurred while communicating with 192.168.4.128@o2ib. The obd_ping operation failed with -107&lt;br/&gt;
LustreError: Skipped 14 previous similar messages&lt;br/&gt;
Lustre: 22449:0:(import.c:885:ptlrpc_connect_interpret()) MGS@192.168.4.128@o2ib changed server handle from 0xe7668c940988fe75 to 0xe7668c940a3306d6&lt;br/&gt;
Lustre: lustre-MDT0000-mdc-ffff88022c570000: Connection to service lustre-MDT0000 via nid 192.168.4.128@o2ib was lost; in progress operations using this service will wait for recovery to complete.&lt;br/&gt;
Lustre: Skipped 6 previous similar messages&lt;br/&gt;
LustreError: 22449:0:(client.c:2570:ptlrpc_replay_interpret()) @@@ status -116, old was 0  req@ffff880309152800 x1368643764039222/t532575946369(532575946369) o-1-&amp;gt;lustre-MDT0000_UUID@192.168.4.128@o2ib:23/10 lens 360/424 e 0 to 0 dl 1305251472 ref 2 fl Interpret:R/ffffffff/ffffffff rc -116/-1&lt;br/&gt;
LustreError: 22449:0:(client.c:2570:ptlrpc_replay_interpret()) Skipped 10 previous similar messages&lt;br/&gt;
Lustre: 22448:0:(client.c:1775:ptlrpc_expire_one_request()) @@@ Request x1368643764472774 sent from lustre-MDT0000-mdc-ffff88022c570000 to NID 192.168.4.128@o2ib has timed out for slow reply: [sent 1305251440] [real_sent 1305251440] [current 1305251467] [deadline 27s] [delay 0s]  req@ffff8802ca0e5400 x1368643764472774/t0(0) o-1-&amp;gt;lustre-MDT0000_UUID@192.168.4.128@o2ib:12/10 lens 192/192 e 0 to 1 dl 1305251467 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1&lt;br/&gt;
Lustre: 22448:0:(client.c:1775:ptlrpc_expire_one_request()) Skipped 19 previous similar messages&lt;/p&gt;

&lt;p&gt;----------fat-intel-2(ost)syslog-------&lt;br/&gt;
Lustre: DEBUG MARKER: == replay-single test 70b: mds recovery; 2 clients == 18:43:25 (1305251005)&lt;br/&gt;
Lustre: DEBUG MARKER: Started rundbench load pid=31244 ...&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 1 times&lt;br/&gt;
Lustre: 17517:0:(client.c:1775:ptlrpc_expire_one_request()) @@@ Request x1368582162267244 sent from MGC192.168.4.128@o2ib to NID 192.168.4.128@o2ib has timed out for slow reply: [sent 1305251017] [real_sent 1305251017] [current 1305251024] [deadline 7s] [delay 0s]  req@ffff880555595c00 x1368582162267244/t0(0) o-1-&amp;gt;MGS@MGC192.168.4.128@o2ib_0:26/25 lens 192/192 e 0 to 1 dl 1305251024 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1&lt;br/&gt;
Lustre: 17517:0:(client.c:1775:ptlrpc_expire_one_request()) Skipped 5 previous similar messages&lt;br/&gt;
LustreError: 166-1: MGC192.168.4.128@o2ib: Connection to service MGS via nid 192.168.4.128@o2ib was lost; in progress operations using this service will fail.&lt;br/&gt;
LustreError: Skipped 5 previous similar messages&lt;br/&gt;
Lustre: 1115:0:(ldlm_lib.c:800:target_handle_connect()) lustre-OST0000: received new MDS connection from NID 192.168.4.128@o2ib, removing former export from same NID&lt;br/&gt;
Lustre: 1115:0:(ldlm_lib.c:800:target_handle_connect()) Skipped 29 previous similar messages&lt;br/&gt;
Lustre: 1115:0:(ldlm_lib.c:871:target_handle_connect()) lustre-OST0000: connection from lustre-MDT0000-mdtlov_UUID@192.168.4.128@o2ib t0 exp (null) cur 1305251030 last 0&lt;br/&gt;
Lustre: 1115:0:(ldlm_lib.c:871:target_handle_connect()) Skipped 54 previous similar messages&lt;br/&gt;
Lustre: 1115:0:(filter.c:2806:filter_connect()) lustre-OST0000: Received MDS connection (0xc18765fcb6c04773); group 0&lt;br/&gt;
Lustre: 1115:0:(filter.c:2806:filter_connect()) Skipped 46 previous similar messages&lt;br/&gt;
Lustre: 1115:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import lustre-OST0000-&amp;gt;NET_0x50000c0a80480_UUID netid 50000: select flavor null&lt;br/&gt;
Lustre: 1115:0:(sec.c:1474:sptlrpc_import_sec_adapt()) Skipped 57 previous similar messages&lt;br/&gt;
Lustre: 17519:0:(import.c:529:import_select_connection()) MGC192.168.4.128@o2ib: tried all connections, increasing latency to 6s&lt;br/&gt;
Lustre: 17519:0:(import.c:529:import_select_connection()) Skipped 1 previous similar message&lt;br/&gt;
Lustre: 17518:0:(import.c:885:ptlrpc_connect_interpret()) MGS@MGC192.168.4.128@o2ib_0 changed server handle from 0xe7668c940984dce2 to 0xe7668c9409873178&lt;br/&gt;
Lustre: MGC192.168.4.128@o2ib: Reactivating import&lt;br/&gt;
Lustre: Skipped 4 previous similar messages&lt;br/&gt;
Lustre: MGC192.168.4.128@o2ib: Connection restored to service MGS using nid 192.168.4.128@o2ib.&lt;br/&gt;
Lustre: Skipped 4 previous similar messages&lt;br/&gt;
Lustre: lustre-OST0001: received MDS connection from 192.168.4.128@o2ib&lt;br/&gt;
Lustre: lustre-OST0000: received MDS connection from 192.168.4.128@o2ib&lt;br/&gt;
Lustre: Skipped 29 previous similar messages&lt;br/&gt;
Lustre: 1117:0:(lustre_log.h:471:llog_group_set_export()) lustre-OST0004: export for group 0 is changed: 0xffff880283c42400 -&amp;gt; 0xffff8806058d4000&lt;br/&gt;
Lustre: 1113:0:(llog_net.c:168:llog_receptor_accept()) changing the import ffff8803254cb800 - ffff88061ba07000&lt;br/&gt;
Lustre: 1117:0:(lustre_log.h:471:llog_group_set_export()) Skipped 46 previous similar messages&lt;br/&gt;
Lustre: 1113:0:(llog_net.c:168:llog_receptor_accept()) Skipped 58 previous similar messages&lt;br/&gt;
Lustre: 1125:0:(filter.c:2510:filter_llog_connect()) lustre-OST0003: Recovery from log 0x21b165/0x0:b36580c9&lt;br/&gt;
Lustre: 1125:0:(filter.c:2510:filter_llog_connect()) Skipped 31 previous similar messages&lt;br/&gt;
Lustre: Skipped 2 previous similar messages&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 2 times&lt;br/&gt;
Lustre: 17518:0:(import.c:885:ptlrpc_connect_interpret()) MGS@MGC192.168.4.128@o2ib_0 changed server handle from 0xe7668c9409873178 to 0xe7668c940987f276&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 3 times&lt;br/&gt;
Lustre: 17518:0:(import.c:885:ptlrpc_connect_interpret()) MGS@MGC192.168.4.128@o2ib_0 changed server handle from 0xe7668c940987f276 to 0xe7668c940988fe28&lt;br/&gt;
Lustre: lustre-OST0002: haven&apos;t heard from client lustre-MDT0000-mdtlov_UUID (at 192.168.4.128@o2ib) in 54 seconds. I think it&apos;s dead, and I am evicting it. exp ffff88062d9b5000, cur 1305251420 expire 1305251390 last 1305251366&lt;br/&gt;
Lustre: lustre-OST0005: haven&apos;t heard from client lustre-MDT0000-mdtlov_UUID (at 192.168.4.128@o2ib) in 54 seconds. I think it&apos;s dead, and I am evicting it. exp ffff88061b45bc00, cur 1305251420 expire 1305251390 last 1305251366&lt;br/&gt;
Lustre: DEBUG MARKER: test_70b fail mds1 4 times&lt;br/&gt;
Lustre: 17518:0:(import.c:885:ptlrpc_connect_interpret()) MGS@MGC192.168.4.128@o2ib_0 changed server handle from 0xe7668c940988fe28 to 0xe7668c940a33069e&lt;/p&gt;</description>
                <environment>lustre-master/rhel6-x86_64/#114&lt;br/&gt;
client-5 is the client, fat-intel-2 is the OST</environment>
        <key id="10815">LU-315</key>
            <summary>system hang when running replay-single test_70b</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="sarah">Sarah Liu</reporter>
                        <labels>
                    </labels>
                <created>Thu, 12 May 2011 19:20:36 +0000</created>
                <updated>Mon, 13 Jun 2011 14:22:57 +0000</updated>
                            <resolved>Mon, 13 Jun 2011 14:22:57 +0000</resolved>
                                    <version>Lustre 2.1.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="14317" author="pjones" created="Fri, 13 May 2011 04:58:24 +0000"  >&lt;p&gt;Hi Niu&lt;/p&gt;

&lt;p&gt;As discussed, can you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="14379" author="niu" created="Sun, 15 May 2011 20:43:20 +0000"  >&lt;p&gt;Hi, Sarah&lt;/p&gt;

&lt;p&gt;Which node hung? What does the MDS syslog show? Is it possible to dump a stack trace on the hung node? Thanks.&lt;/p&gt;</comment>
                            <comment id="14397" author="sarah" created="Mon, 16 May 2011 11:08:30 +0000"  >&lt;p&gt;sorry, I cannot reproduce this issue, will keep you update if I get any other information of this bug.&lt;/p&gt;</comment>
                            <comment id="16097" author="pjones" created="Mon, 13 Jun 2011 14:22:57 +0000"  >&lt;p&gt;Reopen if reoccurs&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw2p3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10519</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>