<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:14:31 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8086] client eviction after MDT restart or failover</title>
                <link>https://jira.whamcloud.com/browse/LU-8086</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The error happens during soak testing of build &apos;20160427&apos; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160427&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160427&lt;/a&gt;). DNE is enabled. OSTs were formatted with &lt;em&gt;zfs&lt;/em&gt;, MDTs with &lt;em&gt;ldiskfs&lt;/em&gt; as the storage backend. OSS and MDT nodes are configured in an HA active-active failover configuration. For debugging purposes, the parameter &lt;tt&gt;dump_on_eviction=1&lt;/tt&gt; was set.&lt;/p&gt;

&lt;p&gt;The configuration, especially the mapping of nodes to roles, can be found here: &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After &lt;b&gt;every&lt;/b&gt; MDS restart or failover, a large number of Lustre nodes (very often the majority) are evicted.&lt;/p&gt;

&lt;p&gt;The following sequence of events is 100% reproducible:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-04-28 10:07:15,956   mds_failover lola-8 ---&amp;gt; lola-9 started&lt;/li&gt;
	&lt;li&gt;2016-04-28 10:16:23,738:fsmgmt.fsmgmt:INFO     Node lola-9: &apos;soaked-MDT0000&apos; recovery completed&lt;/li&gt;
	&lt;li&gt;2016-04-28 10:16:23,739:fsmgmt.fsmgmt:INFO     Unmounting soaked-MDT0000 on lola-9 ...&lt;/li&gt;
	&lt;li&gt;2016-04-28 10:16:48,995:fsmgmt.fsmgmt:INFO     ... soaked-MDT0000 mounted successfully on lola-8&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;2016-04-28 10:16:48,996     mds_failover  (failback completed; lola-8 runs its own resource mdt-0 again)&lt;/p&gt;

&lt;p&gt;2016-04-28 10:17:32    recovery of mdt-0 finished on lola-8:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Apr 28 10:17:32 lola-8 kernel: Lustre: soaked-MDT0000: Recovery over after 0:43, of 21 clients 21 recovered and 0 were evicted.
* 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;ul&gt;
	&lt;li&gt;2016-04-28 10:17:*   most clients get evicted, although the Lustre message above states otherwise:
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lola-10.log:Apr 28 10:17:24 lola-10 kernel: LustreError: 48860:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-11.log:Apr 28 10:17:05 lola-11 kernel: LustreError: 28261:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-13.log:Apr 28 10:17:06 lola-13 kernel: LustreError: 81063:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-16.log:Apr 28 10:17:10 lola-16 kernel: LustreError: 229277:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-18.log:Apr 28 10:17:08 lola-18 kernel: LustreError: 110914:0:(import.c:1405:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-19.log:Apr 28 10:17:25 lola-19 kernel: LustreError: 233525:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-20.log:Apr 28 10:17:14 lola-20 kernel: LustreError: 182741:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-21.log:Apr 28 10:17:14 lola-21 kernel: LustreError: 155091:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-22.log:Apr 28 10:17:05 lola-22 kernel: LustreError: 171992:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-23.log:Apr 28 10:17:34 lola-23 kernel: LustreError: 158263:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-24.log:Apr 28 10:17:21 lola-24 kernel: LustreError: 160657:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-25.log:Apr 28 10:17:11 lola-25 kernel: LustreError: 196242:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-26.log:Apr 28 10:17:07 lola-26 kernel: LustreError: 153478:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-27.log:Apr 28 10:17:20 lola-27 kernel: LustreError: 158888:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-29.log:Apr 28 10:17:25 lola-29 kernel: LustreError: 29326:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-2.log:Apr 28 10:17:10 lola-2 kernel: LustreError: 16891:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-2.log:Apr 28 10:17:17 lola-2 kernel: LustreError: 16899:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-2.log:Apr 28 10:17:42 lola-2 kernel: LustreError: 16907:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-30.log:Apr 28 10:17:14 lola-30 kernel: LustreError: 34608:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-31.log:Apr 28 10:17:21 lola-31 kernel: LustreError: 17749:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-32.log:Apr 28 10:17:02 lola-32 kernel: LustreError: 152914:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-33.log:Apr 28 10:17:14 lola-33 kernel: LustreError: 165946:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-34.log:Apr 28 10:17:16 lola-34 kernel: LustreError: 152469:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-3.log:Apr 28 10:17:18 lola-3 kernel: LustreError: 75334:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-4.log:Apr 28 10:17:08 lola-4 kernel: LustreError: 34658:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-5.log:Apr 28 10:17:07 lola-5 kernel: LustreError: 32477:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-6.log:Apr 28 10:17:02 lola-6 kernel: LustreError: 75888:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-7.log:Apr 28 10:17:24 lola-7 kernel: LustreError: 20063:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
lola-9.log:Apr 28 10:17:31 lola-9 kernel: LustreError: 11783:0:(import.c:1406:ptlrpc_invalidate_import_thread()) dump the log upon eviction
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Attached are messages, console, and debug log files for each Lustre node type:&lt;br/&gt;
OSS  : lola-3  &lt;br/&gt;
MDS :  lola-11&lt;br/&gt;
client : lola-20&lt;/p&gt;

&lt;p&gt;As stated above, the effect can be reproduced with certainty if additional information is needed.&lt;/p&gt;

&lt;p&gt;The IB fabric and LNet routers didn&apos;t indicate any errors or malfunctions during any of the time intervals in which the error occurred, nor before or after.&lt;/p&gt;</description>
                <environment>lola&lt;br/&gt;
build: master commit 71d2ea0fde17ecde0bf237f486d4bafb5d54fe3f + patches</environment>
        <key id="36467">LU-8086</key>
            <summary>client eviction after MDT restart or failover</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="heckes">Frank Heckes</reporter>
                        <labels>
                            <label>soak</label>
                    </labels>
                <created>Fri, 29 Apr 2016 14:01:58 +0000</created>
                <updated>Wed, 13 Oct 2021 02:34:06 +0000</updated>
                            <resolved>Wed, 13 Oct 2021 02:34:06 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="150723" author="di.wang" created="Mon, 2 May 2016 17:22:36 +0000"  >&lt;p&gt;It seems most of the evictions happened between the MGC and the MGS in lola-20-lustre-log.1461863720.182550, which is normal in this test.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;ptlrpc_invalidate_import_thread^@dump the log upon eviction
ptlrpc_invalidate_import_thread^@ffff880821fd8000 MGS: changing import state from EVICTED to RECOVER
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="21327" name="console-lola-11.bz2" size="402471" author="heckes" created="Fri, 29 Apr 2016 15:53:39 +0000"/>
                            <attachment id="21328" name="console-lola-20.bz2" size="399712" author="heckes" created="Fri, 29 Apr 2016 15:53:39 +0000"/>
                            <attachment id="21326" name="console-lola-3.bz2" size="382206" author="heckes" created="Fri, 29 Apr 2016 15:53:38 +0000"/>
                            <attachment id="21336" name="lola-11-lustre-log.1461863436.19099.bz2" size="8922783" author="heckes" created="Fri, 29 Apr 2016 16:10:02 +0000"/>
                            <attachment id="21337" name="lola-11-lustre-log.1461863718.28215.bz2" size="5676" author="heckes" created="Fri, 29 Apr 2016 16:10:02 +0000"/>
                            <attachment id="21335" name="lola-11-lustre-log.1461863825.28262.bz2" size="1603671" author="heckes" created="Fri, 29 Apr 2016 16:02:55 +0000"/>
                            <attachment id="21332" name="lola-20-lustre-log.1461863720.182550.bz2" size="4316523" author="heckes" created="Fri, 29 Apr 2016 15:56:50 +0000"/>
                            <attachment id="21333" name="lola-3-lustre-log.1461863731.75227.bz2" size="6504046" author="heckes" created="Fri, 29 Apr 2016 16:01:49 +0000"/>
                            <attachment id="21334" name="lola-3-lustre-log.1461863838.75334.bz2" size="625033" author="heckes" created="Fri, 29 Apr 2016 16:01:49 +0000"/>
                            <attachment id="21330" name="messages-lola-11.bz2" size="227523" author="heckes" created="Fri, 29 Apr 2016 15:54:20 +0000"/>
                            <attachment id="21331" name="messages-lola-20.bz2" size="399591" author="heckes" created="Fri, 29 Apr 2016 15:54:20 +0000"/>
                            <attachment id="21329" name="messages-lola-3.bz2" size="133444" author="heckes" created="Fri, 29 Apr 2016 15:54:20 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzy9rj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>