<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:17:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8467] MDS crashed with (tgt_lastrcvd.c:1054:tgt_client_del()) LBUG</title>
                <link>https://jira.whamcloud.com/browse/LU-8467</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Error happened during soaktesting of build &apos;20160727&apos; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160727&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160727&lt;/a&gt;)&lt;br/&gt;
OSTs formatted with zfs, MDSs formatted with ldiskfs&lt;br/&gt;
DNE is enabled, HSM/robinhood enabled and integrated&lt;br/&gt;
4 MDSs with 1 MDT / MDS&lt;br/&gt;
6 OSSs with 4 OSTs / OSS&lt;br/&gt;
Server nodes configured in active-active HA configuration&lt;br/&gt;
(Nodes &lt;tt&gt;lola-&lt;span class=&quot;error&quot;&gt;&amp;#91;8,9&amp;#93;&lt;/span&gt;&lt;/tt&gt; form a failover cluster)&lt;/p&gt;

&lt;p&gt;The issue is possibly a duplicate of &lt;a href=&quot;https://jira.hpdd.intel.com/browse/LU-8165&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jira.hpdd.intel.com/browse/LU-8165&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sequence of events:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-08-01 09:20:35,183:fsmgmt.fsmgmt:INFO     triggering fault mds_failover&lt;/li&gt;
	&lt;li&gt;2016-08-01 09:27:04,811:fsmgmt.fsmgmt:INFO     lola-8 is up!!!&lt;/li&gt;
	&lt;li&gt;2016-08-01 09:27:15,825:fsmgmt.fsmgmt:INFO     started mount of MDT0000 on lola-9&lt;/li&gt;
	&lt;li&gt;2016-08-01 10:04:00                           During the mount of the MDT on the secondary node, the (secondary) MDS crashed with a kernel panic:
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;ds. I think it&apos;s dead, and I am evicting it. exp ffff88081e31e800, cur 1470071025 expire 1470070875 last 1470070794
&amp;lt;3&amp;gt;LustreError: 6208:0:(tgt_lastrcvd.c:1053:tgt_client_del()) soaked-MDT0001: client 4294967295: bit already clear in bitmap!!
&amp;lt;0&amp;gt;LustreError: 6208:0:(tgt_lastrcvd.c:1054:tgt_client_del()) LBUG
&amp;lt;4&amp;gt;Pid: 6208, comm: ll_evictor
&amp;lt;4&amp;gt;
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa07fc875&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa07fce77&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b6cb62&amp;gt;] tgt_client_del+0x5f2/0x600 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa11f4d8e&amp;gt;] mdt_obd_disconnect+0x48e/0x570 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa08e538d&amp;gt;] class_fail_export+0x23d/0x530 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b23485&amp;gt;] ping_evictor_main+0x245/0x650 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff81067650&amp;gt;] ? default_wake_function+0x0/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b23240&amp;gt;] ? ping_evictor_main+0x0/0x650 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff810a138e&amp;gt;] kthread+0x9e/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff810a12f0&amp;gt;] ? kthread+0x0/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&amp;lt;4&amp;gt;
&amp;lt;0&amp;gt;Kernel panic - not syncing: LBUG
&amp;lt;4&amp;gt;Pid: 6208, comm: ll_evictor Tainted: P           -- ------------    2.6.32-573.26.1.el6_lustre.x86_64 #1
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffff81539407&amp;gt;] ? panic+0xa7/0x16f
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa07fcecb&amp;gt;] ? lbug_with_loc+0x9b/0xb0 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b6cb62&amp;gt;] ? tgt_client_del+0x5f2/0x600 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa11f4d8e&amp;gt;] ? mdt_obd_disconnect+0x48e/0x570 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa08e538d&amp;gt;] ? class_fail_export+0x23d/0x530 [obdclass]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b23485&amp;gt;] ? ping_evictor_main+0x245/0x650 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff81067650&amp;gt;] ? default_wake_function+0x0/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0b23240&amp;gt;] ? ping_evictor_main+0x0/0x650 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff810a138e&amp;gt;] ? kthread+0x9e/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c28a&amp;gt;] ? child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff810a12f0&amp;gt;] ? kthread+0x0/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I couldn&apos;t extract the debug log from the kernel dump:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;      KERNEL: usr/lib/debug/lib/modules/2.6.32-573.26.1.el6_lustre.x86_64/vmlinux
    DUMPFILE: 127.0.0.1-2016-08-01-10:04:00/vmcore  [PARTIAL DUMP]
        CPUS: 32
        DATE: Mon Aug  1 10:03:45 2016
      UPTIME: 2 days, 19:40:09
LOAD AVERAGE: 16.98, 16.25, 16.97
       TASKS: 1536
    NODENAME: lola-9.lola.whamcloud.com
     RELEASE: 2.6.32-573.26.1.el6_lustre.x86_64
     VERSION: #1 SMP Tue Jul 26 04:04:13 PDT 2016
     MACHINE: x86_64  (2693 Mhz)
      MEMORY: 31.9 GB
       PANIC: &quot;Kernel panic - not syncing: LBUG&quot;
         PID: 6208
     COMMAND: &quot;ll_evictor&quot;
        TASK: ffff880413fe2040  [THREAD_INFO: ffff880413fec000]
         CPU: 25
       STATE: TASK_RUNNING (PANIC)

crash&amp;gt; extend /scratch/crash_lustre/lustre.so
/scratch/crash_lustre/lustre.so: shared object loaded

crash&amp;gt; lustre -l /scratch/lola-9-latest-crash.bin
lustre_walk_cpus(0, 5, 1)
cmd p (*cfs_trace_data[0])[0].tcd.tcd_cur_pages // p (*cfs_trace_data[0])[0].tcd.tcd_pages.next 
lustre: gdb request failed: &quot;p (*cfs_trace_data[0])[0].tcd.tcd_cur_pages&quot;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Attached files:&lt;br/&gt;
messages, console, vmcore-dmesg.txt of node &lt;tt&gt;lola-9&lt;/tt&gt;.&lt;/p&gt;</description>
                <environment>lola&lt;br/&gt;
build: tip of master, commit 0f37c051158a399f7b00536eeec27f5dbdd54168</environment>
        <key id="38567">LU-8467</key>
            <summary>MDS crashed with (tgt_lastrcvd.c:1054:tgt_client_del()) LBUG</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="heckes">Frank Heckes</reporter>
                        <labels>
                            <label>soak</label>
                    </labels>
                <created>Tue, 2 Aug 2016 10:50:28 +0000</created>
                <updated>Mon, 17 Oct 2016 19:39:06 +0000</updated>
                            <resolved>Mon, 17 Oct 2016 19:39:05 +0000</resolved>
                                                    <fixVersion>Lustre 2.9.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="160509" author="heckes" created="Tue, 2 Aug 2016 10:54:43 +0000"  >&lt;p&gt;Crash file has been saved to &lt;tt&gt;lhn.lola.hpdd.intel.com:/scratch/crashdumps/lu-8467/lola-9/127.0.0.1-2016-08-01-10\:04\:00/&lt;/tt&gt;&lt;/p&gt;</comment>
                            <comment id="165592" author="pjones" created="Sat, 10 Sep 2016 13:18:21 +0000"  >&lt;p&gt;Does this issue still occur now &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8165&quot; title=&quot;(tgt_lastrcvd.c:656:tgt_client_del()) lustre-OST0000: client 4294967295: bit already clear in bitmap!!&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8165&quot;&gt;&lt;del&gt;LU-8165&lt;/del&gt;&lt;/a&gt; has landed?&lt;/p&gt;</comment>
                            <comment id="169998" author="cliffw" created="Mon, 17 Oct 2016 19:18:23 +0000"  >&lt;p&gt;We have not seen this issue reappear since we started running the tip of 2.9; we are continuing to test.&lt;/p&gt;</comment>
                            <comment id="170000" author="pjones" created="Mon, 17 Oct 2016 19:39:06 +0000"  >&lt;p&gt;OK, then let&apos;s close out the ticket for now and reopen it if the issue ever recurs.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="22441" name="console-lola-9.log.bz2" size="47137" author="heckes" created="Tue, 2 Aug 2016 10:59:44 +0000"/>
                            <attachment id="22440" name="messages-lola-9.log.bz2" size="141162" author="heckes" created="Tue, 2 Aug 2016 10:59:44 +0000"/>
                            <attachment id="22442" name="vmcore-dmesg.txt.bz2" size="29473" author="heckes" created="Tue, 2 Aug 2016 10:59:44 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzyj9z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>