<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:12:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7809] general protection fault: 0000 during failback of MDS disk resources</title>
                <link>https://jira.whamcloud.com/browse/LU-7809</link>
                <project id="10000" key="LU">Lustre</project>
<description>&lt;p&gt;The error occurred during soak testing of build &apos;20160222&apos; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20150222&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20150222&lt;/a&gt;). DNE is enabled. The MDTs had been formatted using &lt;em&gt;ldiskfs&lt;/em&gt;, the OSTs using &lt;em&gt;zfs&lt;/em&gt;. The MDSes are configured in an active-active HA failover configuration; the affected nodes (&lt;tt&gt;lola-&lt;span class=&quot;error&quot;&gt;&amp;#91;8,9&amp;#93;&lt;/span&gt;&lt;/tt&gt;) form an HA failover pair.&lt;br/&gt;
More set-up details can be found at &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sequence of events:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-02-23 23:52:17,963:fsmgmt.fsmgmt:INFO     triggering fault mds_failover&lt;/li&gt;
	&lt;li&gt;2016-02-23 23:52:17,964:fsmgmt.fsmgmt:INFO     reseting MDS node lola-9&lt;/li&gt;
	&lt;li&gt;2016-02-24 00:00:29   Both MDTs (mdt-2,3) failover to lola-8&lt;/li&gt;
	&lt;li&gt;2016-02-24 00:01:06,468:fsmgmt.fsmgmt:INFO     ... soaked-MDT0003 failed back      (action completed successful!)&lt;/li&gt;
	&lt;li&gt;2016-02-24 00:01:06,468:fsmgmt.fsmgmt:INFO     Unmounting soaked-MDT0002 on lola-8 ..        (--&amp;gt; caused crash)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The error reads as:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;lt;4&amp;gt;general protection fault: 0000 [#1] 
&amp;lt;3&amp;gt;LustreError: 6683:0:(ldlm_lib.c:2562:target_stop_recovery_thread()) soaked-MDT0002: Aborting recovery
&amp;lt;4&amp;gt;SMP 
&amp;lt;4&amp;gt;last sysfs file: /sys/devices/system/cpu/online
&amp;lt;4&amp;gt;CPU 12 
&amp;lt;4&amp;gt;Modules linked in: mgs(U) osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgc(U) osd_ldiskfs(U) ldiskfs(U) jbd2 lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) 8021q garp stp llc nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm scsi_dh_rdac dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support microcode zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) sb_edac edac_core lpc_ich mfd_core i2c_i801 ioatdma sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas wmi mpt2sas scsi_transport_sas raid_class mlx4_ib ib_sa ib_mad ib_core ib_addr ipv6 mlx4_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
&amp;lt;4&amp;gt;
&amp;lt;4&amp;gt;Pid: 6617, comm: tgt_recover_2 Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.g93f956d.x86_64 #1 Intel Corporation S2600GZ ........../S2600GZ
&amp;lt;4&amp;gt;RIP: 0010:[&amp;lt;ffffffffa0b2222c&amp;gt;]  [&amp;lt;ffffffffa0b2222c&amp;gt;] distribute_txn_get_next_transno+0x3c/0xd0 [ptlrpc]
&amp;lt;4&amp;gt;RSP: 0018:ffff88028674fca0  EFLAGS: 00010207
&amp;lt;4&amp;gt;RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000000
&amp;lt;4&amp;gt;RDX: ffff8802866a41e8 RSI: ffffffffa0a65b80 RDI: ffff8802866a4208
&amp;lt;4&amp;gt;RBP: ffff88028674fcc0 R08: 00000000fffffff2 R09: 00000000fffffff5
&amp;lt;4&amp;gt;R10: 0000000000000009 R11: 0000000000000000 R12: ffff8802866a4180
&amp;lt;4&amp;gt;R13: ffff8802866a4208 R14: ffff8802866a4180 R15: 0000000000000000
&amp;lt;4&amp;gt;FS:  0000000000000000(0000) GS:ffff88044e480000(0000) knlGS:0000000000000000
&amp;lt;4&amp;gt;CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
&amp;lt;4&amp;gt;CR2: 00007f7e64e29000 CR3: 0000000001a85000 CR4: 00000000000407e0
&amp;lt;4&amp;gt;DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
&amp;lt;4&amp;gt;DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
&amp;lt;4&amp;gt;Process tgt_recover_2 (pid: 6617, threadinfo ffff88028674e000, task ffff88028674d520)
&amp;lt;4&amp;gt;Stack:
&amp;lt;4&amp;gt; ffffffffa0bb6640 ffff88080c3fd038 0000000000000000 ffff88080c3fd3cc
&amp;lt;4&amp;gt;&amp;lt;d&amp;gt; ffff88028674fd50 ffffffffa0a65c07 ffff88028674fde0 00000000a085c87e
&amp;lt;4&amp;gt;&amp;lt;d&amp;gt; 0000000000000000 ffff8807fb3adddd 00000054a0a60fc0 ffffffffa0b52dc9
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a65c07&amp;gt;] check_for_next_transno+0x87/0x6d0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a65b80&amp;gt;] ? check_for_next_transno+0x0/0x6d0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a62c63&amp;gt;] target_recovery_overseer+0xb3/0x630 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a60f30&amp;gt;] ? exp_req_replay_healthy_or_from_mdt+0x0/0x40 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa077bcf1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a62ac0&amp;gt;] ? abort_lock_replay_queue+0x30/0x120 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a693db&amp;gt;] target_recovery_thread+0x8bb/0x1dd0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff81064c12&amp;gt;] ? default_wake_function+0x12/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a68b20&amp;gt;] ? target_recovery_thread+0x0/0x1dd0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109e78e&amp;gt;] kthread+0x9e/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109e6f0&amp;gt;] ? kthread+0x0/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&amp;lt;4&amp;gt;Code: 89 6d f8 0f 1f 44 00 00 31 db 4c 8d af 88 00 00 00 49 89 fc 4c 89 ef e8 13 b6 a0 e0 49 8b 44 24 68 49 8d 54 24 68 48 39 d0 74 04 &amp;lt;48&amp;gt; 8b 58 10 4c 89 e8 66 ff 00 66 66 90 f6 05 26 4e c7 ff 08 74 
&amp;lt;1&amp;gt;RIP  [&amp;lt;ffffffffa0b2222c&amp;gt;] distribute_txn_get_next_transno+0x3c/0xd0 [ptlrpc]
&amp;lt;4&amp;gt; RSP &amp;lt;ffff88028674fca0&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Immediately before the crash the following errors are printed to &lt;tt&gt;lola-8&apos;s&lt;/tt&gt; message file:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lola-8.log:Feb 24 00:01:06 lola-8 kernel: LustreError: 6612:0:(osp_object.c:588:osp_attr_get()) soaked-MDT0003-osp-MDT0002:osp_attr_get update error [0x200000009:0x3:0x0]: rc = -5
lola-8.log:Feb 24 00:01:06 lola-8 kernel: LustreError: 6612:0:(lod_sub_object.c:959:lod_sub_prep_llog()) soaked-MDT0002-mdtlov: can&apos;t get id from catalogs: rc = -5
lola-8.log:Feb 24 00:01:06 lola-8 kernel: LustreError: 6612:0:(lod_dev.c:419:lod_sub_recovery_thread()) soaked-MDT0003-osp-MDT0002 getting update log failed: rc = -5
...
...
lola-8.log:Feb 24 00:01:09 lola-8 kernel: LustreError: 6617:0:(update_records.c:72:update_records_dump()) master transno = 8594544408 batchid = 4299976565 flags = 0 ops = 73 params = 46
lola-8.log:Feb 24 00:01:09 lola-8 kernel: LustreError: 6617:0:(update_records.c:72:update_records_dump()) master transno = 8594544409 batchid = 4299976566 flags = 0 ops = 73 params = 46
lola-8.log:Feb 24 00:01:09 lola-8 kernel: LustreError: 6617:0:(update_records.c:72:update_records_dump()) master transno = 8594544411 batchid = 4299976567 flags = 0 ops = 73 params = 46
lola-8.log:Feb 24 00:01:09 lola-8 kernel: LustreError: 6617:0:(update_records.c:72:update_records_dump()) master transno = 8594544417 batchid = 4299976568 flags = 0 ops = 73 params = 46
lola-8.log:Feb 24 00:01:09 lola-8 kernel: general protection fault: 0000 [#1] 
lola-8.log:Feb 24 00:01:09 lola-8 kernel: LustreError: 6683:0:(ldlm_lib.c:2562:target_stop_recovery_thread()) soaked-MDT0002: Aborting recovery
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Attached files: messages, console, and vmcore-dmesg.txt of &lt;tt&gt;lola-8&lt;/tt&gt;.&lt;br/&gt;
A crash file is available, too.&lt;/p&gt;



</description>
                <environment>lola&lt;br/&gt;
build: &lt;a href=&quot;https://build.hpdd.intel.com/job/lustre-reviews/37481/&quot;&gt;https://build.hpdd.intel.com/job/lustre-reviews/37481/&lt;/a&gt;</environment>
        <key id="34927">LU-7809</key>
            <summary>general protection fault: 0000 during failback of MDS disk resources</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="heckes">Frank Heckes</reporter>
                        <labels>
                            <label>soak</label>
                    </labels>
                <created>Wed, 24 Feb 2016 09:11:18 +0000</created>
                <updated>Tue, 13 Sep 2016 20:05:26 +0000</updated>
                            <resolved>Tue, 13 Sep 2016 20:05:26 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                    <fixVersion>Lustre 2.9.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="143495" author="heckes" created="Wed, 24 Feb 2016 09:31:30 +0000"  >&lt;p&gt;Crash file has been uploaded to &lt;tt&gt;lhn.hpdd.intel.com:/scratch/crashdumps/lu-7809/lola-8/127.0.0.1-2016-02-24-00:01:25/&lt;/tt&gt;.&lt;/p&gt;</comment>
                            <comment id="143661" author="gerrit" created="Wed, 24 Feb 2016 21:54:12 +0000"  >&lt;p&gt;wangdi (di.wang@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/18651&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/18651&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7809&quot; title=&quot;general protection fault: 0000 during failback of MDS disk resources&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7809&quot;&gt;&lt;del&gt;LU-7809&lt;/del&gt;&lt;/a&gt; lod: stop recovery before destory dtrq list&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b4892ca5b5c3787313c9256fc23add5a88d61855&lt;/p&gt;</comment>
                            <comment id="143700" author="gerrit" created="Thu, 25 Feb 2016 04:55:38 +0000"  >&lt;p&gt;wangdi (di.wang@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/18658&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/18658&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7809&quot; title=&quot;general protection fault: 0000 during failback of MDS disk resources&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7809&quot;&gt;&lt;del&gt;LU-7809&lt;/del&gt;&lt;/a&gt; lod: stop recovery before destory dtrq list&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_8&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: f8f8162b3f4dda6fc08afda91514131dbe14cc59&lt;/p&gt;</comment>
                            <comment id="144342" author="cliffw" created="Tue, 1 Mar 2016 18:59:27 +0000"  >&lt;p&gt;We are seeing this again on 2.8.0-RC2&lt;br/&gt;
Errors before panic:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028910512 batchid = 141735366499 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911079 batchid = 141735366507 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911542 batchid = 141735366512 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911597 batchid = 141735366514 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911739 batchid = 141735366519 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911807 batchid = 141735366523 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911819 batchid = 141735366524 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911868 batchid = 141735366527 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028911947 batchid = 141735366535 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028912075 batchid = 141735366538 flags = 0 ops = 5 params = 4
Mar  1 07:54:19 lola-8 kernel: LustreError: 6942:0:(update_records.c:72:update_records_dump()) master transno = 146028912093 batchid = 141735366539 flags = 0 ops = 5 params = 4
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Panic&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;Mar  1 07:54:19 lola-8 kernel: general protection fault: 0000 [#1]
Mar  1 07:54:19 lola-8 kernel: LustreError: 7064:0:(ldlm_lib.c:2562:target_stop_recovery_thread()) soaked-MDT0003: Aborting recovery
Mar  1 07:54:19 lola-8 kernel: SMP
Mar  1 07:54:19 lola-8 kernel: last sysfs file: /sys/devices/system/cpu/online
Mar  1 07:54:19 lola-8 kernel: CPU 18
Mar  1 07:54:19 lola-8 kernel: Modules linked in: mgs(U) osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgc(U) osd_ldiskfs(U) ldiskfs(U) jbd2 lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) 8021q garp stp llc nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm scsi_dh_rdac dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support microcode zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) sb_edac edac_core lpc_ich mfd_core i2c_i801 ioatdma sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas wmi mpt2sas scsi_transport_sas raid_class mlx4_ib ib_sa ib_mad ib_core ib_addr ipv6 mlx4_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Mar  1 07:54:19 lola-8 kernel:
Mar  1 07:54:19 lola-8 kernel: Pid: 6942, comm: tgt_recover_3 Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.x86_64 #1 Intel Corporation S2600GZ ........../S2600GZ
Mar  1 07:54:19 lola-8 kernel: RIP: 0010:[&amp;lt;ffffffffa0b2121c&amp;gt;]  [&amp;lt;ffffffffa0b2121c&amp;gt;] distribute_txn_get_next_transno+0x3c/0xd0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: RSP: 0018:ffff8807e068dca0  EFLAGS: 00010203
Mar  1 07:54:19 lola-8 kernel: RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000000
Mar  1 07:54:19 lola-8 kernel: RDX: ffff880304ca7328 RSI: ffffffffa0a64b80 RDI: ffff880304ca7348
Mar  1 07:54:19 lola-8 kernel: RBP: ffff8807e068dcc0 R08: 00000000fffffff0 R09: 00000000fffffff3
Mar  1 07:54:19 lola-8 kernel: R10: 000000000000000b R11: 0000000000000000 R12: ffff880304ca72c0
Mar  1 07:54:19 lola-8 kernel: R13: ffff880304ca7348 R14: ffff880304ca72c0 R15: 0000000000000000
Mar  1 07:54:19 lola-8 kernel: FS:  0000000000000000(0000) GS:ffff880038340000(0000) knlGS:0000000000000000
Mar  1 07:54:19 lola-8 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Mar  1 07:54:19 lola-8 kernel: CR2: 0000003bd24acd50 CR3: 0000000001a85000 CR4: 00000000000407e0
Mar  1 07:54:19 lola-8 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  1 07:54:19 lola-8 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar  1 07:54:19 lola-8 kernel: &lt;span class=&quot;code-object&quot;&gt;Process&lt;/span&gt; tgt_recover_3 (pid: 6942, threadinfo ffff8807e068c000, task ffff8807fca66ab0)
Mar  1 07:54:19 lola-8 kernel: Stack:
Mar  1 07:54:19 lola-8 kernel: ffffffffa0bb53e0 ffff8803ea51e078 0000000000000000 ffff8803ea51e40c
Mar  1 07:54:19 lola-8 kernel: &amp;lt;d&amp;gt; ffff8807e068dd50 ffffffffa0a64c07 ffff8807e068dde0 00000000a085c7be
Mar  1 07:54:19 lola-8 kernel: &amp;lt;d&amp;gt; 0000000000000000 ffff8803e9d596da 00000054a0a5ffc0 ffffffffa0b51dad
Mar  1 07:54:19 lola-8 kernel: Call Trace:
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a64c07&amp;gt;] check_for_next_transno+0x87/0x6d0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a64b80&amp;gt;] ? check_for_next_transno+0x0/0x6d0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a61c63&amp;gt;] target_recovery_overseer+0xb3/0x630 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a5ff30&amp;gt;] ? exp_req_replay_healthy_or_from_mdt+0x0/0x40 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa077bcf1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a61ac0&amp;gt;] ? abort_lock_replay_queue+0x30/0x120 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a683db&amp;gt;] target_recovery_thread+0x8bb/0x1dd0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffff81064c12&amp;gt;] ? default_wake_function+0x12/0x20
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffffa0a67b20&amp;gt;] ? target_recovery_thread+0x0/0x1dd0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffff8109e78e&amp;gt;] kthread+0x9e/0xc0
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffff8109e6f0&amp;gt;] ? kthread+0x0/0xc0
Mar  1 07:54:19 lola-8 kernel: [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
Mar  1 07:54:19 lola-8 kernel: Code: 89 6d f8 0f 1f 44 00 00 31 db 4c 8d af 88 00 00 00 49 89 fc 4c 89 ef e8 23 c6 a0 e0 49 8b 44 24 68 49 8d 54 24 68 48 39 d0 74 04 &amp;lt;48&amp;gt; 8b 58 10 4c 89 e8 66 ff 00 66 66 90 f6 05 d6 5d c7 ff 08 74
Mar  1 07:54:19 lola-8 kernel: RIP  [&amp;lt;ffffffffa0b2121c&amp;gt;] distribute_txn_get_next_transno+0x3c/0xd0 [ptlrpc]
Mar  1 07:54:19 lola-8 kernel: RSP &amp;lt;ffff8807e068dca0&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
<comment id="144397" author="heckes" created="Wed, 2 Mar 2016 17:07:51 +0000"  >&lt;p&gt;The crash happened again with b2_8 RC4 (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160302&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160302&lt;/a&gt;) while unmounting the MDTs on node &lt;tt&gt;lola-9&lt;/tt&gt;.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;lt;4&amp;gt;general protection fault: 0000 [#1]
&amp;lt;3&amp;gt;LustreError: 4715:0:(ldlm_lib.c:2562:target_stop_recovery_thread()) soaked-MDT0002: Aborting recovery
&amp;lt;4&amp;gt;SMP
&amp;lt;4&amp;gt;last sysfs file: /sys/devices/system/cpu/online
&amp;lt;4&amp;gt;CPU 12
&amp;lt;4&amp;gt;Modules linked in: osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgc(U) osd_ldiskfs(U) ldiskfs(U) jbd2 lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) 8021q garp stp llc nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm scsi_dh_rdac dm_round_robin dm_multipath microcode iTCO_wdt iTCO_vendor_support zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) sb_edac edac_core lpc_ich mfd_core i2c_i801 ioatdma sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif isci libsas ahci mpt2sas scsi_transport_sas raid_class mlx4_ib ib_sa ib_mad ib_core ib_addr ipv6 mlx4_core wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
&amp;lt;4&amp;gt;
&amp;lt;4&amp;gt;Pid: 4496, comm: tgt_recover_2 Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.x86_64 #1 Intel Corporation S2600GZ ........../S2600GZ
&amp;lt;4&amp;gt;RIP: 0010:[&amp;lt;ffffffffa0aec290&amp;gt;]  [&amp;lt;ffffffffa0aec290&amp;gt;] distribute_txn_get_next_transno+0xb0/0xd0 [ptlrpc]
&amp;lt;4&amp;gt;RSP: 0018:ffff8808224c9d30  EFLAGS: 00010202
&amp;lt;4&amp;gt;RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000000
&amp;lt;4&amp;gt;RDX: ffff8803cd1bd4e8 RSI: ffffffffa0b1a784 RDI: ffffffffa0b9f380
&amp;lt;4&amp;gt;RBP: ffff8808224c9d50 R08: 00000000fffffffb R09: 00000000fffffffe
&amp;lt;4&amp;gt;R10: 0000000000000000 R11: 0000000000000000 R12: ffff8803cd1bd480
&amp;lt;4&amp;gt;R13: ffff8803cd1bd508 R14: ffff8803cd2413e0 R15: ffff8803cd241038
&amp;lt;4&amp;gt;FS:  0000000000000000(0000) GS:ffff88044e480000(0000) knlGS:0000000000000000
&amp;lt;4&amp;gt;CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
&amp;lt;4&amp;gt;CR2: 00007f1efe1ac000 CR3: 0000000001a85000 CR4: 00000000000407e0
&amp;lt;4&amp;gt;DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
&amp;lt;4&amp;gt;DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
&amp;lt;4&amp;gt;Process tgt_recover_2 (pid: 4496, threadinfo ffff8808224c8000, task ffff8808224cf520)
&amp;lt;4&amp;gt;Stack:
&amp;lt;4&amp;gt; ffff8808224c9dd0 ffff8803cd3af0b0 ffffffffa0a2fb80 ffff8808224c9dd0
&amp;lt;4&amp;gt;&amp;lt;d&amp;gt; ffff8808224c9e30 ffffffffa0a2ce2a ffff8808224c9dd0 0000000000000286
&amp;lt;4&amp;gt;&amp;lt;d&amp;gt; 0000000000000064 0000000056d7154b ffff8808224c9de0 ffff8808224cf520
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a2fb80&amp;gt;] ? check_for_next_transno+0x0/0x6d0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a2ce2a&amp;gt;] target_recovery_overseer+0x27a/0x630 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a2af30&amp;gt;] ? exp_req_replay_healthy_or_from_mdt+0x0/0x40 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0aec827&amp;gt;] ? dtrq_destroy+0x497/0x630 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a333db&amp;gt;] target_recovery_thread+0x8bb/0x1dd0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0a32b20&amp;gt;] ? target_recovery_thread+0x0/0x1dd0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109e78e&amp;gt;] kthread+0x9e/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109e6f0&amp;gt;] ? kthread+0x0/0xc0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&amp;lt;4&amp;gt;Code: 02 00 00 48 c7 c6 84 a7 b1 a0 48 c7 05 26 31 0b 00 00 00 00 00 c7 05 14 31 0b 00 00 00 08 00 48 c7 c7 80 f3 b9 a0 49 8b 44 24 10 &amp;lt;48&amp;gt; 8b 10 31 c0 48 83 c2 40 e8 12 aa c5 ff 48 89 d8 4c 8b 65 f0
&amp;lt;1&amp;gt;RIP  [&amp;lt;ffffffffa0aec290&amp;gt;] distribute_txn_get_next_transno+0xb0/0xd0 [ptlrpc]
&amp;lt;4&amp;gt; RSP &amp;lt;ffff8808224c9d30&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The crash dump file can be provided on demand.&lt;/p&gt;</comment>
<comment id="159302" author="heckes" created="Wed, 20 Jul 2016 12:46:52 +0000"  >&lt;p&gt;The error did not occur during the soak test of build &lt;a href=&quot;https://build.hpdd.intel.com/job/lustre-master/3406&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.hpdd.intel.com/job/lustre-master/3406&lt;/a&gt; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713&lt;/a&gt;) in a test session that is still ongoing and has already lasted 7 days.&lt;/p&gt;</comment>
                            <comment id="165913" author="gerrit" created="Tue, 13 Sep 2016 20:02:19 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/18651/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/18651/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7809&quot; title=&quot;general protection fault: 0000 during failback of MDS disk resources&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7809&quot;&gt;&lt;del&gt;LU-7809&lt;/del&gt;&lt;/a&gt; lod: stop recovery before destory dtrq list&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: f2892fda72897a8a264414c06e54751d127a5709&lt;/p&gt;</comment>
                            <comment id="165923" author="pjones" created="Tue, 13 Sep 2016 20:05:26 +0000"  >&lt;p&gt;Landed for 2.9&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="37815">LU-8325</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="20509" name="console-lola-8.log.bz2" size="197944" author="heckes" created="Wed, 24 Feb 2016 09:34:10 +0000"/>
                            <attachment id="20510" name="messages-lola-8.log.bz2" size="66548" author="heckes" created="Wed, 24 Feb 2016 09:34:10 +0000"/>
                            <attachment id="20511" name="vmcore-dmesg.txt.bz2" size="33132" author="heckes" created="Wed, 24 Feb 2016 09:34:10 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzy2jz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>