<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:38:21 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10806] Hard crash when mounting DNE MDT</title>
                <link>https://jira.whamcloud.com/browse/LU-10806</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Attempting to re-mount the filesystem after the upgrade, we get a hard crash on MDT0001.&lt;/p&gt;

&lt;p&gt;Crash is repeatable. I will leave the system in this state for examination, then re-format non-DNE.&lt;/p&gt;

&lt;p&gt;Crash dumps are available on soak&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt; [  451.170602] LDISKFS-fs warning (device dm-2): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[  493.737484] LDISKFS-fs (dm-2): recovery complete
[  493.793102] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc
[  495.357987] LustreError: 2384:0:(tgt_lastrcvd.c:1533:tgt_clients_data_init()) soaked-MDT0001: duplicate export &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; client generation 11
[  495.646489] LustreError: 2384:0:(obd_config.c:559:class_setup()) setup soaked-MDT0001 failed (-114)
[  495.646493] LustreError: 2384:0:(obd_config.c:1822:class_config_llog_handler()) MGC192.168.1.108@o2ib: cfg command failed: rc = -114
[  495.646497] Lustre:    cmd=cf003 0:soaked-MDT0001  1:soaked-MDT0001_UUID  2:1  3:soaked-MDT0001-mdtlov  4:f
[  495.646570] LustreError: 15c-8: MGC192.168.1.108@o2ib: The configuration from log &lt;span class=&quot;code-quote&quot;&gt;&apos;soaked-MDT0001&apos;&lt;/span&gt; failed (-114). This may be the result of communication errors between &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; node and the MGS, a bad configuration, or other errors. See the syslog &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; more information.
[  495.646587] LustreError: 2303:0:(obd_mount_server.c:1383:server_start_targets()) failed to start server soaked-MDT0001: -114
[  495.646728] LustreError: 2303:0:(obd_mount_server.c:1936:server_fill_super()) Unable to start targets: -114
[  495.646760] LustreError: 2303:0:(obd_config.c:610:class_cleanup()) Device 4 not setup
[  495.899986] BUG: unable to handle kernel NULL pointer dereference at 0000000000000378
[  495.899999] IP: [&amp;lt;ffffffff816b683c&amp;gt;] _raw_spin_lock+0xc/0x30
[  495.900002] PGD 0
[  495.900005] Oops: 0002 [#1] SMP
[  495.900073] Modules linked in: mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlx4_en(OE) sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd dm_round_robin iTCO_wdt iTCO_vendor_support ipmi_ssif sg joydev ipmi_si ipmi_devintf mei_me ioatdma ipmi_msghandler pcspkr wmi mei lpc_ich shpchp i2c_i801 dm_multipath
[  495.900107]  dm_mod nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx4_ib(OE) ib_core(OE) mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops isci ahci igb mpt2sas libsas ttm libahci ptp crct10dif_pclmul pps_core crct10dif_common mlx4_core(OE) raid_class drm libata crc32c_intel dca mlx_compat(OE) scsi_transport_sas i2c_algo_bit devlink i2c_core

[  495.900113] CPU: 10 PID: 2167 Comm: obd_zombid Tainted: P           OE  ------------   3.10.0-693.21.1.el7_lustre.x86_64 #1
[  495.900114] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  495.900117] task: ffff880036358fd0 ti: ffff8804176d0000 task.ti: ffff8804176d0000
[  495.900122] RIP: 0010:[&amp;lt;ffffffff816b683c&amp;gt;]  [&amp;lt;ffffffff816b683c&amp;gt;] _raw_spin_lock+0xc/0x30
[  495.900124] RSP: 0018:ffff8804176d3da8  EFLAGS: 00010246
[  495.900126] RAX: 0000000000000000 RBX: ffff88081503c800 RCX: 000000018040003f
[  495.900128] RDX: 0000000000000001 RSI: ffffea0020556b00 RDI: 0000000000000378
[  495.900129] RBP: ffff8804176d3de0 R08: ffff8808155acf00 R09: 000000018040003f
[  495.900131] R10: 0000000000000001 R11: ffffea0020556b00 R12: 0000000000000000
[  495.900133] R13: 0000000000000378 R14: ffff880817131068 R15: ffff88081503c800
[  495.900135] FS:  0000000000000000(0000) GS:ffff88082d880000(0000) knlGS:0000000000000000
[  495.900137] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  495.900139] CR2: 0000000000000378 CR3: 0000000001a02000 CR4: 00000000000607e0
[  495.900141] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  495.900143] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  495.900144] Call Trace:
[  495.900246]  [&amp;lt;ffffffffc0d6b635&amp;gt;] ? tgt_grant_discard+0x35/0x190 [ptlrpc]
[  495.900317]  [&amp;lt;ffffffffc0d3f74e&amp;gt;] ? tgt_client_free+0x17e/0x3b0 [ptlrpc]
[  495.900354]  [&amp;lt;ffffffffc177c097&amp;gt;] mdt_destroy_export+0x87/0x200 [mdt]
[  495.900410]  [&amp;lt;ffffffffc0a7b9be&amp;gt;] class_export_destroy+0xee/0x490 [obdclass]
[  495.900448]  [&amp;lt;ffffffffc0a8434a&amp;gt;] obd_zombie_impexp_cull+0x39a/0x550 [obdclass]
[  495.900479]  [&amp;lt;ffffffffc0a8456d&amp;gt;] obd_zombie_impexp_thread+0x6d/0x1c0 [obdclass]
[  495.900489]  [&amp;lt;ffffffff810c7c70&amp;gt;] ? wake_up_state+0x20/0x20
[  495.900519]  [&amp;lt;ffffffffc0a84500&amp;gt;] ? obd_zombie_impexp_cull+0x550/0x550 [obdclass]
[  495.900526]  [&amp;lt;ffffffff810b4031&amp;gt;] kthread+0xd1/0xe0
[  495.900530]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[  495.900537]  [&amp;lt;ffffffff816c0577&amp;gt;] ret_from_fork+0x77/0xb0
[  495.900541]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[  495.900576] Code: 5d c3 0f 1f 44 00 00 85 d2 74 e4 0f 1f 40 00 eb ed 66 0f 1f 44 00 00 b8 01 00 00 00 5d c3 90 66 66 66 66 90 31 c0 ba 01 00 00 00 &amp;lt;f0&amp;gt; 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 99 27 ff ff 5d
[  495.900580] RIP  [&amp;lt;ffffffff816b683c&amp;gt;] _raw_spin_lock+0xc/0x30
[  495.900581]  RSP &amp;lt;ffff8804176d3da8&amp;gt;
[  495.900582] CR2: 0000000000000378


&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>Soak stress cluster - lustre-master-ib build 64 version=2.10.58_139_g630cd49</environment>
        <key id="51316">LU-10806</key>
            <summary>Hard crash when mounting DNE MDT</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="cliffw">Cliff White</reporter>
                        <labels>
                    </labels>
                <created>Mon, 12 Mar 2018 17:57:26 +0000</created>
                <updated>Sat, 6 Oct 2018 13:20:41 +0000</updated>
                            <resolved>Sat, 6 Oct 2018 13:20:41 +0000</resolved>
                                    <version>Lustre 2.11.0</version>
                    <version>Lustre 2.12.0</version>
                                    <fixVersion>Lustre 2.12.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="223367" author="pjones" created="Mon, 12 Mar 2018 21:14:38 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please investigate this issue?&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="223581" author="laisiyao" created="Wed, 14 Mar 2018 09:20:05 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[  495.357987] LustreError: 2384:0:(tgt_lastrcvd.c:1533:tgt_clients_data_init()) soaked-MDT0001: duplicate export for client generation 11&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;shows the last_rcvd file is corrupt: two client records have the same generation. This caused the mount failure, and the error-handling code triggered the crash. So this is a bug in the error-handling code. I&apos;ll try to reproduce it and see how to fix it.&lt;/p&gt;

&lt;p&gt;I have a question about the upgrade: what was the original version?&lt;/p&gt;</comment>
                            <comment id="223629" author="cliffw" created="Wed, 14 Mar 2018 18:10:49 +0000"  >&lt;p&gt;The previous version was&#160;&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;latest master, plus:&#160;&lt;a href=&quot;https://review.whamcloud.com/#/c/31475/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/31475/&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;Lustre&#160;version=2.10.58_76_gbe9f2ee&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="223812" author="laisiyao" created="Fri, 16 Mar 2018 03:39:34 +0000"  >&lt;p&gt;Do other MDTs mount successfully? It would also be good to know whether this is reproducible.&lt;/p&gt;</comment>
                            <comment id="233543" author="sarah" created="Fri, 14 Sep 2018 18:45:55 +0000"  >&lt;p&gt;Hit a similar error when running with lustre-master version=2.11.54_103_gdeb5aba for about two and a half days:&lt;/p&gt;

&lt;p&gt;MDS console&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;soak-11 login: [  189.069605] LNet: HW NUMA nodes: 2, HW CPU cores: 32, npartitions: 2
[  189.079399] alg: No test for adler32 (adler32-zlib)
[  189.986576] Lustre: Lustre: Build Version: 2.11.54_103_gdeb5aba
[  190.277235] LNet: Using FMR for registration
[  190.294638] LNet: Added LNI 192.168.1.111@o2ib [8/256/0/180]
[  190.473995] LDISKFS-fs warning (device dm-5): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[  190.473995] 
[  232.980161] LDISKFS-fs (dm-5): recovery complete
[  232.985552] LDISKFS-fs (dm-5): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc
[  234.661896] LustreError: 4231:0:(tgt_lastrcvd.c:1540:tgt_clients_data_init()) soaked-MDT0003: duplicate export for client generation 5
[  234.790360] LustreError: 4231:0:(obd_config.c:559:class_setup()) setup soaked-MDT0003 failed (-114)
[  234.800544] LustreError: 4231:0:(obd_config.c:1835:class_config_llog_handler()) MGC192.168.1.108@o2ib: cfg command failed: rc = -114
[  234.813900] Lustre:    cmd=cf003 0:soaked-MDT0003  1:soaked-MDT0003_UUID  2:3  3:soaked-MDT0003-mdtlov  4:f  
[  234.813900] 
[  234.826773] LustreError: 15c-8: MGC192.168.1.108@o2ib: The configuration from log &apos;soaked-MDT0003&apos; failed (-114). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[  234.853307] LustreError: 4077:0:(obd_mount_server.c:1386:server_start_targets()) failed to start server soaked-MDT0003: -114
[  234.866025] LustreError: 4077:0:(obd_mount_server.c:1939:server_fill_super()) Unable to start targets: -114
[  234.877033] LustreError: 4077:0:(obd_config.c:610:class_cleanup()) Device 4 not setup
[  234.890447] BUG: unable to handle kernel NULL pointer dereference at 0000000000000380
[  234.899277] IP: [&amp;lt;ffffffffb6d1682c&amp;gt;] _raw_spin_lock+0xc/0x30
[  234.905658] PGD 0 
[  234.907933] Oops: 0002 [#1] SMP 
[  234.911578] Modules linked in: mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) dm_round_robin zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif iTCO_wdt iTCO_vendor_support i2c_i801 sg ipmi_si joydev mei_me mei lpc_ich ipmi_devintf ipmi_msghandler pcspkr shpchp ioatdma wmi dm_multipath dm_mod auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx4_ib(OE) ib_core(OE) mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm igb isci ptp mlx4_core(OE) mpt3sas ahci pps_core drm libsas libahci devlink crct10dif_pclmul crct10dif_common dca crc32c_intel raid_class i2c_algo_bit libata mlx_compat(OE) i2c_core scsi_transport_sas
[  235.028639] CPU: 19 PID: 230 Comm: kworker/19:1 Tainted: P           OE  ------------   3.10.0-862.9.1.el7_lustre.x86_64 #1
[  235.041116] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  235.055386] Workqueue: obd_zombid obd_zombie_exp_cull [obdclass]
[  235.062133] task: ffff945dabe1cf10 ti: ffff945dabe3c000 task.ti: ffff945dabe3c000
[  235.070513] RIP: 0010:[&amp;lt;ffffffffb6d1682c&amp;gt;]  [&amp;lt;ffffffffb6d1682c&amp;gt;] _raw_spin_lock+0xc/0x30
[  235.081017] RSP: 0018:ffff945dabe3fd98  EFLAGS: 00010246
[  235.088351] RAX: 0000000000000000 RBX: ffff945d7e112c00 RCX: 0000000000000956
[  235.097753] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000380
[  235.107136] RBP: ffff945dabe3fdd0 R08: 000000000001bac0 R09: ffffffffc15bf8ae
[  235.116518] R10: ffff945dae2dbac0 R11: ffffde280ff93b00 R12: 0000000000000000
[  235.125883] R13: 0000000000000380 R14: ffff945d7ffa1040 R15: 00000000000004c0
[  235.135230] FS:  0000000000000000(0000) GS:ffff945dae2c0000(0000) knlGS:0000000000000000
[  235.145653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  235.153454] CR2: 0000000000000380 CR3: 0000000462a0e000 CR4: 00000000000607e0
[  235.162816] Call Trace:
[  235.166982]  [&amp;lt;ffffffffc15ec0c5&amp;gt;] ? tgt_grant_discard+0x35/0x190 [ptlrpc]
[  235.175956]  [&amp;lt;ffffffffc15bf8ae&amp;gt;] ? tgt_client_free+0x17e/0x3b0 [ptlrpc]
[  235.185758]  [&amp;lt;ffffffffc1870097&amp;gt;] mdt_destroy_export+0x87/0x200 [mdt]
[  235.195292]  [&amp;lt;ffffffffc134d4fe&amp;gt;] class_export_destroy+0xee/0x490 [obdclass]
[  235.205422]  [&amp;lt;ffffffffc134d8b5&amp;gt;] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[  235.215313]  [&amp;lt;ffffffffb66b35ef&amp;gt;] process_one_work+0x17f/0x440
[  235.223979]  [&amp;lt;ffffffffb66b4686&amp;gt;] worker_thread+0x126/0x3c0
[  235.232344]  [&amp;lt;ffffffffb66b4560&amp;gt;] ? manage_workers.isra.24+0x2a0/0x2a0
[  235.241768]  [&amp;lt;ffffffffb66bb621&amp;gt;] kthread+0xd1/0xe0
[  235.249282]  [&amp;lt;ffffffffb66bb550&amp;gt;] ? insert_kthread_work+0x40/0x40
[  235.258143]  [&amp;lt;ffffffffb6d205f7&amp;gt;] ret_from_fork_nospec_begin+0x21/0x21
[  235.267438]  [&amp;lt;ffffffffb66bb550&amp;gt;] ? insert_kthread_work+0x40/0x40
[  235.276219] Code: 5d c3 0f 1f 44 00 00 85 d2 74 e4 0f 1f 40 00 eb ed 66 0f 1f 44 00 00 b8 01 00 00 00 5d c3 90 66 66 66 66 90 31 c0 ba 01 00 00 00 &amp;lt;f0&amp;gt; 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 c5 2c ff ff 5d 
[  235.302152] RIP  [&amp;lt;ffffffffb6d1682c&amp;gt;] _raw_spin_lock+0xc/0x30
[  235.310516]  RSP &amp;lt;ffff945dabe3fd98&amp;gt;
[  235.316349] CR2: 0000000000000380
[  235.321927] ---[ end trace c992470b75e3279d ]---
[  235.402279] Kernel panic - not syncing: Fatal exception
[  235.410133] Kernel Offset: 0x35600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  235.494959] ------------[ cut here ]------------
[  235.501354] WARNING: CPU: 19 PID: 230 at arch/x86/kernel/smp.c:127 native_smp_send_reschedule+0x65/0x70
[  235.513010] Modules linked in: mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) dm_round_robin zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif iTCO_wdt iTCO_vendor_support i2c_i801 sg ipmi_si joydev mei_me mei lpc_ich ipmi_devintf ipmi_msghandler pcspkr shpchp ioatdma wmi dm_multipath dm_mod auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx4_ib(OE) ib_core(OE) mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm igb isci ptp mlx4_core(OE) mpt3sas ahci pps_core drm libsas libahci devlink crct10dif_pclmul crct10dif_common dca crc32c_intel raid_class i2c_algo_bit libata mlx_compat(OE) i2c_core scsi_transport_sas
[  235.641969] CPU: 19 PID: 230 Comm: kworker/19:1 Tainted: P      D    OE  ------------   3.10.0-862.9.1.el7_lustre.x86_64 #1
[  235.655571] Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  235.672103] Workqueue: obd_zombid obd_zombie_exp_cull [obdclass]
[  235.679970] Call Trace:
[  235.683848]  &amp;lt;IRQ&amp;gt;  [&amp;lt;ffffffffb6d0e84e&amp;gt;] dump_stack+0x19/0x1b
[  235.691458]  [&amp;lt;ffffffffb6691e18&amp;gt;] __warn+0xd8/0x100
[  235.698043]  [&amp;lt;ffffffffb6691f5d&amp;gt;] warn_slowpath_null+0x1d/0x20
[  235.705703]  [&amp;lt;ffffffffb6654e95&amp;gt;] native_smp_send_reschedule+0x65/0x70
[  235.714144]  [&amp;lt;ffffffffb66ddf81&amp;gt;] trigger_load_balance+0x191/0x280
[  235.722184]  [&amp;lt;ffffffffb66cdc0a&amp;gt;] scheduler_tick+0x10a/0x150
[  235.729649]  [&amp;lt;ffffffffb6701c10&amp;gt;] ? tick_sched_do_timer+0x50/0x50
[  235.737599]  [&amp;lt;ffffffffb66a4f65&amp;gt;] update_process_times+0x65/0x80
[  235.745437]  [&amp;lt;ffffffffb6701a10&amp;gt;] tick_sched_handle+0x30/0x70
[  235.752977]  [&amp;lt;ffffffffb6701c49&amp;gt;] tick_sched_timer+0x39/0x80
[  235.760423]  [&amp;lt;ffffffffb66bf7e6&amp;gt;] __hrtimer_run_queues+0xd6/0x260
[  235.768326]  [&amp;lt;ffffffffb66bfd7f&amp;gt;] hrtimer_interrupt+0xaf/0x1d0
[  235.775951]  [&amp;lt;ffffffffb665847b&amp;gt;] local_apic_timer_interrupt+0x3b/0x60
[  235.784346]  [&amp;lt;ffffffffb6d25063&amp;gt;] smp_apic_timer_interrupt+0x43/0x60
[  235.792545]  [&amp;lt;ffffffffb6d217b2&amp;gt;] apic_timer_interrupt+0x162/0x170
[  235.800552]  &amp;lt;EOI&amp;gt;  [&amp;lt;ffffffffb6d08c3d&amp;gt;] ? panic+0x1d5/0x21f
[  235.808001]  [&amp;lt;ffffffffb6d08ba1&amp;gt;] ? panic+0x139/0x21f
[  235.814751]  [&amp;lt;ffffffffb6d18745&amp;gt;] oops_end+0xc5/0xe0
[  235.821391]  [&amp;lt;ffffffffb6d0807e&amp;gt;] no_context+0x285/0x2a8
[  235.828408]  [&amp;lt;ffffffffb6d08115&amp;gt;] __bad_area_nosemaphore+0x74/0x1d1
[  235.836493]  [&amp;lt;ffffffffb6d08286&amp;gt;] bad_area_nosemaphore+0x14/0x16
[  235.844301]  [&amp;lt;ffffffffb6d1b6e0&amp;gt;] __do_page_fault+0x330/0x4f0
[  235.851818]  [&amp;lt;ffffffffb66db5e8&amp;gt;] ? enqueue_task_fair+0x208/0x6c0
[  235.859687]  [&amp;lt;ffffffffb6d1b8d5&amp;gt;] do_page_fault+0x35/0x90
[  235.866774]  [&amp;lt;ffffffffb6d17758&amp;gt;] page_fault+0x28/0x30
[  235.873635]  [&amp;lt;ffffffffc15bf8ae&amp;gt;] ? tgt_client_free+0x17e/0x3b0 [ptlrpc]
[  235.882169]  [&amp;lt;ffffffffb6d1682c&amp;gt;] ? _raw_spin_lock+0xc/0x30
[  235.889469]  [&amp;lt;ffffffffc15ec0c5&amp;gt;] ? tgt_grant_discard+0x35/0x190 [ptlrpc]
[  235.898128]  [&amp;lt;ffffffffc15bf8ae&amp;gt;] ? tgt_client_free+0x17e/0x3b0 [ptlrpc]
[  235.906674]  [&amp;lt;ffffffffc1870097&amp;gt;] mdt_destroy_export+0x87/0x200 [mdt]
[  235.914904]  [&amp;lt;ffffffffc134d4fe&amp;gt;] class_export_destroy+0xee/0x490 [obdclass]
[  235.923792]  [&amp;lt;ffffffffc134d8b5&amp;gt;] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[  235.932481]  [&amp;lt;ffffffffb66b35ef&amp;gt;] process_one_work+0x17f/0x440
[  235.939970]  [&amp;lt;ffffffffb66b4686&amp;gt;] worker_thread+0x126/0x3c0
[  235.947140]  [&amp;lt;ffffffffb66b4560&amp;gt;] ? manage_workers.isra.24+0x2a0/0x2a0
[  235.955369]  [&amp;lt;ffffffffb66bb621&amp;gt;] kthread+0xd1/0xe0
[  235.961715]  [&amp;lt;ffffffffb66bb550&amp;gt;] ? insert_kthread_work+0x40/0x40
[  235.969404]  [&amp;lt;ffffffffb6d205f7&amp;gt;] ret_from_fork_nospec_begin+0x21/0x21
[  235.977546]  [&amp;lt;ffffffffb66bb550&amp;gt;] ? insert_kthread_work+0x40/0x40
[  235.985195] ---[ end trace c992470b75e3279e ]---
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="233980" author="sarah" created="Tue, 25 Sep 2018 16:21:42 +0000"  >&lt;p&gt;Soak hit this problem again after about 24 hours of running on tag-2.11.55.&lt;/p&gt;</comment>
                            <comment id="234008" author="gerrit" created="Wed, 26 Sep 2018 13:04:57 +0000"  >&lt;p&gt;Alexandr Boyko (c17825@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/33240&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33240&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10806&quot; title=&quot;Hard crash when mounting DNE MDT&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10806&quot;&gt;&lt;del&gt;LU-10806&lt;/del&gt;&lt;/a&gt; target: skip discard for a missing obt_lut&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 9935a66421e5b109dc90a4caaa6d83ebac31cde0&lt;/p&gt;</comment>
                            <comment id="234009" author="aboyko" created="Wed, 26 Sep 2018 13:11:56 +0000"  >&lt;p&gt;The patch fixes the crash during mount caused by a duplicate generation in last_rcvd. But I think the real problem is that two different records carry the same generation in last_rcvd.&lt;/p&gt;</comment>
                            <comment id="234466" author="gerrit" created="Fri, 5 Oct 2018 22:26:07 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/33240/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33240/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10806&quot; title=&quot;Hard crash when mounting DNE MDT&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10806&quot;&gt;&lt;del&gt;LU-10806&lt;/del&gt;&lt;/a&gt; target: skip discard for a missing obt_lut&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 5ed65fd0594741e69999216db27d85f7f6f7f5d6&lt;/p&gt;</comment>
                            <comment id="234524" author="pjones" created="Sat, 6 Oct 2018 13:20:41 +0000"  >&lt;p&gt;Landed for 2.12&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="52939">LU-11232</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="29803" name="vmcore-dmesg.txt" size="132462" author="cliffw" created="Mon, 12 Mar 2018 17:58:47 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzuan:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>