<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:39:29 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10935] sanityn test 16a hangs with &#8216;NMI watchdog: BUG: soft lockup&#8217;</title>
                <link>https://jira.whamcloud.com/browse/LU-10935</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;sanityn test_16a hangs. The last output in the client test_log is:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== sanityn test 16a: 500 iterations of dual-mount fsx ================================================ 03:55:24 (1524196524)
CMD: trevis-4vm9 /usr/sbin/lctl get_param -n lod.lustre-MDT0000*.stripesize
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The core file can be found at /scratch/dumps/trevis-4vm8.trevis.hpdd.intel.com/10.9.4.36-2018-04-20-03:58:14&lt;/p&gt;

&lt;p&gt;The OST console shows the following stack trace:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[10146.583339] Lustre: DEBUG MARKER: == sanityn test 16a: 500 iterations of dual-mount fsx ================================================ 03:55:24 (1524196524)
[10252.159310] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [ll_ost_io00_003:23498]
[10252.159310] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core dm_mod zfs(POE) zunicode(POE) zavl(POE) icp(POE) iosf_mbi crc32_pclmul ghash_clmulni_intel zcommon(POE) znvpair(POE) ppdev spl(OE) aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev virtio_balloon pcspkr i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix virtio_blk cirrus drm_kms_helper libata syscopyarea sysfillrect 8139too sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common drm crc32c_intel serio_raw virtio_pci virtio_ring virtio 8139cp i2c_core mii floppy
[10252.159310] CPU: 1 PID: 23498 Comm: ll_ost_io00_003 Tainted: P           OE  ------------   3.10.0-693.21.1.el7_lustre.x86_64 #1
[10252.159310] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[10252.159310] task: ffff880059e56eb0 ti: ffff880059538000 task.ti: ffff880059538000
[10252.159310] RIP: 0010:[&amp;lt;ffffffff81337f93&amp;gt;]  [&amp;lt;ffffffff81337f93&amp;gt;] memset+0x33/0xb0
[10252.159310] RSP: 0018:ffff88005953b820  EFLAGS: 00010212
[10252.159310] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000fbf
[10252.159310] RDX: 0000000000100000 RSI: 0000000000000000 RDI: ffffc900140ec000
[10252.159310] RBP: ffff88005953b8c0 R08: ffff880036aab690 R09: 0000000000000000
[10252.159310] R10: ffffc9001402b000 R11: ffffea000116bc00 R12: ffffffffc04a7d3b
[10252.159310] R13: ffff88005953b7b0 R14: ffff880059a5b000 R15: ffffffffc056e667
[10252.159310] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[10252.159310] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10252.159310] CR2: 00007fd2559ae000 CR3: 0000000079462000 CR4: 00000000000606e0
[10252.159310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[10252.159310] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[10252.159310] Call Trace:
[10252.159310]  [&amp;lt;ffffffffc04b2cc7&amp;gt;] ? dbuf_read+0x397/0x9e0 [zfs]
[10252.159310]  [&amp;lt;ffffffffc0575128&amp;gt;] ? zio_done+0x748/0xd20 [zfs]
[10252.159310]  [&amp;lt;ffffffffc056eb8c&amp;gt;] ? zio_destroy+0x7c/0x80 [zfs]
[10252.159310]  [&amp;lt;ffffffffc04b66d9&amp;gt;] dmu_buf_will_dirty+0x119/0x130 [zfs]
[10252.159310]  [&amp;lt;ffffffffc04bcf15&amp;gt;] dmu_write_impl+0x45/0xd0 [zfs]
[10252.159310]  [&amp;lt;ffffffffc04beb57&amp;gt;] dmu_write.part.7+0xa7/0x110 [zfs]
[10252.159310]  [&amp;lt;ffffffffc04bed36&amp;gt;] dmu_assign_arcbuf+0x156/0x1a0 [zfs]
[10252.159310]  [&amp;lt;ffffffffc10cfdcd&amp;gt;] osd_write_commit+0x46d/0xa00 [osd_zfs]
[10252.159310]  [&amp;lt;ffffffffc120b29a&amp;gt;] ofd_commitrw_write+0xf9a/0x1d00 [ofd]
[10252.159310]  [&amp;lt;ffffffffc120f112&amp;gt;] ofd_commitrw+0x4b2/0xa10 [ofd]
[10252.159310]  [&amp;lt;ffffffffc0c36c39&amp;gt;] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[10252.159310]  [&amp;lt;ffffffffc0ef8430&amp;gt;] tgt_brw_write+0x1180/0x1d70 [ptlrpc]
[10252.159310] INFO: rcu_sched detected stalls on CPUs/tasks:
[10252.159310]  [&amp;lt;ffffffffc0e4b940&amp;gt;] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[10252.159310]  [&amp;lt;ffffffffc0ef994a&amp;gt;] tgt_request_handle+0x92a/0x13b0 [ptlrpc]
[10252.159310]  [&amp;lt;ffffffffc0e9da53&amp;gt;] ptlrpc_server_handle_request+0x253/0xab0 [ptlrpc]
[10252.159310]  [&amp;lt;ffffffff810bdc4b&amp;gt;] ? __wake_up_common+0x5b/0x90
[10252.159310]  [&amp;lt;ffffffffc0ea1202&amp;gt;] ptlrpc_main+0xa92/0x1f40 [ptlrpc]
[10252.159310]  [&amp;lt;ffffffffc0ea0770&amp;gt;] ? ptlrpc_register_service+0xe90/0xe90 [ptlrpc]
[10252.159310]  [&amp;lt;ffffffff810b4031&amp;gt;] kthread+0xd1/0xe0
[10252.159310]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[10252.159310]  [&amp;lt;ffffffff816c0577&amp;gt;] ret_from_fork+0x77/0xb0
[10252.159310]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[10252.159310] Code: b8 01 01 01 01 01 01 01 01 48 0f af c1 41 89 f9 41 83 e1 07 75 70 48 89 d1 48 c1 e9 06 74 39 66 0f 1f 84 00 00 00 00 00 48 ff c9 &amp;lt;48&amp;gt; 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 
&#8230;

[10274.555013] } (detected by 0, t=60299 jiffies, g=942723, c=942722, q=34)
[10274.555013] Task dump for CPU 1:
[10274.555013] ll_ost_io00_003 R  running task        0 23498      2 0x00000088
[10274.555013] Call Trace:
[10274.555013]  [&amp;lt;ffffffffc04bed36&amp;gt;] ? dmu_assign_arcbuf+0x156/0x1a0 [zfs]
[10274.555013]  [&amp;lt;ffffffffc10cfdcd&amp;gt;] ? osd_write_commit+0x46d/0xa00 [osd_zfs]
[10274.555013]  [&amp;lt;ffffffffc120b29a&amp;gt;] ? ofd_commitrw_write+0xf9a/0x1d00 [ofd]
[10274.555013]  [&amp;lt;ffffffffc120f112&amp;gt;] ? ofd_commitrw+0x4b2/0xa10 [ofd]
[10274.555013]  [&amp;lt;ffffffffc0c36c39&amp;gt;] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[10274.555013]  [&amp;lt;ffffffffc0ef8430&amp;gt;] ? tgt_brw_write+0x1180/0x1d70 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffffc0e4b940&amp;gt;] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffffc0ef994a&amp;gt;] ? tgt_request_handle+0x92a/0x13b0 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffffc0e9da53&amp;gt;] ? ptlrpc_server_handle_request+0x253/0xab0 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffff810bdc4b&amp;gt;] ? __wake_up_common+0x5b/0x90
[10274.555013]  [&amp;lt;ffffffffc0ea1202&amp;gt;] ? ptlrpc_main+0xa92/0x1f40 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffffc0ea0770&amp;gt;] ? ptlrpc_register_service+0xe90/0xe90 [ptlrpc]
[10274.555013]  [&amp;lt;ffffffff810b4031&amp;gt;] ? kthread+0xd1/0xe0
[10274.555013]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[10274.555013]  [&amp;lt;ffffffff816c0577&amp;gt;] ? ret_from_fork+0x77/0xb0
[10274.555013]  [&amp;lt;ffffffff810b3f60&amp;gt;] ? insert_kthread_work+0x40/0x40
[10304.489700] WARNING: MMP writes to pool &apos;lustre-ost3&apos; have not succeeded in over 54s; suspending pool
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We&#8217;ve seen this failure only once, in review-dne-zfs. Logs are at&lt;br/&gt;
&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/614c9d28-444f-11e8-95c0-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/614c9d28-444f-11e8-95c0-52540065bddc&lt;/a&gt;&lt;/p&gt;

</description>
                <environment>DNE/ZFS</environment>
        <key id="51925">LU-10935</key>
            <summary>sanityn test 16a hangs with &#8216;NMI watchdog: BUG: soft lockup&#8217;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                            <label>DNE</label>
                            <label>zfs</label>
                    </labels>
                <created>Fri, 20 Apr 2018 17:40:17 +0000</created>
                <updated>Sat, 15 Dec 2018 18:06:27 +0000</updated>
                                            <version>Lustre 2.12.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzw53:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>