<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:35:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17406] sanity-flr test_50A: watchdog: BUG: soft lockup - CPU#0 stuck for 22s</title>
                <link>https://jira.whamcloud.com/browse/LU-17406</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for Andreas Dilger &amp;lt;adilger@whamcloud.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run:&lt;br/&gt;
&lt;a href=&quot;https://testing.whamcloud.com/test_sets/112570ae-2e64-4c60-bd13-b1447c7934fa&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/112570ae-2e64-4c60-bd13-b1447c7934fa&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;test_50A failed with the following error after both CPUs were locked up:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;onyx-99vm1 crash during sanity-flr test_50A
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Test session details:&lt;br/&gt;
clients: &lt;a href=&quot;https://build.whamcloud.com/job/lustre-reviews/101181&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.whamcloud.com/job/lustre-reviews/101181&lt;/a&gt; - 4.18.0-477.27.1.el8_8.x86_64&lt;br/&gt;
servers: &lt;a href=&quot;https://build.whamcloud.com/job/lustre-reviews/101181&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.whamcloud.com/job/lustre-reviews/101181&lt;/a&gt; - 4.18.0-477.27.1.el8_lustre.x86_64&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; Lustre: DEBUG MARKER: == sanity-flr test 50A: mirror split update layout generation ===== 19:25:25 (1704741925)
 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1&apos; &apos; /proc/mounts || true
 Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
 Lustre: Failing over lustre-MDT0000
 watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ldlm_bl_02:77462]
 watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ldlm_bl_03:80014]
 CPU: 1 PID: 80014 Comm: ldlm_bl_03 4.18.0-477.27.1.el8_lustre.x86_64 #1
 CPU: 0 PID: 77462 Comm: ldlm_bl_02 4.18.0-477.27.1.el8_lustre.x86_64 #1
 RIP: 0010:cfs_hash_for_each_relax+0x17b/0x480 [libcfs]
 Call Trace:
  kvm_wait+0x58/0x60
  __pv_queued_spin_lock_slowpath+0x268/0x2a0
  cfs_hash_for_each_nolock+0x126/0x1f0 [libcfs]
  ldlm_reprocess_recovery_done+0x8b/0x100 [ptlrpc]
  _raw_spin_lock+0x1e/0x30
  cfs_hash_for_each_relax+0x14a/0x480 [libcfs]
  cfs_hash_for_each_nolock+0x126/0x1f0 [libcfs]
  ldlm_reprocess_recovery_done+0x8b/0x100 [ptlrpc]
  ldlm_export_cancel_locks+0x172/0x180 [ptlrpc]
  ldlm_export_cancel_locks+0x172/0x180 [ptlrpc]
  ldlm_bl_thread_main+0x6df/0x940 [ptlrpc]
  ldlm_bl_thread_main+0x6df/0x940 [ptlrpc]
  kthread+0x134/0x150
  kthread+0x134/0x150
  ret_from_fork+0x35/0x40
  ret_from_fork+0x35/0x40
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The duplicated lines in the stack trace appear to result from both CPUs printing to the console at the same time; both threads appear to be in &lt;tt&gt;ldlm_export_cancel_locks()&lt;/tt&gt; and contending on the same spinlock.&lt;/p&gt;

&lt;p&gt;A similar stack also appeared in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17349&quot; title=&quot;sanity-quota test_81: Kernel panic - not syncing: softlockup: hung tasks&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17349&quot;&gt;&lt;del&gt;LU-17349&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV&lt;br/&gt;
sanity-flr test_50A - onyx-99vm1 crashed during sanity-flr test_50A&lt;/p&gt;</description>
        <environment></environment>
        <key id="79885">LU-17406</key>
        <summary>sanity-flr test_50A: watchdog: BUG: soft lockup - CPU#0 stuck for 22s</summary>
        <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
        <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
        <statusCategory id="2" key="new" colorName="default"/>
        <resolution id="-1">Unresolved</resolution>
        <assignee username="wc-triage">WC Triage</assignee>
        <reporter username="maloo">Maloo</reporter>
        <labels>
        </labels>
        <created>Tue, 9 Jan 2024 01:08:51 +0000</created>
        <updated>Tue, 9 Jan 2024 01:15:20 +0000</updated>
        <version>Lustre 2.16.0</version>
        <due></due>
        <votes>0</votes>
        <watches>5</watches>
        <comments>
            <comment id="398911" author="adilger" created="Tue, 9 Jan 2024 01:15:12 +0000">&lt;p&gt;I think there shouldn&apos;t be more than one thread evicting a client at once, so there should be some kind of flag on the export that puts any other thread to sleep (or has it return immediately) while the first thread cancels all of the locks.&lt;/p&gt;</comment>
                    </comments>
        <issuelinks>
            <issuelinktype id="10011">
                <name>Related</name>
                <outwardlinks description="is related to ">
                    <issuelink>
                        <issuekey id="79489">LU-17349</issuekey>
                    </issuelink>
                </outwardlinks>
                <inwardlinks description="is related to">
                    <issuelink>
                        <issuekey id="79489">LU-17349</issuekey>
                    </issuelink>
                </inwardlinks>
            </issuelinktype>
        </issuelinks>
        <attachments>
        </attachments>
        <subtasks>
        </subtasks>
        <customfields>
            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                <customfieldname>Development</customfieldname>
                <customfieldvalues>
                </customfieldvalues>
            </customfield>
            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                <customfieldname>Rank</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>1|i046tj:</customfieldvalue>
                </customfieldvalues>
            </customfield>
            <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                <customfieldname>Rank (Obsolete)</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>9223372036854775807</customfieldvalue>
                </customfieldvalues>
            </customfield>
            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                <customfieldname>Severity</customfieldname>
                <customfieldvalues>
                    <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>
                </customfieldvalues>
            </customfield>
        </customfields>
    </item>
</channel>
</rss>