<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:11:30 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7739] replay-single test 70b hangs with LBUG &apos;(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&amp;cld-&gt;cld_refcount) &gt; 0 )&apos;</title>
                <link>https://jira.whamcloud.com/browse/LU-7739</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;replay-single test_70b times out. In the MDS 2, MDS 3, and MDS 4 console logs, we see:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;13:15:37:LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. quota=on. Opts: 
13:15:37:LustreError: 25429:0:(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&amp;amp;cld-&amp;gt;cld_refcount) &amp;gt; 0 ) failed: 
13:15:37:LustreError: 25429:0:(mgc_request.c:995:mgc_blocking_ast()) LBUG
13:15:37:Pid: 25429, comm: ldlm_bl_01
13:15:37:
13:15:37:Call Trace:
13:15:37: [&amp;lt;ffffffffa0467875&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:15:37: [&amp;lt;ffffffffa0467e77&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
13:15:37: [&amp;lt;ffffffffa0cff9d9&amp;gt;] mgc_blocking_ast+0x6e9/0x810 [mgc]
13:15:37: [&amp;lt;ffffffffa0758b57&amp;gt;] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
13:15:37: [&amp;lt;ffffffffa07779ba&amp;gt;] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
13:15:37: [&amp;lt;ffffffffa077c55c&amp;gt;] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
13:15:37: [&amp;lt;ffffffffa0cff3db&amp;gt;] mgc_blocking_ast+0xeb/0x810 [mgc]
13:15:37: [&amp;lt;ffffffffa0cff2f0&amp;gt;] ? mgc_blocking_ast+0x0/0x810 [mgc]
13:15:37: [&amp;lt;ffffffffa0780c90&amp;gt;] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
13:15:37: [&amp;lt;ffffffffa0781ba1&amp;gt;] ldlm_bl_thread_main+0x481/0x710 [ptlrpc]
13:15:37: [&amp;lt;ffffffff810672b0&amp;gt;] ? default_wake_function+0x0/0x20
13:15:37: [&amp;lt;ffffffffa0781720&amp;gt;] ? ldlm_bl_thread_main+0x0/0x710 [ptlrpc]
13:15:37: [&amp;lt;ffffffff810a0fce&amp;gt;] kthread+0x9e/0xc0
13:15:37: [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
13:15:37: [&amp;lt;ffffffff810a0f30&amp;gt;] ? kthread+0x0/0xc0
13:15:37: [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
13:15:37:
13:15:37:Kernel panic - not syncing: LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
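
&lt;p&gt;For context, the assertion verifies that the config-log descriptor (cld) is still pinned by a reference when the MGC blocking AST runs. The following is a minimal userspace sketch of that refcount pattern, illustrative only and not the actual mgc_request.c source; the struct and field names are taken from the log:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/* Illustrative model only; assumes the descriptor is freed on its last put. */
#include &amp;lt;assert.h&amp;gt;
#include &amp;lt;stdatomic.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;

struct config_llog_data {
        atomic_int cld_refcount;        /* pins the descriptor */
};

static void config_log_put(struct config_llog_data *cld)
{
        if (atomic_fetch_sub(&amp;amp;cld-&amp;gt;cld_refcount, 1) == 1)
                free(cld);              /* last reference: descriptor is gone */
}

/* Model of the blocking AST: the caller must still hold a reference,
 * which is exactly what the LBUG above checks. */
static void blocking_ast(struct config_llog_data *cld)
{
        assert(atomic_load(&amp;amp;cld-&amp;gt;cld_refcount) &amp;gt; 0);
        /* ... cancellation work that dereferences cld ... */
}

int main(void)
{
        struct config_llog_data *cld = malloc(sizeof(*cld));

        atomic_init(&amp;amp;cld-&amp;gt;cld_refcount, 1);
        blocking_ast(cld);      /* reference held: assertion passes */
        config_log_put(cld);    /* refcount drops to 0, cld is freed */
        /* A lock-cancel callback racing in after this point would see a
         * zero refcount and trip the assertion, as in the LBUG above. */
        printf("descriptor released; a late callback would assert\n");
        return 0;
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;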

&lt;p&gt;In the past month, I can find only two occurrences of this error for test_70b. Logs are at&lt;br/&gt;
2016-01-28 15:21:30 - &lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/c296d92c-c620-11e5-b4e1-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/c296d92c-c620-11e5-b4e1-5254006e85c2&lt;/a&gt;&lt;br/&gt;
2016-02-03 19:34:24 - &lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/e4674cb8-caf7-11e5-be8d-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/e4674cb8-caf7-11e5-be8d-5254006e85c2&lt;/a&gt;&lt;/p&gt;</description>
                <environment>autotest review-dne-part-2</environment>
        <key id="34490">LU-7739</key>
            <summary>replay-single test 70b hangs with LBUG &apos;(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&amp;cld-&gt;cld_refcount) &gt; 0 )&apos;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                    </labels>
                <created>Thu, 4 Feb 2016 16:29:16 +0000</created>
                <updated>Tue, 14 Dec 2021 22:19:07 +0000</updated>
                            <resolved>Tue, 14 Dec 2021 22:19:07 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                    <version>Lustre 2.9.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="143359" author="jamesanunez" created="Tue, 23 Feb 2016 16:21:38 +0000"  >&lt;p&gt;A similar error on sanity test_133g on unmount of MDS4. The LBUG/ASSERT is the same as above, but the stack trace differs&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;13:44:11:Lustre: DEBUG MARKER: umount -d -f /mnt/mds4
13:44:11:LustreError: 5639:0:(osp_object.c:588:osp_attr_get()) lustre-MDT0000-osp-MDT0003:osp_attr_get update error [0x200000403:0x1:0x0]: rc = -5
13:44:11:LustreError: 5639:0:(osp_object.c:588:osp_attr_get()) Skipped 2 previous similar messages
13:44:11:LustreError: 5639:0:(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&amp;amp;cld-&amp;gt;cld_refcount) &amp;gt; 0 ) failed: 
13:44:11:LustreError: 5639:0:(mgc_request.c:995:mgc_blocking_ast()) LBUG
13:44:11:Pid: 5639, comm: umount
13:44:11:
13:44:11:Call Trace:
13:44:11: [&amp;lt;ffffffffa0467875&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
13:44:11: [&amp;lt;ffffffffa0467e77&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
13:44:11: [&amp;lt;ffffffffa0cc48b9&amp;gt;] mgc_blocking_ast+0x6e9/0x810 [mgc]
13:44:11: [&amp;lt;ffffffffa074ba87&amp;gt;] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa0472cf1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
13:44:11: [&amp;lt;ffffffffa07698da&amp;gt;] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076ccae&amp;gt;] ldlm_cli_cancel_list_local+0xee/0x290 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076d011&amp;gt;] ldlm_cancel_resource_local+0x1c1/0x290 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076d177&amp;gt;] ldlm_cli_cancel_unused_resource+0x97/0x280 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076d545&amp;gt;] ldlm_cli_hash_cancel_unused+0x35/0x40 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa04787cb&amp;gt;] cfs_hash_for_each_relax+0x1eb/0x350 [libcfs]
13:44:11: [&amp;lt;ffffffffa076d510&amp;gt;] ? ldlm_cli_hash_cancel_unused+0x0/0x40 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076d510&amp;gt;] ? ldlm_cli_hash_cancel_unused+0x0/0x40 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa076d510&amp;gt;] ? ldlm_cli_hash_cancel_unused+0x0/0x40 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa047a6ac&amp;gt;] cfs_hash_for_each_nolock+0x8c/0x1d0 [libcfs]
13:44:11: [&amp;lt;ffffffffa076d4c6&amp;gt;] ldlm_cli_cancel_unused+0x166/0x1b0 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa075ae98&amp;gt;] client_disconnect_export+0x228/0x460 [ptlrpc]
13:44:11: [&amp;lt;ffffffffa057af3a&amp;gt;] lustre_common_put_super+0x28a/0xb00 [obdclass]
13:44:11: [&amp;lt;ffffffffa05a0246&amp;gt;] server_put_super+0x116/0xcd0 [obdclass]
13:44:11: [&amp;lt;ffffffff811946eb&amp;gt;] generic_shutdown_super+0x5b/0xe0
13:44:11: [&amp;lt;ffffffff811947d6&amp;gt;] kill_anon_super+0x16/0x60
13:44:11: [&amp;lt;ffffffffa0572616&amp;gt;] lustre_kill_super+0x36/0x60 [obdclass]
13:44:11: [&amp;lt;ffffffff81194f77&amp;gt;] deactivate_super+0x57/0x80
13:44:11: [&amp;lt;ffffffff811b4f5f&amp;gt;] mntput_no_expire+0xbf/0x110
13:44:11: [&amp;lt;ffffffff811b5aab&amp;gt;] sys_umount+0x7b/0x3a0
13:44:11: [&amp;lt;ffffffff8100b0d2&amp;gt;] system_call_fastpath+0x16/0x1b
13:44:11:
13:44:11:Kernel panic - not syncing: LBUG
13:44:11:Pid: 5639, comm: umount Not tainted 2.6.32-573.18.1.el6_lustre.g55ae312.x86_64 #1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
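
&lt;p&gt;Here the same assertion is reached through the cancel-unused walk at disconnect time (cfs_hash_for_each_nolock into ldlm_cli_hash_cancel_unused), which invokes the blocking callback of every unused lock. A rough sketch of that shape, again illustrative rather than actual Lustre source:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/* Illustrative shape of the umount path; not Lustre source. Disconnect
 * walks every unused lock and runs its blocking callback, so a callback
 * whose descriptor was already released trips the same assertion from
 * this second stack. */
#include &amp;lt;stdio.h&amp;gt;

struct lock {
        int unused;                            /* no users remain */
        void (*blocking_ast)(struct lock *);   /* e.g. the MGC callback */
};

static void cancel_unused(struct lock *locks, int n)
{
        for (int i = 0; i &amp;lt; n; i++)
                if (locks[i].unused &amp;amp;&amp;amp; locks[i].blocking_ast)
                        locks[i].blocking_ast(&amp;amp;locks[i]); /* may LBUG */
}

static void mgc_ast(struct lock *lck)
{
        (void)lck;
        puts("blocking AST invoked during disconnect");
}

int main(void)
{
        struct lock locks[] = {
                { .unused = 1, .blocking_ast = mgc_ast },
                { .unused = 0, .blocking_ast = mgc_ast },  /* still busy */
        };

        cancel_unused(locks, 2);
        return 0;
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;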

&lt;p&gt;Logs are at&lt;br/&gt;
2016-02-22 19:06:02 - &lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/102b46ac-d9e8-11e5-9e9f-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/102b46ac-d9e8-11e5-9e9f-5254006e85c2&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="149320" author="yujian" created="Mon, 18 Apr 2016 18:52:12 +0000"  >&lt;p&gt;Lustre Branch: master&lt;br/&gt;
Test Group: review-dne-part-2&lt;/p&gt;

&lt;p&gt;insanity test 11 hit the same failure on MDS 2:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: DEBUG MARKER: mkdir -p /mnt/mds2; mount -t lustre                                  /dev/lvm-Role_MDS/P2 /mnt/mds2
LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts: 
LustreError: 31250:0:(mgc_request.c:995:mgc_blocking_ast()) ASSERTION( atomic_read(&amp;amp;cld-&amp;gt;cld_refcount) &amp;gt; 0 ) failed:  
LustreError: 31250:0:(mgc_request.c:995:mgc_blocking_ast()) LBUG
Pid: 31250, comm: ldlm_bl_01

Call Trace:
 [&amp;lt;ffffffffa0464875&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 
 [&amp;lt;ffffffffa0464e77&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs] 
 [&amp;lt;ffffffffa044b8b9&amp;gt;] mgc_blocking_ast+0x6e9/0x810 [mgc]
 [&amp;lt;ffffffffa075ea87&amp;gt;] ldlm_cancel_callback+0x87/0x280 [ptlrpc] 
 [&amp;lt;ffffffffa077c4da&amp;gt;] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc] 
 [&amp;lt;ffffffffa078107c&amp;gt;] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc] 
 [&amp;lt;ffffffffa044b2bb&amp;gt;] mgc_blocking_ast+0xeb/0x810 [mgc]
 [&amp;lt;ffffffffa044b1d0&amp;gt;] ? mgc_blocking_ast+0x0/0x810 [mgc]
 [&amp;lt;ffffffffa0785400&amp;gt;] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc] 
 [&amp;lt;ffffffffa0786291&amp;gt;] ldlm_bl_thread_main+0x461/0x650 [ptlrpc] 
 [&amp;lt;ffffffff81067670&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0785e30&amp;gt;] ? ldlm_bl_thread_main+0x0/0x650 [ptlrpc] 
 [&amp;lt;ffffffff810a138e&amp;gt;] kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c28a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffff810a12f0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20

Kernel panic - not syncing: LBUG
Pid: 31250, comm: ldlm_bl_01 Not tainted 2.6.32-573.22.1.el6_lustre.x86_64 #1
Call Trace:
 [&amp;lt;ffffffff815394d1&amp;gt;] ? panic+0xa7/0x16f
 [&amp;lt;ffffffffa0464ecb&amp;gt;] ? lbug_with_loc+0x9b/0xb0 [libcfs] 
 [&amp;lt;ffffffffa044b8b9&amp;gt;] ? mgc_blocking_ast+0x6e9/0x810 [mgc]
 [&amp;lt;ffffffffa075ea87&amp;gt;] ? ldlm_cancel_callback+0x87/0x280 [ptlrpc] 
 [&amp;lt;ffffffffa077c4da&amp;gt;] ? ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc] 
 [&amp;lt;ffffffffa078107c&amp;gt;] ? ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc] 
 [&amp;lt;ffffffffa044b2bb&amp;gt;] ? mgc_blocking_ast+0xeb/0x810 [mgc]
 [&amp;lt;ffffffffa044b1d0&amp;gt;] ? mgc_blocking_ast+0x0/0x810 [mgc]
 [&amp;lt;ffffffffa0785400&amp;gt;] ? ldlm_handle_bl_callback+0x130/0x400 [ptlrpc] 
 [&amp;lt;ffffffffa0786291&amp;gt;] ? ldlm_bl_thread_main+0x461/0x650 [ptlrpc] 
 [&amp;lt;ffffffff81067670&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0785e30&amp;gt;] ? ldlm_bl_thread_main+0x0/0x650 [ptlrpc] 
 [&amp;lt;ffffffff810a138e&amp;gt;] ? kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c28a&amp;gt;] ? child_rip+0xa/0x20
 [&amp;lt;ffffffff810a12f0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/595a0f92-030b-11e6-b5f1-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/595a0f92-030b-11e6-b5f1-5254006e85c2&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="172293" author="ake_s" created="Fri, 4 Nov 2016 09:55:30 +0000"  >&lt;p&gt;We&apos;re sometimes seeing a crash almost identical to the one from James Nunez, at umount.&lt;br/&gt;
Using tag 2.8.56 + the fixes for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6808&quot; title=&quot;Interop 2.5.3&amp;lt;-&amp;gt;master sanity test_224c: Bulk IO write error&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6808&quot;&gt;&lt;del&gt;LU-6808&lt;/del&gt;&lt;/a&gt; on the servers, and 2.5.41_DDN on the clients.&lt;/p&gt;

&lt;p&gt;This is on a production system.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="32646">LU-7298</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="31074">LU-6844</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzy0b3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>