<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:17:07 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1493] assertion in dqacq_completion() (count &lt; *hardlimit) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-1493</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The system hit an assertion with the following trace:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 12908:0:(quota_context.c:683:dqacq_completion()) ASSERTION(count &amp;lt; *hardlimit) failed: id(10912) flag(22) type(u) isblk(b) count(134217728) qd_qunit(134217728) hardlimit(131072).
LustreError: 12908:0:(quota_context.c:683:dqacq_completion()) LBUG
Pid: 12908, comm: mdt_01

Call Trace:
 [&amp;lt;ffffffffa04df855&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa04dfe95&amp;gt;] lbug_with_loc+0x75/0xe0 [libcfs]
 [&amp;lt;ffffffffa07d6d06&amp;gt;] dqacq_completion+0x1606/0x1610 [lquota]
 [&amp;lt;ffffffffa07eb84d&amp;gt;] ? quota_search_lqs+0x9d/0x5f0 [lquota]
 [&amp;lt;ffffffffa04e09ae&amp;gt;] ? cfs_free+0xe/0x10 [libcfs]
 [&amp;lt;ffffffff81085bf0&amp;gt;] ? getnstimeofday+0x60/0xf0
 [&amp;lt;ffffffffa07d4738&amp;gt;] schedule_dqacq+0xa08/0x19d0 [lquota]
 [&amp;lt;ffffffffa04e09ae&amp;gt;] ? cfs_free+0xe/0x10 [libcfs]
 [&amp;lt;ffffffffa07d15e8&amp;gt;] ? check_cur_qunit+0x448/0xb70 [lquota]
 [&amp;lt;ffffffffa07dc8ca&amp;gt;] ? quota_is_set+0x6a/0x2f0 [lquota]
 [&amp;lt;ffffffffa07d79c9&amp;gt;] qctxt_adjust_qunit+0x109/0x350 [lquota]
 [&amp;lt;ffffffff81085cea&amp;gt;] ? do_gettimeofday+0x1a/0x50
 [&amp;lt;ffffffffa07e299d&amp;gt;] mds_quota_adjust+0x2ad/0x3b0 [lquota]
 [&amp;lt;ffffffffa09f2001&amp;gt;] ? mdd_lov_create_finish+0x61/0xd0 [mdd]
 [&amp;lt;ffffffffa0a07982&amp;gt;] mdd_create+0x6c2/0x1db0 [mdd]
 [&amp;lt;ffffffffa04ee649&amp;gt;] ? cfs_hash_bd_add_locked+0x29/0x90 [libcfs]
 [&amp;lt;ffffffffa05a96de&amp;gt;] ? lu_object_find_at+0x3fe/0x770 [obdclass]
 [&amp;lt;ffffffffa09e5f08&amp;gt;] ? mdd_version_get+0x68/0xa0 [mdd]
 [&amp;lt;ffffffffa0a902bc&amp;gt;] cml_create+0xbc/0x280 [cmm]
 [&amp;lt;ffffffffa0a4d746&amp;gt;] ? mdt_version_save+0x96/0x170 [mdt]
 [&amp;lt;ffffffffa0a65747&amp;gt;] mdt_reint_open+0x1f67/0x2d90 [mdt]
 [&amp;lt;ffffffff81003ace&amp;gt;] ? common_interrupt+0xe/0x13
 [&amp;lt;ffffffffa0a0e586&amp;gt;] ? md_ucred+0x26/0x60 [mdd]
 [&amp;lt;ffffffffa0a305f5&amp;gt;] ? mdt_ucred+0x15/0x20 [mdt]
 [&amp;lt;ffffffffa0a4786f&amp;gt;] ? mdt_root_squash+0x2f/0x450 [mdt]
 [&amp;lt;ffffffffa0a4cabf&amp;gt;] mdt_reint_rec+0x3f/0x100 [mdt]
 [&amp;lt;ffffffffa069cd74&amp;gt;] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
 [&amp;lt;ffffffffa0a44f64&amp;gt;] mdt_reint_internal+0x6d4/0x9f0 [mdt]
 [&amp;lt;ffffffffa0a32cde&amp;gt;] ? mdt_intent_fixup_resent+0x4e/0x270 [mdt]
 [&amp;lt;ffffffffa0a455e5&amp;gt;] mdt_intent_reint+0x245/0x600 [mdt]
 [&amp;lt;ffffffffa04ef615&amp;gt;] ? cfs_hash_bd_lookup_intent+0xe5/0x130 [libcfs]
 [&amp;lt;ffffffffa069e170&amp;gt;] ? lustre_swab_ldlm_intent+0x0/0x20 [ptlrpc]
 [&amp;lt;ffffffffa0a3d630&amp;gt;] mdt_intent_policy+0x3c0/0x6b0 [mdt]
 [&amp;lt;ffffffff810f18c6&amp;gt;] ? __perf_event_task_sched_out+0x36/0x50
 [&amp;lt;ffffffffa0587441&amp;gt;] ? class_handle_hash+0xa1/0x280 [obdclass]
 [&amp;lt;ffffffffa0655afa&amp;gt;] ldlm_lock_enqueue+0x2da/0xa50 [ptlrpc]
 [&amp;lt;ffffffffa0674495&amp;gt;] ? ldlm_export_lock_get+0x15/0x20 [ptlrpc]
 [&amp;lt;ffffffffa04ee682&amp;gt;] ? cfs_hash_bd_add_locked+0x62/0x90 [libcfs]
 [&amp;lt;ffffffffa067c577&amp;gt;] ldlm_handle_enqueue0+0x447/0x1090 [ptlrpc]
 [&amp;lt;ffffffffa0a313a1&amp;gt;] ? mdt_unpack_req_pack_rep+0x51/0x5d0 [mdt]
 [&amp;lt;ffffffffa0a3d0ca&amp;gt;] mdt_enqueue+0x4a/0x110 [mdt]
 [&amp;lt;ffffffffa0a37865&amp;gt;] mdt_handle_common+0x8d5/0x1810 [mdt]
 [&amp;lt;ffffffffa069a4f4&amp;gt;] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
 [&amp;lt;ffffffffa0a38875&amp;gt;] mdt_regular_handle+0x15/0x20 [mdt]
 [&amp;lt;ffffffffa06ab239&amp;gt;] ptlrpc_main+0xc79/0x19d0 [ptlrpc]
 [&amp;lt;ffffffff810017bc&amp;gt;] ? __switch_to+0x1ac/0x320
 [&amp;lt;ffffffffa06aa5c0&amp;gt;] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [&amp;lt;ffffffff810041aa&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffffa06aa5c0&amp;gt;] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [&amp;lt;ffffffffa06aa5c0&amp;gt;] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [&amp;lt;ffffffff810041a0&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After the MDS went back into production, it failed again with the same trace and the same assertion (with the same parameters), but req-&amp;gt;rq_peer did not reference the same client.&lt;/p&gt;

&lt;p&gt;Any ideas?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;</description>
                <environment></environment>
        <key id="14785">LU-1493</key>
            <summary>assertion in dqacq_completion() (count &lt; *hardlimit) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="louveta">Alexandre Louvet</reporter>
                        <labels>
                    </labels>
                <created>Thu, 7 Jun 2012 05:30:40 +0000</created>
                <updated>Wed, 25 Jul 2012 12:05:17 +0000</updated>
                            <resolved>Wed, 25 Jul 2012 12:05:17 +0000</resolved>
                                    <version>Lustre 2.1.1</version>
                                    <fixVersion>Lustre 2.3.0</fixVersion>
                    <fixVersion>Lustre 2.1.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="40184" author="pjones" created="Thu, 7 Jun 2012 08:47:37 +0000"  >&lt;p&gt;Niu&lt;/p&gt;

&lt;p&gt;Could you please comment on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="40190" author="niu" created="Thu, 7 Jun 2012 10:28:48 +0000"  >&lt;p&gt;Hi Alexandre, is this easy to reproduce? Do you know what kind of operations caused it?&lt;/p&gt;</comment>
                            <comment id="40264" author="louveta" created="Fri, 8 Jun 2012 07:43:37 +0000"  >&lt;p&gt;Hi Niu,&lt;/p&gt;

&lt;p&gt;Unfortunately I don&apos;t have a reproducer, nor any idea of what was running on the remote node at the time of the assertion.&lt;br/&gt;
Since then, the system has been rebooted and we ran fsck + quotacheck. We haven&apos;t seen the issue again, but I don&apos;t know whether the code that triggered it has run again or not.&lt;/p&gt;

&lt;p&gt;Right now I don&apos;t have more data about this issue.&lt;/p&gt;

&lt;p&gt;Alex.&lt;/p&gt;</comment>
                            <comment id="40332" author="niu" created="Mon, 11 Jun 2012 07:00:28 +0000"  >&lt;p&gt;It looks like there is a race in the quota code, which could cause an unexpected quota release and trigger the assertion in dqacq_completion().&lt;/p&gt;

&lt;p&gt;The patch for master: &lt;a href=&quot;http://review.whamcloud.com/3074&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3074&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="42246" author="pjones" created="Wed, 25 Jul 2012 12:05:17 +0000"  >&lt;p&gt;Landed for 2.1.3 and 2.3&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv6h3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4583</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>