<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:42:04 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11229] server_bulk_callback()) ASSERTION( desc-&gt;bd_md_count &gt; 0 ) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-11229</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;While testing &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10428&quot; title=&quot;LNet events should generated without resource lock held&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10428&quot;&gt;&lt;del&gt;LU-10428&lt;/del&gt;&lt;/a&gt; I hit this assertion, but Amir told me the issue is not limited to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10428&quot; title=&quot;LNet events should generated without resource lock held&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10428&quot;&gt;&lt;del&gt;LU-10428&lt;/del&gt;&lt;/a&gt; and has also been seen on a real OPA system. The collected logs point to a long-standing bug related to the 4MB IO landing. That landing replaced bd_success with bd_failure and introduced a second bug in this area (the first one is the client part of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11169&quot; title=&quot;Data corruption during IOR testing with network error simulation&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11169&quot;&gt;&lt;del&gt;LU-11169&lt;/del&gt;&lt;/a&gt;).&lt;br/&gt;
 The problem is simple.&lt;br/&gt;
 The server bulk is set up so that each transfer generates two events:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ptlrpc_start_bulk_transfer(struct ptlrpc_bulk_desc *desc)
..
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&lt;span class=&quot;code-comment&quot;&gt;/* Network is about to get at the memory */&lt;/span&gt;
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ptlrpc_is_bulk_put_source(desc-&amp;gt;bd_type))
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;rc = LNetPut(self_nid, desc-&amp;gt;bd_mds[posted_md],
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194; LNET_ACK_REQ, peer_id,
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194; desc-&amp;gt;bd_portal, mbits, 0, 0);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;So there are two LNet events per MD, but...&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
 void server_bulk_callback(struct lnet_event *ev)
{
...
&#8194;&#8194;&#8194;&#8194;&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ev-&amp;gt;unlinked) {
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;desc-&amp;gt;bd_md_count--;
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&lt;span class=&quot;code-comment&quot;&gt;/* This is the last callback no matter what... */&lt;/span&gt;
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (desc-&amp;gt;bd_md_count == 0)
&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;wake_up(&amp;amp;desc-&amp;gt;bd_waitq);
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Oops: bd_md_count is decremented twice, once for LNET_SEND and once for LNET_ACK.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; 00000100:00000010:0.0:1533747855.090799:0:24663:0:(client.c:130:ptlrpc_new_bulk()) kmalloced &apos;desc&apos;: 416 at ffff88006080c800.
 00000100:00000200:0.0:1533747855.091779:0:21701:0:(events.c:449:server_bulk_callback()) event type 5, status 0, desc ffff88006080c800
 00000100:00000200:1.0:1533747855.091788:0:21700:0:(events.c:449:server_bulk_callback()) event type 4, status 0, desc ffff88006080c800
 00000100:00040000:0.0:1533747855.091796:0:21701:0:(events.c:453:server_bulk_callback()) ASSERTION( desc-&amp;gt;bd_md_count &amp;gt; 0 ) failed:
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So it looks like we should not rely on ev-&amp;gt;unlinked (the buffer is unlinked after the send), but should wait for the ACK if one is still needed.&lt;/p&gt;</description>
                <environment></environment>
        <key id="52934">LU-11229</key>
            <summary>server_bulk_callback()) ASSERTION( desc-&gt;bd_md_count &gt; 0 ) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="shadow">Alexey Lyashkov</assignee>
                                    <reporter username="shadow">Alexey Lyashkov</reporter>
                        <labels>
                    </labels>
                <created>Thu, 9 Aug 2018 03:58:15 +0000</created>
                <updated>Thu, 9 Aug 2018 09:46:36 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="231689" author="shadow" created="Thu, 9 Aug 2018 04:38:06 +0000"  >&lt;p&gt;I am not sure why an ACK is needed in this case. I think it is just additional overhead if enabled correctly, and the server is able to handle a partial transfer for now.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i000hz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>