<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:53:45 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12569] IBLND_CREDITS_HIGHWATER does not check connection queue depth</title>
                <link>https://jira.whamcloud.com/browse/LU-12569</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The IBLND_CREDITS_HIGHWATER check is used to decide at what number of credits a NOOP message needs to be sent in order to return credits. This covers the case where we send many immediate messages, which consume credits but do not get an acknowledgment, so their credits are not automatically returned.&lt;/p&gt;

&lt;p&gt;However, the check uses a global tunable:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lnd_peercredits_hiw &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This tunable is checked against the &lt;b&gt;global&lt;/b&gt; peer credits value to make sure it is sane (see&#160;kiblnd_tunables_setup()), which is to say that it is less than the total number of credits.&lt;/p&gt;

&lt;p&gt;However, individual connections can have a different queue depth than the &lt;b&gt;global&lt;/b&gt; setting (total credits for a connection is equal to the connection queue depth).&lt;/p&gt;

&lt;p&gt;That means if a connection&apos;s queue depth differs from the global value, it is possible for the highwater mark to be &lt;b&gt;higher&lt;/b&gt; than the total number of credits. See particularly kiblnd_create_conn() and its &quot;queue depth reduced&quot; warning message for one case where this can happen.&lt;/p&gt;

&lt;p&gt;In this case, no NOOP messages will be sent, and it is possible to stall out a connection if both sides send many immediate messages at once. Essentially, if both ends of a connection send enough immediate messages to exhaust credits, then neither side will send any more messages. The highwater mark is supposed to prevent this by having them send a NOOP before they reach this state.&lt;/p&gt;

&lt;p&gt;But if the highwater mark is greater than the number of credits, this will not occur, and the connection will stall out until a ping or other event causes credits to be returned.&lt;/p&gt;
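
&lt;p&gt;To make the failure concrete, here is a minimal C sketch of the check, not the actual ko2iblnd code. The function names conn_credits_hiw() and need_noop(), the credits_to_return counter, and the choice to clamp at queue depth minus one are all assumptions made for illustration.&lt;/p&gt;

```c
/* Hypothetical helpers for illustration only; not the real ko2iblnd code. */

/* Clamp the global highwater mark so it stays below this connection's
 * total credits, which equal its queue depth. The "minus one" margin is
 * an assumption of this sketch. */
int conn_credits_hiw(int global_hiw, int conn_queue_depth)
{
    int limit = conn_queue_depth - 1;

    return (global_hiw > limit) ? limit : global_hiw;
}

/* A NOOP is warranted once the credits waiting to be returned to the
 * peer reach the highwater mark. */
int need_noop(int credits_to_return, int hiw)
{
    return credits_to_return >= hiw;
}
```

&lt;p&gt;With a global highwater mark of 8 and a connection whose queue depth was reduced to 6, need_noop(6, 8) is false even when every credit is waiting to be returned, which is the stall described above. Clamping first gives conn_credits_hiw(8, 6) == 5, so a NOOP would be sent before all credits are exhausted.&lt;/p&gt;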

&lt;p&gt;The solution should be simple: the highwater mark check needs to take into account the queue depth of an individual connection.&lt;/p&gt;</description>
                <environment></environment>
        <key id="56458">LU-12569</key>
            <summary>IBLND_CREDITS_HIGHWATER does not check connection queue depth</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="ashehata">Amir Shehata</assignee>
                                    <reporter username="pfarrell">Patrick Farrell</reporter>
                        <labels>
                    </labels>
                <created>Sun, 21 Jul 2019 17:02:57 +0000</created>
                <updated>Mon, 23 May 2022 05:44:56 +0000</updated>
                            <resolved>Fri, 20 Sep 2019 14:42:16 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="251781" author="gerrit" created="Sun, 21 Jul 2019 17:36:02 +0000"  >&lt;p&gt;Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/35578&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35578&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12569&quot; title=&quot;IBLND_CREDITS_HIGHWATER does not check connection queue depth&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12569&quot;&gt;&lt;del&gt;LU-12569&lt;/del&gt;&lt;/a&gt; ko2iblnd: Make credits hiw connection aware&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: c7b414eab00bbff1210b28ddb108aee081682835&lt;/p&gt;</comment>
                            <comment id="251782" author="pfarrell" created="Sun, 21 Jul 2019 17:45:57 +0000"  >&lt;p&gt;It&apos;s &lt;b&gt;probably&lt;/b&gt; possible to write a test for this, but I&apos;m not really an IB guy.&lt;/p&gt;

&lt;p&gt;The tricky part is getting all the messages in flight at once.&lt;/p&gt;

&lt;p&gt;We could &lt;b&gt;probably&lt;/b&gt; delay receipt?&lt;br/&gt;
Like, just hang the thread responsible for the interrupt for a few seconds...?&lt;/p&gt;

&lt;p&gt;On both sides of the connection, though.&lt;/p&gt;

&lt;p&gt;So:&lt;br/&gt;
Just hang whatever function it is that runs the IRQ when a ko2ib message is received, so &quot;receipt&quot; doesn&apos;t occur for a few seconds.&lt;/p&gt;

&lt;p&gt;Then send a bunch of immediate messages from both sides, which will exhaust the credits.&lt;/p&gt;

&lt;p&gt;The other aspect of this is, if we want to trigger this specific issue, we have to engineer a reduced queue depth. That normally happens as part of this:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;do {
    init_qp_attr-&amp;gt;cap.max_send_wr = kiblnd_send_wrs(conn);
    init_qp_attr-&amp;gt;cap.max_recv_wr = IBLND_RECV_WRS(conn);

    rc = rdma_create_qp(cmid, conn-&amp;gt;ibc_hdev-&amp;gt;ibh_pd, init_qp_attr);
    if (!rc || conn-&amp;gt;ibc_queue_depth &amp;lt; 2)
        break;

    conn-&amp;gt;ibc_queue_depth--;
} while (rc);&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This basically reduces max_send_wr and max_recv_wr each time until rdma_create_qp is successful.&lt;br/&gt;
(IBLND_RECV_WRS and kiblnd_send_wrs are functions of ibc_queue_depth)&lt;/p&gt;

&lt;p&gt;Or possibly just reducing the queue depth as a hack would work: reduce it by using a fail_loc here, since I think it should be valid/safe/etc. to use a &lt;b&gt;smaller&lt;/b&gt; queue depth than what we asked the hardware for. (This is just a guess, but it seems likely.)&lt;/p&gt;</comment>
                            <comment id="253217" author="pfarrell" created="Fri, 16 Aug 2019 19:51:55 +0000"  >&lt;p&gt;Amir is planning to rework this one as part of fixing &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10213&quot; title=&quot;o2iblnd: Potential discrepancy when allocating qp&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10213&quot;&gt;&lt;del&gt;LU-10213&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="255102" author="gerrit" created="Fri, 20 Sep 2019 07:55:02 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/35578/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35578/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12569&quot; title=&quot;IBLND_CREDITS_HIGHWATER does not check connection queue depth&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12569&quot;&gt;&lt;del&gt;LU-12569&lt;/del&gt;&lt;/a&gt; o2iblnd: Make credits hiw connection aware&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1b87e8f61781e48c31b4da647214d66addf2b90c&lt;/p&gt;</comment>
                            <comment id="255147" author="pjones" created="Fri, 20 Sep 2019 14:42:16 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="255187" author="gerrit" created="Sat, 21 Sep 2019 21:17:56 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36254&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36254&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12569&quot; title=&quot;IBLND_CREDITS_HIGHWATER does not check connection queue depth&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12569&quot;&gt;&lt;del&gt;LU-12569&lt;/del&gt;&lt;/a&gt; o2iblnd: Make credits hiw connection aware&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 08b218eca6fc01a468b050d6606034b293a8d727&lt;/p&gt;</comment>
                            <comment id="255543" author="gerrit" created="Sat, 28 Sep 2019 06:50:08 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36254/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36254/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12569&quot; title=&quot;IBLND_CREDITS_HIGHWATER does not check connection queue depth&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12569&quot;&gt;&lt;del&gt;LU-12569&lt;/del&gt;&lt;/a&gt; o2iblnd: Make credits hiw connection aware&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 90ba471e367754ea6ddb9a95060591f46b95b0b6&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="70166">LU-15828</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00jz3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>