<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:16:13 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15190] ptlrpc_server_check_resend_in_progress() can miss duplicate RPC</title>
                <link>https://jira.whamcloud.com/browse/LU-15190</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;ptlrpc_server_check_resend_in_progress() has the following check at the beginning:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!(lustre_msg_get_flags(req-&amp;gt;rq_reqmsg) &amp;amp; MSG_RESENT) ||
            (atomic_read(&amp;amp;req-&amp;gt;rq_export-&amp;gt;exp_rpc_count) == 0))
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; NULL;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

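&lt;p&gt;I think this can let duplicate RPCs through: if the export has no RPC in progress at that moment (for example under high load, when the original request is still sitting in a deep incoming queue), the check returns NULL without looking for the original.&lt;/p&gt;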
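
&lt;p&gt;A minimal Python sketch of that scenario, for illustration only. The helper name skips_duplicate_scan() and its arguments are hypothetical stand-ins for the MSG_RESENT flag test and the exp_rpc_count read in the check above; it models only the early return, not the scan that follows it.&lt;/p&gt;

```python
def skips_duplicate_scan(is_resent, rpcs_in_progress):
    """Model of the early return quoted above (hypothetical helper).

    True means the function would return NULL right away,
    without searching for an earlier copy of the request.
    """
    return (not is_resent) or rpcs_in_progress == 0


# The original request is still queued, so the export has no RPC in
# progress when the resend arrives: the scan is skipped and the resend
# is queued as a second copy.
print(skips_duplicate_scan(is_resent=True, rpcs_in_progress=0))

# With at least one RPC in progress the resend does reach the scan.
print(skips_duplicate_scan(is_resent=True, rpcs_in_progress=1))
```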

&lt;p&gt;There is a crash dump supporting this theory. In that dump I was able to find many duplicates (up to 14). For example:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
crash&amp;gt; p *(struct ptlrpc_request *)(0xffff9887c6c75ee0-0x60)
  rq_reqmsg = 0xffff9887c7f34000, 
  rq_xid = 1709012909603712, 
  rq_export = 0xffff988805ee5400, 
  rq_peer = {
    nid = 1407418002966021, 

crash&amp;gt; p *(struct ptlrpc_request *)(0xffff987eb906a8e0-0x60)
  rq_reqmsg = 0xffff987e7145c148, 
  rq_xid = 1709012909603712, 
  rq_export = 0xffff988805ee5400, 
  rq_peer = {
    nid = 1407418002966021, 

crash&amp;gt; ptlrpc_request_dump (0xffff98745d3a5a60-0x60)
req: 0xffff9875002d6520, xid: 2531069376, opc: 103, flags: 2, buf2: 0xffff9875002d6600/104
crash&amp;gt; ptlrpc_request_dump (0xffff98771dd06360-0x60)
req: 0xffff9884e2218520, xid: 2531069376, opc: 103, flags: 2, buf2: 0xffff9884e2218600/104
crash&amp;gt; ptlrpc_request_dump (0xffff9878a80c3ae0-0x60)
req: 0xffff9878ae8ae148, xid: 2531069376, opc: 103, flags: 2, buf2: 0xffff9878ae8ae228/104
crash&amp;gt; ptlrpc_request_dump (0xffff98789c049b60-0x60)
req: 0xffff98789c7403d8, xid: 2531069376, opc: 103, flags: 2, buf2: 0xffff98789c7404b8/104

crash&amp;gt; p ((struct ldlm_request *)0xffff9875002d6600)-&amp;gt;lock_handle
    cookie = 13969718594132579448
crash&amp;gt; p ((struct ldlm_request *)0xffff9884e2218600)-&amp;gt;lock_handle
    cookie = 13969718594132579448
crash&amp;gt; p ((struct ldlm_request *)0xffff9878ae8ae228)-&amp;gt;lock_handle
    cookie = 13969718594132579448
crash&amp;gt; p ((struct ldlm_request *)0xffff98789c7404b8)-&amp;gt;lock_handle
    cookie = 13969718594132579448
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notice the same XID and the same lock handle.&lt;/p&gt;

&lt;p&gt;I dumped all RPCs from the export&apos;s HP list and checked the XIDs:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
$ cat xid-sorted-list.txt | wc -l
877858
$ cat xid-sorted-list.txt | uniq |wc -l
213480
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;i.e. about 3/4 of all RPCs (664378 of 877858) were duplicates.&lt;/p&gt;
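
&lt;p&gt;The same figures can be derived from the sorted list with a short script. This is only a sketch: duplicate_stats() and duplicate_share() are hypothetical helpers, and the input is assumed to be an already sorted sequence of XIDs, as in xid-sorted-list.txt.&lt;/p&gt;

```python
from itertools import groupby


def duplicate_stats(sorted_xids):
    """Return (total, unique, largest group) for a sorted XID sequence.

    Grouping adjacent equal values is what `uniq` does, so the input
    must already be sorted.
    """
    total = 0
    unique = 0
    largest = 0
    for _, group in groupby(sorted_xids):
        count = sum(1 for _ in group)
        total += count
        unique += 1
        largest = max(largest, count)
    return total, unique, largest


def duplicate_share(total, unique):
    """Fraction of entries that repeat an XID already seen."""
    return (total - unique) / total


# Counts taken from the wc -l output above.
print(round(duplicate_share(877858, 213480), 3))  # 0.757
```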

&lt;p&gt;Given that ptlrpc_server_check_resend_in_progress() uses a linear scan under a single spinlock to check for duplicates, the check takes a long time, and many CPUs were spinning for seconds.&lt;/p&gt;
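
&lt;p&gt;To illustrate why the scan gets expensive, here is a hedged Python sketch contrasting a linear search with a lookup keyed by XID. The request dicts, find_duplicate_linear() and the XidIndex class are hypothetical; a Python dict stands in for a hash table and is not the kernel data structure or its API.&lt;/p&gt;

```python
def find_duplicate_linear(queued, xid):
    """Walk every queued request: cost grows with the queue length."""
    for req in queued:
        if req["xid"] == xid:
            return req
    return None


class XidIndex:
    """Hypothetical per-export index of queued requests keyed by XID."""

    def __init__(self):
        self._by_xid = {}

    def insert(self, req):
        """Record req; return the earlier request if the XID is already known."""
        existing = self._by_xid.setdefault(req["xid"], req)
        return None if existing is req else existing

    def remove(self, req):
        """Forget req once it has been handled (only if it is the indexed one)."""
        if self._by_xid.get(req["xid"]) is req:
            del self._by_xid[req["xid"]]


index = XidIndex()
original = {"xid": 1709012909603712}
resend = {"xid": 1709012909603712}
print(index.insert(original) is None)    # first copy is recorded
print(index.insert(resend) is original)  # resend is matched to the original
```

&lt;p&gt;With such an index each check is a single keyed lookup on average instead of a walk over the whole list, so the time spent holding the lock no longer scales with the number of queued requests.&lt;/p&gt;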
</description>
                <environment></environment>
        <key id="66952">LU-15190</key>
            <summary>ptlrpc_server_check_resend_in_progress() can miss duplicate RPC</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="bzzz">Alex Zhuravlev</reporter>
                        <labels>
                    </labels>
                <created>Wed, 3 Nov 2021 06:03:12 +0000</created>
                <updated>Thu, 5 May 2022 14:57:54 +0000</updated>
                            <resolved>Thu, 5 May 2022 14:57:54 +0000</resolved>
                                    <version>Upstream</version>
                                    <fixVersion>Lustre 2.15.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="317328" author="gerrit" created="Wed, 3 Nov 2021 06:33:56 +0000"  >&lt;p&gt;&quot;Alex Zhuravlev &amp;lt;bzzz@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45445&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45445&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15190&quot; title=&quot;ptlrpc_server_check_resend_in_progress() can miss duplicate RPC&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15190&quot;&gt;&lt;del&gt;LU-15190&lt;/del&gt;&lt;/a&gt; ptlrpc: fix duplication check&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 177c1951d83e3efdbfc6cd63ca99c4d967898c0f&lt;/p&gt;</comment>
                            <comment id="317331" author="gerrit" created="Wed, 3 Nov 2021 08:10:35 +0000"  >&lt;p&gt;&quot;Alex Zhuravlev &amp;lt;bzzz@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45446&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45446&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15190&quot; title=&quot;ptlrpc_server_check_resend_in_progress() can miss duplicate RPC&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15190&quot;&gt;&lt;del&gt;LU-15190&lt;/del&gt;&lt;/a&gt; ptlrpc: rhashtable for xid duplication check&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4426b529a6dd44c3b6e00b400a8507d1728d8e39&lt;/p&gt;</comment>
                            <comment id="320720" author="gerrit" created="Mon, 13 Dec 2021 03:54:33 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/45445/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45445/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15190&quot; title=&quot;ptlrpc_server_check_resend_in_progress() can miss duplicate RPC&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15190&quot;&gt;&lt;del&gt;LU-15190&lt;/del&gt;&lt;/a&gt; ptlrpc: fix duplication check&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: bb83a8af59d30b3f9e6de171eca962316ab7f6f4&lt;/p&gt;</comment>
                            <comment id="333887" author="pjones" created="Thu, 5 May 2022 14:57:54 +0000"  >&lt;p&gt;Seems to have landed for 2.15&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0293r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>