<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:48:40 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5116] Race between resend and reply processing</title>
                <link>https://jira.whamcloud.com/browse/LU-5116</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The server evicts the client when it receives an invalid request.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00100000:9.0:1400505646.197736:0:83755:0:(service.c:1734:ptlrpc_server_handle_req_in()) got req x1468534034672908
00000100:00020000:9.0:1400505646.197738:0:83755:0:(service.c:975:ptlrpc_check_req()) @@@ Invalid replay without recovery  req@ffff88079c2b0850 x1468534034672908/t0(88947828) o4-&amp;gt;7f3cf026-15bd-c61a-088c-a943e5bce2bf@335@gni1:0/0 lens 488/0 e 0 to 0 dl 0 ref 1 fl New:/6/ffffffff rc 0/-1
00000020:00080000:9.0:1400505646.221792:0:83755:0:(genops.c:1391:class_fail_export()) disconnecting export ffff8805acac6400/7f3cf026-15bd-c61a-088c-a943e5bce2bf
00000020:00000080:10.0:1400505646.221811:0:83755:0:(genops.c:1229:class_disconnect()) disconnect: cookie 0xa74fa39ba3a7cd61
00000020:00010000:10.0:1400505646.221817:0:83755:0:(genops.c:1746:obd_stale_export_put()) Put export ffff8805acac6400: total 1
00000100:00080000:10.0:1400505646.221820:0:83755:0:(import.c:1502:ptlrpc_cleanup_imp()) ffff88054809a800 ^W: changing import state from FULL to CLOSED
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;On the client side we can see a race:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00080000:22.0:1400505646.246037:0:19252:0:(client.c:2487:ptlrpc_resend_req()) @@@ going to resend  req@ffff880ffea86000 x1468534034670388/t88947827(88947827) o4-&amp;gt;snx11063-OST0050-osc-ffff881039a22400@10.149.150.25@o2ib4008:6/4 lens 488/416 e 2 to 0 dl 1400505782 ref 2 fl Interpret:R/4/0 rc 0/0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The client is about to resend the request, but the request already has the req-&amp;gt;rq_replied flag set (Interpret:R), and req-&amp;gt;rq_reqmsg carries the MSG_REPLAY flag (/4).&lt;/p&gt;

&lt;p&gt;There was a disconnect/reconnect on the client side (LNet error), and no recovery took place.&lt;/p&gt;

&lt;p&gt;The race exists between ptlrpc_check_set() and reconnect-&amp;gt;ptlrpc_resend_req(). The request is still on imp-&amp;gt;imp_sending_list and already has the MSG_REPLAY flag in the window after after_reply() runs in ptlrpc_check_set() and before:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                if (!cfs_list_empty(&amp;amp;req-&amp;gt;rq_list)) {
                        cfs_list_del_init(&amp;amp;req-&amp;gt;rq_list);
                        cfs_atomic_dec(&amp;amp;imp-&amp;gt;imp_inflight);                    
                }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The reconnect code walks this list to resend requests. So it can happen that a request gets its reply and after_reply() processes it and sets MSG_REPLAY, but ptlrpc_resend_req() then sets the rq_resend flag and the request is resent. When such a request, carrying the MSG_REPLAY flag, reaches the server, it causes the client to be evicted.&lt;/p&gt;</description>
                <environment></environment>
        <key id="24845">LU-5116</key>
            <summary>Race between resend and reply processing</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="aboyko">Alexander Boyko</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Wed, 28 May 2014 18:04:28 +0000</created>
                <updated>Wed, 3 Sep 2014 14:22:44 +0000</updated>
                            <resolved>Mon, 2 Jun 2014 18:03:43 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                    <version>Lustre 2.5.0</version>
                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                    <fixVersion>Lustre 2.5.2</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="85045" author="aboyko" created="Wed, 28 May 2014 18:08:33 +0000"  >&lt;p&gt;patch &lt;a href=&quot;http://review.whamcloud.com/10471&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10471&lt;/a&gt;&lt;br/&gt;
Xyratex-bug-id: MRP-1888&lt;/p&gt;</comment>
                            <comment id="85050" author="hornc" created="Wed, 28 May 2014 18:24:51 +0000"  >&lt;p&gt;Cray tested this patch. Without the patch we would occasionally hit this race condition leading to eviction and job failure. With the patch we stopped seeing the evictions.&lt;/p&gt;</comment>
                            <comment id="85489" author="cliffw" created="Mon, 2 Jun 2014 18:03:36 +0000"  >&lt;p&gt;The patch has been merged, so I will close this issue. &lt;/p&gt;</comment>
                            <comment id="85506" author="hornc" created="Mon, 2 Jun 2014 19:42:56 +0000"  >&lt;p&gt;Additional testing revealed that the patch has not completely closed the race window. We may want to keep this ticket open to track additional improvements/fixes. Otherwise we can open a new ticket when we have something to contribute.&lt;/p&gt;</comment>
                            <comment id="85510" author="cliffw" created="Mon, 2 Jun 2014 19:57:10 +0000"  >&lt;p&gt;It would be better to open up a new ticket, especially if you think there might be a delay. It is easy to link tickets if needed later.&lt;/p&gt;</comment>
                            <comment id="85513" author="hornc" created="Mon, 2 Jun 2014 20:17:40 +0000"  >&lt;p&gt;Thanks, sounds good.&lt;/p&gt;</comment>
                            <comment id="85826" author="jamesanunez" created="Thu, 5 Jun 2014 14:49:14 +0000"  >&lt;p&gt;patch for b2_5 at &lt;a href=&quot;http://review.whamcloud.com/#/c/10562&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10562&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="86784" author="aboyko" created="Tue, 17 Jun 2014 08:38:28 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/10735&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10735&lt;/a&gt; one more patch for master.&lt;/p&gt;</comment>
                            <comment id="86790" author="pjones" created="Tue, 17 Jun 2014 12:40:56 +0000"  >&lt;p&gt;Could you please track this latest patch under a new JIRA ticket? Thanks!&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="26210">LU-5554</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="16430">LU-2232</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwn7z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14104</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>