<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:43:59 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4578] Early replies do not honor at_max</title>
                <link>https://jira.whamcloud.com/browse/LU-4578</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;There seems to be a flaw in the logic which extends the request deadline when sending early reply. This bit of code is supposed to check whether we&apos;re able to extend the deadline:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                /* Fake our processing time into the future to ask the clients
                 * for some extra amount of time */
                at_measured(&amp;amp;svcpt-&amp;gt;scp_at_estimate, at_extra +
                            cfs_time_current_sec() -
                            req-&amp;gt;rq_arrival_time.tv_sec);

                /* Check to see if we&apos;ve actually increased the deadline -
                 * we may be past adaptive_max */
                if (req-&amp;gt;rq_deadline &amp;gt;= req-&amp;gt;rq_arrival_time.tv_sec +
                    at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate)) {
                        DEBUG_REQ(D_WARNING, req, &quot;Couldn&apos;t add any time &quot;
                                  &quot;(%ld/%ld), not sending early reply\n&quot;,
                                  olddl, req-&amp;gt;rq_arrival_time.tv_sec +
                                  at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate) -
                                  cfs_time_current_sec());
                        RETURN(-ETIMEDOUT);
                }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This logic looks sound to me. The service estimate is bounded by at_max, so arrival time + service estimate is guaranteed to be less than or equal to at_max seconds after arrival time. If the current deadline is equal to or greater than that, we cannot extend the deadline. The problem is with calculating the new deadline when the current deadline is less than the arrival time + service estimate sum:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        newdl = cfs_time_current_sec() + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here we offset the current time rather than the arrival time. This is not only inconsistent with the above logic, but it can also result in a situation where we extend the deadline well past at_max seconds from the arrival time. In one instance we noticed timeouts on servers of 1076 seconds, and associated timeouts on clients of over 1300 seconds.&lt;/p&gt;

&lt;p&gt;Similarly, on the client side, when an early reply is received we currently extend the deadline as follows:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;        req-&amp;gt;rq_deadline = cfs_time_current_sec() + req-&amp;gt;rq_timeout +
                           ptlrpc_at_get_net_latency(req);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are two things I think are potentially wrong with this. The first is that we are again offsetting the current time rather than, say, rq_sent or rq_arrival, which would make more sense to me. Secondly, and more importantly, the sum of req-&amp;gt;rq_timeout and net latency can exceed at_max seconds. If my reading of the code is correct, both rq_timeout and net latency are bounded by at_max, so in the worst case the sum could be 2*at_max.&lt;/p&gt;

&lt;p&gt;I plan to push a patch to fix the server side issue.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;-	newdl = cfs_time_current_sec() + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate);
+	newdl = req-&amp;gt;rq_arrival_time.tv_sec + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
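
&lt;p&gt;A minimal standalone sketch of what this one-line change does, not the actual ptlrpc code. Times are plain seconds, the function names are hypothetical, and the estimate argument stands in for the value returned by at_get(), assumed here to be already capped at at_max:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```c
/* Toy model of the deadline choice when sending an early reply.
 * All values are seconds; these helpers are hypothetical, not Lustre APIs. */

/* Mirrors the existing check: no extension is possible once the current
 * deadline has reached arrival + estimate. */
int early_reply_can_extend(long deadline, long arrival, long estimate)
{
	return !(deadline >= arrival + estimate);
}

/* Current behaviour: offset the current time by the service estimate. */
long early_reply_newdl_old(long now, long estimate)
{
	return now + estimate;
}

/* Patched behaviour: offset the arrival time instead, so the result is
 * never more than estimate seconds after arrival. */
long early_reply_newdl_new(long arrival, long estimate)
{
	return arrival + estimate;
}
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With arrival = 0, now = 476 and estimate = 600, the first form yields a deadline of 1076 and the second yields 600.&lt;/p&gt;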

&lt;p&gt;One concern I have with this change is how it might affect things during recovery. I&apos;m not as familiar with the recovery mode, so there may be some assumptions in there that this patch violates. Hopefully Maloo will let me know if this is the case &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;I have a similar patch for the client side:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;-	req-&amp;gt;rq_deadline = cfs_time_current_sec() + req-&amp;gt;rq_timeout +
-			   ptlrpc_at_get_net_latency(req);
+	req-&amp;gt;rq_deadline = req-&amp;gt;rq_sent + min_t(int, at_max, req-&amp;gt;rq_timeout +
+						ptlrpc_at_get_net_latency(req));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
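
&lt;p&gt;A rough sketch of the two client-side forms, with hypothetical function names and plain-second values rather than real ptlrpc structures. The at_max parameter stands in for the at_max tunable, and the bound of at_max on each term is my reading of the code, not something this sketch verifies:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```c
/* Toy model of the client-side deadline update on receiving an early reply.
 * All values are seconds; these helpers are hypothetical, not Lustre APIs. */

/* Current behaviour: offset the current time by timeout + net latency. */
long client_deadline_old(long now, long rq_timeout, long net_latency)
{
	return now + rq_timeout + net_latency;
}

/* Patched behaviour: offset the send time, and cap the added interval
 * at at_max so the sum of the two terms cannot push past it. */
long client_deadline_capped(long rq_sent, long rq_timeout,
			    long net_latency, long at_max)
{
	long budget = rq_timeout + net_latency;

	if (budget > at_max)
		budget = at_max;
	return rq_sent + budget;
}
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With rq_timeout and net latency both at 600 and at_max = 600, the uncapped form allows 1200 seconds past the current time, while the capped form allows at most 600 seconds past rq_sent.&lt;/p&gt;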

&lt;p&gt;The same sort of concern applies w.r.t. recovery mode. For now, I&apos;m just planning on pushing the server side change. Thanks in advance for any feedback.&lt;/p&gt;</description>
                <environment></environment>
        <key id="22981">LU-4578</key>
            <summary>Early replies do not honor at_max</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="cliffw">Cliff White</assignee>
                                    <reporter username="hornc">Chris Horn</reporter>
                        <labels>
                            <label>mn4</label>
                            <label>patch</label>
                    </labels>
                <created>Mon, 3 Feb 2014 21:16:23 +0000</created>
                <updated>Fri, 17 Oct 2014 18:41:13 +0000</updated>
                            <resolved>Thu, 8 May 2014 14:52:44 +0000</resolved>
                                                    <fixVersion>Lustre 2.6.0</fixVersion>
                    <fixVersion>Lustre 2.5.2</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>15</watches>
                                                                            <comments>
                            <comment id="76135" author="hornc" created="Mon, 3 Feb 2014 21:26:26 +0000"  >&lt;p&gt;For your consideration:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/9100&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9100&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="77260" author="hornc" created="Tue, 18 Feb 2014 17:02:05 +0000"  >&lt;p&gt;Has anyone had a chance to look at this?&lt;/p&gt;</comment>
                            <comment id="77274" author="hornc" created="Tue, 18 Feb 2014 18:12:35 +0000"  >&lt;p&gt;I pushed a revised version of my client side patch to gerrit: &lt;a href=&quot;http://review.whamcloud.com/#/c/9298/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/9298/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I dropped the capping of rq_timeout + net_latency, but changed from offsetting current time to offsetting the rq_sent time. The thought is that this is more in line with rq_arrival time which the server is offsetting.&lt;/p&gt;</comment>
                            <comment id="81490" author="tappro" created="Sat, 12 Apr 2014 20:35:38 +0000"  >&lt;p&gt;Chris, as I understand the deadline is supposed to be increased upon sending early reply, if you change its logic to be always arrival_time + at_get then it is just stay unchanged after early reply. That is why current_sec() is used: just to extend the deadline. The client is notified about that as well and keeps the same logic.&lt;/p&gt;

&lt;p&gt;Meanwhile I don&apos;t see how that dishonor at_max? The rq_deadline is checked against arrival time and cannot exceed at_max, at least on server. Could you specify how exactly at_max can be exceeded? I tend to think right now that it is not about rq_deadline, but that we need to check explicitly that at_max is not exceeded in the place where that may happen.&lt;/p&gt;
</comment>
                            <comment id="81550" author="hornc" created="Mon, 14 Apr 2014 17:21:32 +0000"  >&lt;p&gt;Mikhail, thanks for taking a look at this.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;if you change its logic to be always arrival_time + at_get then it is just stay unchanged after early reply.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;arrival_time + at_get is &lt;em&gt;not&lt;/em&gt; constant. The service estimate is increased each time we send an early reply:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                /* Fake our processing time into the future to ask the clients
                 * for some extra amount of time */
                at_measured(&amp;amp;svcpt-&amp;gt;scp_at_estimate, at_extra +
                            cfs_time_current_sec() -
                            req-&amp;gt;rq_arrival_time.tv_sec);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;Meanwhile I don&apos;t see how that dishonor at_max? The rq_deadline is checked against arrival time and cannot exceed at_max, at least on server.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;As I said in the original ticket description, the problem is not in checking the &lt;em&gt;current&lt;/em&gt; rq_deadline. The problem is when we calculate the &lt;em&gt;new&lt;/em&gt; deadline. Here&apos;s an example from a customer system:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jan 17 11:35:56 snx11061n059 kernel: [240061.551119] Lustre: 12824:0:(service.c:1138:ptlrpc_at_send_early_reply()) @@@ Couldn&apos;t add any time (5/-471), not sending early reply
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this instance we can see:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; olddl = req-&amp;gt;rq_deadline - cfs_curr_time() = 5
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; req-&amp;gt;rq_arrival_time.tv_sec + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate) - cfs_curr_time() = -471
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can simplify things by using an arrival time of 0, and service estimate of max value 600 seconds. In this case we have:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;req-&amp;gt;rq_arrival_time.tv_sec + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate) - cfs_curr_time() =
0 + 600 - cfs_curr_time() = -471
cfs_curr_time() = 471 + 600 = 1071
and
req-&amp;gt;rq_deadline = 5 + cfs_curr_time() = 5 + 1071 = 1076
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
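
&lt;p&gt;The same arithmetic as a small sketch. The helper names are hypothetical, and the arrival time is taken as 0, as above:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```c
/* Seconds elapsed since arrival when the message was logged, derived from
 * the second logged value: arrival + estimate - now = logged_remaining,
 * with arrival taken as 0. */
long elapsed_since_arrival(long estimate, long logged_remaining)
{
	return estimate - logged_remaining;
}

/* Deadline relative to arrival, derived from the first logged value:
 * olddl = deadline - now. */
long deadline_since_arrival(long olddl, long elapsed)
{
	return olddl + elapsed;
}
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;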

&lt;p&gt;So, our deadline is 1076 seconds after the arrival time. Indeed, this was the case for several rpcs on the server. For example,&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jan 17 11:35:56 snx11061n059 kernel: [240061.551119] Lustre: 12824:0:(service.c:1138:ptlrpc_at_send_early_reply()) @@@ Couldn&apos;t add any time (5/-471), not sending early reply
Jan 17 11:35:56 snx11061n059 kernel: [240061.551122]   req@ffff88058bcd0800 x1457392512529764/t0(0) o3-&amp;gt;b7ded670-aa13-1196-967c-72a7a3cb5c3d@3627@gni:0/0 lens 488/400 e 3 to 0 dl 1389958561 ref 2 fl Interpret:/0/0 rc 0/0
Jan 17 11:35:56 snx11061n059 kernel: [240061.579690] Lustre: 12824:0:(service.c:1138:ptlrpc_at_send_early_reply()) Skipped 57 previous similar messages
Jan 17 11:36:01 snx11061n059 kernel: [240066.648048] LustreError: 11673:0:(ldlm_lib.c:2821:target_bulk_io()) @@@ timeout on bulk PUT after 1076+0s  req@ffff88058bcd0800 x1457392512529764/t0(0) o3-&amp;gt;b7ded670-aa13-1196-967c-72a7a3cb5c3d@3627@gni:0/0 lens 488/400 e 3 to 0 dl 1389958561 ref 1 fl Interpret:/0/0 rc 0/0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Okay, so we know that the request deadline was postponed several times until it was well over at_max seconds, and we can see why it stopped at 1076. The remaining question is how it got the 1076 value in the first place.&lt;/p&gt;

&lt;p&gt;We can see from the earlier discussion that as long as the current deadline is less than arrival + service estimate seconds, then the server will send the early reply and extend the deadline. If the service estimate is maxed out at at_max seconds (600), then a deadline anywhere between 0 and 599 seconds after arrival time will be extended, and in this case it will be extended by 600 seconds:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;         newdl = cfs_time_current_sec() + at_get(&amp;amp;svcpt-&amp;gt;scp_at_estimate);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, if we happen to enter ptlrpc_at_send_early_reply() 476 seconds after the request arrived, and our service estimate is currently 600 seconds, then our new deadline will be 1076 seconds.&lt;/p&gt;</comment>
                            <comment id="81706" author="tappro" created="Wed, 16 Apr 2014 05:30:15 +0000"  >&lt;p&gt;Yes, I tend to agree, at_measured() sets the new at_current while honoring at_max, so the new deadline will be extended up to that value.&lt;/p&gt;</comment>
                            <comment id="82809" author="simmonsja" created="Tue, 29 Apr 2014 21:54:30 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#/c/9100&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/9100&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tried this patch and the client evictions I was experiencing went away.&lt;/p&gt;</comment>
                            <comment id="82810" author="spitzcor" created="Tue, 29 Apr 2014 22:03:55 +0000"  >&lt;p&gt;James, what was your test case with single shared files?&lt;/p&gt;</comment>
                            <comment id="82811" author="simmonsja" created="Tue, 29 Apr 2014 22:16:52 +0000"  >&lt;p&gt;First I created a directory with stripe count 56, i.e. test_ior, and ran the command below.&lt;/p&gt;

&lt;p&gt;aprun -n 288 /lustre/sultan/scratch/jsimmons/IOR -a POSIX -i 5 -C -v -g -w -r -e -b 1024m -t 4m -o /lustre/sultan/scratch/jsimmons/test_ior/testfile.out&lt;/p&gt;

&lt;p&gt;Without this patch the job would fail and the compute nodes would be evicted.&lt;/p&gt;

&lt;p&gt;Ticket &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4963&quot; title=&quot;client eviction during IOR test - lock callback timer expired&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4963&quot;&gt;&lt;del&gt;LU-4963&lt;/del&gt;&lt;/a&gt; has a similar reproducer.&lt;/p&gt;</comment>
                            <comment id="82853" author="simmonsja" created="Wed, 30 Apr 2014 15:05:05 +0000"  >&lt;p&gt;More results from testing last night. I found this patch doesn&apos;t completely eliminate the client evictions but it does greatly reduce their occurrence. This is still a great patch to land to master as well as b2_5.&lt;/p&gt;</comment>
                            <comment id="83295" author="simmonsja" created="Tue, 6 May 2014 14:10:53 +0000"  >&lt;p&gt;Cherry picked for b2_5 - &lt;a href=&quot;http://review.whamcloud.com/#/c/10230&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10230&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="83510" author="pjones" created="Thu, 8 May 2014 14:52:44 +0000"  >&lt;p&gt;Landed for 2.6. Will consider separately for b2_4 and b2_5 branches.&lt;/p&gt;</comment>
                            <comment id="83641" author="cliffw" created="Fri, 9 May 2014 16:40:52 +0000"  >&lt;p&gt;The patch has landed in master; the b2_5 version is still waiting for reviewers.&lt;/p&gt;</comment>
                            <comment id="84385" author="simmonsja" created="Mon, 19 May 2014 16:34:32 +0000"  >&lt;p&gt;Landed for b2_5 as well.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="24748">LU-5077</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="24752">LU-5079</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwe5j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12504</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>