<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:30:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-9885] Huge amount of costs in ptlrpc_wait_event() at file creation</title>
                <link>https://jira.whamcloud.com/browse/LU-9885</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;After the metadata performance improvements from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9796&quot; title=&quot;Speedup file creation under heavy concurrency   &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9796&quot;&gt;&lt;del&gt;LU-9796&lt;/del&gt;&lt;/a&gt;, we are hitting a Lustre-level bottleneck in file creation at scale.&lt;br/&gt;
We collected the cost of all Lustre functions during file creation, and ptlrpc_wait_event() took a huge amount of the time.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@mds05 ~]# /work/ihara/perf-tools/bin/funccost -L
Start setting lustre functions...
Setting lustre functions done...
Tracing &quot;[lustre modules functions]&quot;... Ctrl-C to end.
/work/ihara/perf-tools/bin/funccost: line 158: tracing_enabled: Permission denied
^C
FUNC                           TOTAL_TIME(us)       COUNT        AVG(us)       
ptlrpc_wait_event              1299908760           1745217      744.84      
ptlrpc_server_handle_request   279979924            876419       319.46      
tgt_request_handle             268911684            637004       422.15      
tgt_enqueue                    230880245            318456       725.00      
ldlm_lock_enqueue              230872744            1273842      181.24      
ldlm_handle_enqueue0           230791830            318456       724.72      
mdt_intent_policy              224904050            318456       706.23      
mdt_intent_opc                 223658324            318456       702.32      
mdt_intent_reint               223196495            318456       700.87      
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Why is ptlrpc_wait_event() costing so much here?&lt;/p&gt;</description>
                <environment></environment>
        <key id="47843">LU-9885</key>
            <summary>Huge amount of costs in ptlrpc_wait_event() at file creation</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="ihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Wed, 16 Aug 2017 04:14:33 +0000</created>
                <updated>Wed, 1 Sep 2021 17:25:29 +0000</updated>
                                            <version>Lustre 2.7.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="205489" author="adilger" created="Wed, 16 Aug 2017 07:53:39 +0000"  >&lt;p&gt;Is this CPU time or elapsed wall clock time?  The client threads are all sleeping in ptlrpc_wait_event() for the reply from the MDS. Having more RPCs in flight means that more threads will be waiting in parallel, even if it increases the aggregate create rate. &lt;/p&gt;

&lt;p&gt;If this is elapsed wall-clock time, then that is normal and not much can be done about it. If this is CPU time, then it might be made more efficient. &lt;/p&gt;</comment>
                            <comment id="205492" author="wangshilong" created="Wed, 16 Aug 2017 10:03:12 +0000"  >&lt;p&gt;Hi Andreas,&lt;/p&gt;

&lt;p&gt;   We used the ftrace function profiler, which should not include sleep time, so the profile results&lt;br/&gt;
here should be CPU time.&lt;/p&gt;</comment>
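                            For context on the measurement above: perf-tools&apos; funccost is built on ftrace&apos;s in-kernel function profiler, which accumulates per-function hit counts and cumulative time. A minimal, hedged sketch of driving that tracefs interface directly follows; the &lt;tt&gt;ptlrpc_*&lt;/tt&gt; filter is an assumption about what to measure, and it needs root plus a kernel built with the function profiler.

```shell
# Hedged sketch of the ftrace function profiler that perf-tools' funccost
# wraps; the ptlrpc_* filter is an assumption about the functions of interest.
TRACEFS=/sys/kernel/debug/tracing

if [ -w "$TRACEFS/function_profile_enabled" ]; then
    echo 'ptlrpc_*' > "$TRACEFS/set_ftrace_filter"   # limit to the RPC layer
    echo 0 > "$TRACEFS/options/sleep-time"           # exclude time spent blocked
    echo 1 > "$TRACEFS/function_profile_enabled"
    sleep 5                                          # run the create workload now
    echo 0 > "$TRACEFS/function_profile_enabled"
    head -20 "$TRACEFS"/trace_stat/function0         # hits and cumulative us, per CPU
    status=profiled
else
    status=tracefs-unavailable                       # needs root + CONFIG_FUNCTION_PROFILER
fi
echo "status: $status"
```

The &lt;tt&gt;sleep-time&lt;/tt&gt; trace option controls whether time a task spends scheduled out is counted, which is what determines whether figures like the ones in the description are CPU time or wall time.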
                            <comment id="205498" author="jhammond" created="Wed, 16 Aug 2017 13:37:10 +0000"  >&lt;p&gt;Is this on 2.7.0 or a descendant? Could you describe the exact lustre revision and test used here?&lt;/p&gt;</comment>
                            <comment id="205643" author="pjones" created="Thu, 17 Aug 2017 17:31:51 +0000"  >&lt;p&gt;Do you see the same behaviour with Lustre 2.10?&lt;/p&gt;</comment>
                            <comment id="205651" author="adilger" created="Thu, 17 Aug 2017 17:54:50 +0000"  >&lt;p&gt;I was looking into the &lt;tt&gt;ptlrpc_wait_event()&lt;/tt&gt; code, and the only part that can take a lot of CPU time is:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        l_wait_event_exclusive_head(svcpt-&amp;gt;scp_waitq,
                                ptlrpc_thread_stopping(thread) ||
                                ptlrpc_server_request_incoming(svcpt) ||
                                ptlrpc_server_request_pending(svcpt, &lt;span class=&quot;code-keyword&quot;&gt;false&lt;/span&gt;) ||
                                ptlrpc_rqbd_pending(svcpt) ||
                                ptlrpc_at_check(svcpt), &amp;amp;lwi);

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and of all those checks, the only one that is not just a simple variable/flag check is &lt;tt&gt;ptlrpc_server_request_pending()&lt;/tt&gt;. It looks like this may be hooking into the NRS code. Is it possible you have some NRS TBF patches/policies on this system that are causing this function to be slow? Alternately, if there is memory pressure on the MDS, &lt;tt&gt;ptlrpc_wait_event()&lt;/tt&gt; may simply be called very often because of the 0.1s timeout set in this case:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ptlrpc_rqbd_pending(svcpt) &amp;amp;&amp;amp;
                    ptlrpc_server_post_idle_rqbds(svcpt) &amp;lt; 0) {
                        /* I just failed to repost request buffers.
                         * Wait &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; a timeout (unless something &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
                         * happens) before I &lt;span class=&quot;code-keyword&quot;&gt;try&lt;/span&gt; again */
                        svcpt-&amp;gt;scp_rqbd_timeout = cfs_time_seconds(1) / 10;
                        CDEBUG(D_RPCTRACE, &lt;span class=&quot;code-quote&quot;&gt;&quot;Posted buffers: %d\n&quot;&lt;/span&gt;,
                               svcpt-&amp;gt;scp_nrqbds_posted);
                }

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;What would be useful here is a &lt;a href=&quot;http://www.brendangregg.com/flamegraphs.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;flame graph&lt;/a&gt;, which shows where the CPU time is actually being spent.&lt;/p&gt;
</comment>
                            <comment id="205720" author="paf" created="Fri, 18 Aug 2017 02:37:56 +0000"  >&lt;p&gt;For what it&apos;s worth, I often see a lot of time spent here when dealing with high rates of small messages.  In my experience, when looking more closely, the culprit is the time spent doing the sleep/wake.  My data is client side, but the code is the same.&lt;/p&gt;

&lt;p&gt;I&apos;m not familiar with the tool you&apos;re using; does it break down the time spent in that function further, especially in the functions it calls? If not, I&apos;d suggest perf record/perf report (the flame graphs Andreas mentioned are one way to visualize the data it generates); it makes it easy to see where time is spent within a function and in its callees. If you are only charting total time per function, the time may be spread across a few sub-calls that each sit far down the list but together account for the time in that function.&lt;/p&gt;</comment>
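                            The perf workflow suggested above, feeding into the flame graph Andreas linked, can be sketched as follows. This is a hedged sketch, not the exact commands used on this system: the sample duration is illustrative, the FlameGraph script paths (from github.com/brendangregg/FlameGraph) are assumed to be in the working directory, and system-wide sampling needs root.

```shell
# Hedged sketch: sample on-CPU stacks system-wide while the file-create
# workload runs, then fold them into a flame graph.
if command -v perf >/dev/null 2>&1; then
    perf record -F 99 -a -g -o perf.data -- sleep 2   # 99 Hz, all CPUs, call graphs
    perf report --stdio -i perf.data | head -30       # per-function time incl. callees
    # FlameGraph scripts assumed checked out into the current directory
    perf script -i perf.data | ./stackcollapse-perf.pl | ./flamegraph.pl > mds.svg
    status=recorded
else
    status=perf-unavailable
fi
echo "status: $status"
```

Unlike a flat per-function chart, the folded stacks make it visible when several small callees under ptlrpc_wait_event() together account for its total.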
                            <comment id="205722" author="ihara" created="Fri, 18 Aug 2017 04:16:51 +0000"  >&lt;p&gt;Thanks Andreas and Patrick.&lt;br/&gt;
Yes, we just used a simple perf-tools script to confirm whether there are any places consuming a lot of CPU. Agreed; for now we need to break down the CPU time within each function. We will try perf to find exactly where it is spent.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="65896">LU-14976</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzik7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>