<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:52:30 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5557] enqueue and reint RPC are not tracked in MDS stats</title>
                <link>https://jira.whamcloud.com/browse/LU-5557</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The MDS &lt;tt&gt;stats&lt;/tt&gt; proc file &lt;tt&gt;/proc/fs/lustre/mds/MDS/mdt/stats&lt;/tt&gt; does not track LDLM_ENQUEUE and MDS_REINT RPCs.&lt;br/&gt;
These two RPC types cover most of the &quot;modifying&quot; RPCs on the MDS. As a result, the file mostly shows RPCs that &quot;read&quot; data from the MDT device, not those that &quot;write&quot; to it.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ cat /proc/fs/lustre/mds/MDS/mdt/stats
snapshot_time             1409239309.161365 secs.usecs
req_waittime              182 samples [usec] 17 420 19191 2604647
req_qdepth                182 samples [reqs] 0 1 3 3
req_active                182 samples [reqs] 1 3 251 403
req_timeout               182 samples [sec] 1 10 209 479
reqbuf_avail              463 samples [bufs] 64 64 29632 1896448
ldlm_ibits_enqueue        5 samples [reqs] 1 1 5 5
mds_getattr               1 samples [usec] 83 83 83 6889
mds_connect               6 samples [usec] 20 197 439 54031
mds_getstatus             1 samples [usec] 76 76 76 5776
mds_statfs                2 samples [usec] 74 95 169 14501
obd_ping                  167 samples [usec] 12 130 5875 249977
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
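
&lt;p&gt;For anyone post-processing this file, here is a minimal parsing sketch. It assumes the columns are name, sample count, unit, min, max, sum and sum of squares, which is inferred from the output above (for example 83 squared is 6889 on the mds_getattr line). The function name &lt;tt&gt;parse_stats&lt;/tt&gt; and the dictionary keys are hypothetical, not part of Lustre.&lt;/p&gt;

```python
# Sample lines copied from the stats output above.
SAMPLE = """\
snapshot_time             1409239309.161365 secs.usecs
req_waittime              182 samples [usec] 17 420 19191 2604647
mds_getattr               1 samples [usec] 83 83 83 6889
obd_ping                  167 samples [usec] 12 130 5875 249977
"""


def parse_stats(text):
    """Parse 'name N samples [unit] min max sum sumsq' lines into a dict."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a counter line, e.g. snapshot_time.
        if len(fields) != 8 or fields[2] != "samples":
            continue
        name = fields[0]
        count = int(fields[1])
        unit = fields[3].strip("[]")
        low, high, total, sumsq = (int(f) for f in fields[4:8])
        counters[name] = {
            "count": count,
            "unit": unit,
            "min": low,
            "max": high,
            "sum": total,
            "sumsq": sumsq,
            # The mean is derived here; it is not stored in the file.
            "mean": total / count if count else 0.0,
        }
    return counters


if __name__ == "__main__":
    for name, c in parse_stats(SAMPLE).items():
        print(f"{name}: {c['count']} samples, mean {c['mean']:.1f} {c['unit']}")
```

&lt;p&gt;With such a parser, a missing counter such as a reint or enqueue entry simply shows up as an absent key.&lt;/p&gt;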

&lt;p&gt;This class of RPCs has been explicitly blacklisted in the code for a very long time.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;+++ b/lustre/ptlrpc/service.c
@@ -2110,7 +2110,7 @@ put_conn:
         if (likely(svc-&amp;gt;srv_stats != NULL &amp;amp;&amp;amp; request-&amp;gt;rq_reqmsg != NULL)) {
                 __u32 op = lustre_msg_get_opc(request-&amp;gt;rq_reqmsg);
                 int opc = opcode_offset(op);
                 if (opc &amp;gt; 0 &amp;amp;&amp;amp; !(op == LDLM_ENQUEUE || op == MDS_REINT)) {
                         LASSERT(opc &amp;lt; LUSTRE_MAX_OPCODES);
                         lprocfs_counter_add(svc-&amp;gt;srv_stats,
                                             opc + EXTRA_MAX_OPCODES,
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
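
&lt;p&gt;To illustrate the effect of that condition, here is a toy model, not the ptlrpc code. It uses opcode names instead of numeric values and only counts requests, whereas the real code calls &lt;tt&gt;lprocfs_counter_add&lt;/tt&gt; with further arguments that are cut off in the excerpt. The names &lt;tt&gt;UNTRACKED&lt;/tt&gt; and &lt;tt&gt;count_rpcs&lt;/tt&gt; and the request trace are made up for the example.&lt;/p&gt;

```python
from collections import Counter

# Opcode names taken from the excerpt above; using names rather than
# numeric opcode values is a simplification for this sketch.
UNTRACKED = {"LDLM_ENQUEUE", "MDS_REINT"}


def count_rpcs(opcodes, skip=UNTRACKED):
    """Tally handled RPCs per opcode, ignoring any opcode in skip."""
    stats = Counter()
    for op in opcodes:
        if op in skip:
            # Mirrors the !(op == LDLM_ENQUEUE || op == MDS_REINT) test.
            continue
        stats[op] += 1
    return stats


# Hypothetical request trace, for illustration only.
TRACE = ["MDS_CONNECT", "LDLM_ENQUEUE", "MDS_REINT",
         "MDS_REINT", "MDS_GETATTR", "OBD_PING"]

if __name__ == "__main__":
    print("with blacklist:   ", dict(count_rpcs(TRACE)))
    print("without blacklist:", dict(count_rpcs(TRACE, skip=set())))
```

&lt;p&gt;With the blacklist in place the modifying requests never reach the counters, which matches what the stats file shows.&lt;/p&gt;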

&lt;p&gt;Is there a specific reason to prevent that?&lt;/p&gt;

&lt;p&gt;Could we consider enabling them?&lt;/p&gt;</description>
                <environment></environment>
        <key id="26216">LU-5557</key>
            <summary>enqueue and reint RPC are not tracked in MDS stats</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="jhammond">John Hammond</assignee>
                                    <reporter username="adegremont">Aurelien Degremont</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Thu, 28 Aug 2014 15:28:51 +0000</created>
                <updated>Mon, 6 Oct 2014 12:02:09 +0000</updated>
                            <resolved>Mon, 6 Oct 2014 11:56:20 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                    <version>Lustre 2.5.2</version>
                    <version>Lustre 2.4.3</version>
                                    <fixVersion>Lustre 2.7.0</fixVersion>
                                        <due></due>
                            <votes>1</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="92728" author="adilger" created="Thu, 28 Aug 2014 17:16:15 +0000"  >&lt;p&gt;I think John already has a patch to fix this. &lt;/p&gt;</comment>
                            <comment id="92809" author="jhammond" created="Fri, 29 Aug 2014 15:56:32 +0000"  >&lt;p&gt;I did. I&apos;ll restore it and look at addressing your comments.&lt;/p&gt;

&lt;p&gt;On this subject, is it in our long-term interest to replace these jumbo opcodes (MDS_REINT and LDLM_ENQUEUE) with specific opcodes (MDS_OPEN, MDS_CREATE, MDS_UNLINK, ...)? It has been pointed out that this would make RPC traces much more useful. I&apos;m not sure what &quot;reint&quot; means, and I don&apos;t think knowing would help anything.&lt;/p&gt;</comment>
                            <comment id="93046" author="adilger" created="Tue, 2 Sep 2014 21:41:46 +0000"  >&lt;p&gt;Once upon a time, there was a filesystem named Intermezzo that allowed clients to disconnect from the server while using and optionally modifying their locally cached copy of the data.  When the client reconnected to the server, it would reintegrate the log of changes that it had made locally to get the server copy back in sync with the client.  The thought for Lustre was to allow clients to eventually do the same thing.&lt;/p&gt;

&lt;p&gt;Initially, Lustre clients would only send individual reintegration records to the MDT to change the metadata, but in the future it would be possible to reintegrate a series of changes efficiently, allowing writeback caching (WBC) clients and/or disconnected operation.  In that case, the type of any individual operation isn&apos;t known in advance, and there may in fact be multiple different operations sent in the same RPC.  Hence, there is only the MDS_REINT RPC type instead of separate RPC handlers for each update type.  That said, it would be possible to send different RPC types for statistical purposes, and have all of the RPC handlers be the same piece of code.&lt;/p&gt;

&lt;p&gt;Similarly, while LDLM_ENQUEUE today is commonly used for open (along with an open intent), it may be used for other kinds of locking operations on the MDS (e.g. to re-enqueue a lock in revalidate after it has been cancelled due to conflict) as well as extent locks on the OSS.  I don&apos;t think it would be possible to change LDLM_ENQUEUE to MDS_OPEN as a result.&lt;/p&gt;</comment>
                            <comment id="93165" author="adilger" created="Wed, 3 Sep 2014 23:24:25 +0000"  >&lt;p&gt;I also recently found &lt;a href=&quot;http://review.whamcloud.com/342&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/342&lt;/a&gt; which fixes up some of this same code.&lt;/p&gt;</comment>
                            <comment id="94010" author="jhammond" created="Mon, 15 Sep 2014 17:41:18 +0000"  >&lt;p&gt;&amp;gt; Similarly, while LDLM_ENQUEUE today is commonly used for open (along with an open intent), it may be used for other kinds of locking operations on the MDS (e.g. to re-enqueue a lock in revalidate after it has been cancelled due to conflict) as well as extent locks on the OSS. I don&apos;t think it would be possible to change LDLM_ENQUEUE to MDS_OPEN as a result.&lt;/p&gt;

&lt;p&gt;Then MDS_ENQUEUE_OPEN.&lt;/p&gt;</comment>
                            <comment id="94096" author="jhammond" created="Mon, 15 Sep 2014 21:02:52 +0000"  >&lt;p&gt;Please see &lt;a href=&quot;http://review.whamcloud.com/11924&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/11924&lt;/a&gt; for the reint stats.&lt;/p&gt;</comment>
                            <comment id="95691" author="pjones" created="Mon, 6 Oct 2014 11:56:20 +0000"  >&lt;p&gt;Landed for 2.7&lt;/p&gt;</comment>
                            <comment id="95697" author="adegremont" created="Mon, 6 Oct 2014 12:02:09 +0000"  >&lt;p&gt;Could we consider this for 2.5.4?&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwuuv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>15496</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>