<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:25:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2454] Reduce memory usage of ptlrpc stats </title>
                <link>https://jira.whamcloud.com/browse/LU-2454</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;As noted in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1282&quot; title=&quot;Lustre 2.1 client memory usage at mount is excessive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1282&quot;&gt;&lt;del&gt;LU-1282&lt;/del&gt;&lt;/a&gt; the per-cpu component of a ptlrpc stats array uses 8K of memory.  For large-core-count, low-memory platforms this is indeed excessive.  For example, a Xeon Phi that connects to 456 OSTs will eventually use 10% of memory in ptlrpc stats.&lt;/p&gt;

&lt;p&gt;The high memory use is due to the number of RPC opcodes which must be supported together with the implementation of the counter array.  When I counted there were 77 RPC opcodes supported plus 19 extra opcodes MDS_REINT_xxx and LDLM_ENQUEUE_....  This gets us to 8K after slab rounding: (77 + 19) * sizeof(lprocfs_counter) = 7680.  (We are only 6 opcodes away from using 16K per cpu.)  However I cannot find an instance of any client (mdc, osc, osp) that uses more than 14 opcodes.&lt;/p&gt;
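
&lt;p&gt;Spelled out, assuming sizeof(struct lprocfs_counter) is 80 bytes (which the 7680 figure implies):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;(77 + 19) * sizeof(struct lprocfs_counter)
  = 96 * 80
  = 7680 bytes
slab rounding takes 7680 up to the next power of two: 8192 bytes (8K) per cpu
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;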

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# find /proc/fs/lustre/ -name stats -exec grep -q ^req_waittime {} \; -exec wc -l {} \; | column -t
7   /proc/fs/lustre/ldlm/services/ldlm_canceld/stats
7   /proc/fs/lustre/ldlm/services/ldlm_cbd/stats
10  /proc/fs/lustre/osp/lustre-OST0001-osc-MDT0000/stats
9   /proc/fs/lustre/osp/lustre-OST0000-osc-MDT0000/stats
5   /proc/fs/lustre/osp/lustre-MDT0000-osp-OST0001/stats
5   /proc/fs/lustre/osp/lustre-MDT0000-osp-OST0000/stats
5   /proc/fs/lustre/osp/lustre-MDT0000-osp-MDT0000/stats
9   /proc/fs/lustre/ost/OSS/ost_io/stats
7   /proc/fs/lustre/ost/OSS/ost_create/stats
14  /proc/fs/lustre/ost/OSS/ost/stats
7   /proc/fs/lustre/mdt/lustre-MDT0000/mdt_fld/stats
7   /proc/fs/lustre/mdt/lustre-MDT0000/mdt_mdss/stats
8   /proc/fs/lustre/mdt/lustre-MDT0000/mdt_readpage/stats
14  /proc/fs/lustre/mdt/lustre-MDT0000/mdt/stats
4   /proc/fs/lustre/mdt/lustre-MDT0000/stats
14  /proc/fs/lustre/mgs/MGS/mgs/stats
13  /proc/fs/lustre/osc/lustre-OST0001-osc-ffff8801e48f0000/stats
11  /proc/fs/lustre/osc/lustre-OST0000-osc-ffff8801e48f0000/stats
15  /proc/fs/lustre/mdc/lustre-MDT0000-mdc-ffff8801e48f0000/stats
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The largest is an mdc, which used 14 (including MDS_REINT and LDLM_ENQUEUE):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /proc/fs/lustre/mdc/lustre-MDT0000-mdc-ffff8801e48f0000/stats
snapshot_time             1355161746.36229 secs.usecs
req_waittime              3298 samples [usec] 57 221559 5204570 294225983354
req_active                3298 samples [reqs] 1 10 3669 5069
mds_getattr               56 samples [usec] 89 5071 21884 31118008
mds_getattr_lock          27 samples [usec] 595 916 17880 12026662
mds_close                 603 samples [usec] 180 3466 218004 95843514
mds_readpage              52 samples [usec] 329 1635 29738 19445852
mds_connect               2 samples [usec] 179 880 1059 806441
mds_getstatus             1 samples [usec] 57 57 57 3249
mds_statfs                20 samples [usec] 166 438 6112 1956252
mds_getxattr              175 samples [usec] 197 801 43570 12160700
ldlm_cancel               74 samples [usec] 222 6626 39097 64121733
obd_ping                  11 samples [usec] 63 587 3797 1560175
seq_query                 1 samples [usec] 185628 185628 185628 34457754384
fld_query                 2 samples [usec] 314 394 708 253832
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Based on this, I propose using a fixed-length open-addressed hash to tally the opcodes used by each client.  If we set the number of slots to 16 then we can provide the same information using 1K per cpu.&lt;/p&gt;</description>
                <environment></environment>
        <key id="16884">LU-2454</key>
            <summary>Reduce memory usage of ptlrpc stats </summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="jhammond">John Hammond</reporter>
                        <labels>
                            <label>client</label>
                    </labels>
                <created>Mon, 10 Dec 2012 13:36:13 +0000</created>
                <updated>Thu, 31 Jul 2014 14:46:37 +0000</updated>
                            <resolved>Thu, 31 Jul 2014 14:46:37 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                <comments>
                            <comment id="49002" author="jhammond" created="Mon, 10 Dec 2012 13:48:56 +0000"  >&lt;p&gt;See &lt;a href=&quot;http://review.whamcloud.com/4792&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4792&lt;/a&gt; for a draft patch.&lt;/p&gt;</comment>
                            <comment id="49003" author="jhammond" created="Mon, 10 Dec 2012 14:00:10 +0000"  >&lt;p&gt;&lt;b&gt;In the description I should have said &quot;not including MDS_REINT and LDLM_ENQUEUE&quot;.&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;Note that this patch leaves the old ptlrpc stats in-place for the sake of comparison.  The new stats (intended to replace the old) are available in /proc/ under the name ptlrpc_cli_stats.&lt;/p&gt;
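&lt;p&gt;For illustration, the per-client fixed-slot table behind ptlrpc_cli_stats could be probed roughly as below (16 slots, linear probing; the struct layout and field names here are a sketch, not necessarily what the patch actually does):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#define PCS_SLOTS 16

struct pcs_slot {
        __u32 pcs_opcode;               /* 0 marks an empty slot in this sketch */
        struct lprocfs_counter pcs_ctr;
};

/* Linear probing: start at (opcode % PCS_SLOTS) and scan forward.
 * Returns the matching or first empty slot, or NULL once the table
 * is full (more than PCS_SLOTS distinct opcodes seen). */
static struct pcs_slot *pcs_search(struct pcs_slot *tab, __u32 opcode)
{
        unsigned int i;

        for (i = 0; i &lt; PCS_SLOTS; i++) {
                struct pcs_slot *slot = &amp;tab[(opcode + i) % PCS_SLOTS];

                if (slot-&gt;pcs_opcode == opcode || slot-&gt;pcs_opcode == 0)
                        return slot;
        }
        return NULL;
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
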
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# find /proc/fs/lustre/ -name ptlrpc_cli_stats
/proc/fs/lustre/osp/lustre-OST0001-osc-MDT0000/ptlrpc_cli_stats
/proc/fs/lustre/osp/lustre-OST0000-osc-MDT0000/ptlrpc_cli_stats
/proc/fs/lustre/osp/lustre-MDT0000-osp-OST0001/ptlrpc_cli_stats
/proc/fs/lustre/osp/lustre-MDT0000-osp-OST0000/ptlrpc_cli_stats
/proc/fs/lustre/osp/lustre-MDT0000-osp-MDT0000/ptlrpc_cli_stats
/proc/fs/lustre/mdt/lustre-MDT0000/ptlrpc_cli_stats
/proc/fs/lustre/osc/lustre-OST0001-osc-ffff8801e48f0000/ptlrpc_cli_stats
/proc/fs/lustre/osc/lustre-OST0000-osc-ffff8801e48f0000/ptlrpc_cli_stats
/proc/fs/lustre/mdc/lustre-MDT0000-mdc-ffff8801e48f0000/ptlrpc_cli_stats
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /proc/fs/lustre/osc/lustre-OST0001-osc-ffff8801e48f0000/stats
snapshot_time             1355161403.512704 secs.usecs
req_waittime              80 samples [usec] 60 11830 72274 315613446
req_active                80 samples [reqs] 1 3 85 97
write_bytes               7 samples [bytes] 3 786432 1287584 742367009314
ost_setattr               23 samples [usec] 349 3183 13524 15072470
ost_write                 7 samples [usec] 1207 11830 35510 285830346
ost_connect               1 samples [usec] 298 298 298 88804
ost_punch                 4 samples [usec] 442 1463 3087 3035753
ost_statfs                4 samples [usec] 85 295 767 181775
ldlm_cancel               6 samples [usec] 318 1127 3907 3172199
obd_ping                  4 samples [usec] 60 204 495 72161
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /proc/fs/lustre/osc/lustre-OST0001-osc-ffff8801e48f0000/ptlrpc_cli_stats
req_waittime              77 samples [usec] 60 11830 71065 315091891
req_active                77 samples [reqs] 1 3 82 94
write_bytes               7 samples [bytes] 3 786432 1287584 742367009314
obd_ping                  4 samples [usec] 60 204 495 72161
ost_setattr               22 samples [usec] 349 3183 12989 14786245
ost_write                 7 samples [usec] 1207 11830 35510 285830346
ldlm_enqueue              30 samples [usec] 273 1256 14285 7999137
ldlm_cancel               6 samples [usec] 318 1127 3907 3172199
ost_connect               1 samples [usec] 298 298 298 88804
ost_punch                 4 samples [usec] 442 1463 3087 3035753
ost_statfs                3 samples [usec] 85 295 494 107246
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;(The snapshot_time line is missing but easily added if people are into that.)&lt;/p&gt;

&lt;p&gt;The use of atomic_t entry/exit counters to protect accesses to the&lt;br/&gt;
count, min, max, sum, sum_sq has been copied over from the original&lt;br/&gt;
implementation.  This is not to say that I endorse it: there are&lt;br/&gt;
several places where I believe that rmb() or wmb() should be inserted&lt;br/&gt;
for correctness.  I wondered if seqlock_t was evaluated for this&lt;br/&gt;
purpose, and if so, why wasn&apos;t it used?&lt;/p&gt;
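
&lt;p&gt;(For comparison, a seqcount-based scheme would have readers retry until they see a consistent snapshot rather than relying on entry/exit counters; this is a sketch only, with illustrative field names, not code from the patch:)&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/* Writer (update side; per-cpu, so writers do not race each other): */
write_seqcount_begin(&amp;c-&gt;lc_seq);
c-&gt;lc_count++;
c-&gt;lc_sum += amount;
write_seqcount_end(&amp;c-&gt;lc_seq);

/* Reader: loop until a consistent snapshot is observed. */
do {
        seq = read_seqcount_begin(&amp;c-&gt;lc_seq);
        count = c-&gt;lc_count;
        sum = c-&gt;lc_sum;
} while (read_seqcount_retry(&amp;c-&gt;lc_seq, seq));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;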

&lt;p&gt;I ran sanity with the attached patch 7c3dd04 and found that the probing performed well.  With 134975 calls to pcs_search() there were only 2836 cases in which the first probe missed.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvdin:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5796</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>