<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:59:35 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6365] Eliminate unnecessary loop in lu_cache_shrink to improve performance</title>
                <link>https://jira.whamcloud.com/browse/LU-6365</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Customer has a test application that tries to allocate 20000 2M huge pages. After the node has been up and running for some period of time and fragmentation has occurred, the allocation takes several minutes before failing.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# time aprun -m20000h -L 2168 -n 2 -N 2 -ss -cc cpu -j 2 -d 24 ./prog2
&amp;gt; 
&amp;gt; Started at 11:34:02
&amp;gt; 
&amp;gt;   Node   Name   Huge Page(MB)   MemFree(MB) Small0 (MB)  Small1(MB)  Huge0(MB)   Huge1(MB)  
&amp;gt;    0 nid02168       40000       19908        9814        1109           0        8398
&amp;gt; 
&amp;gt; Allocated  19800MB
&amp;gt; At address      10000080000
&amp;gt; Application 48079344 is crashing. ATP analysis proceeding...
&amp;gt; 
&amp;gt; ATP Stack walkback for Rank 0 starting:
&amp;gt;   start_thread@0x2aaaaee6f805
&amp;gt;   _new_slave_entry@0x2aaaab3b0f79
&amp;gt;   memcheck__cray$mt$p0001@prog2.f90:103
&amp;gt;   sub_@sub.f90:5
&amp;gt;   sub2_@sub2.f90:9
&amp;gt;   touch_@touch.f90:1
&amp;gt; ATP Stack walkback for Rank 0 done
&amp;gt; Process died with signal 7: &apos;Bus error&apos;
&amp;gt; Forcing core dumps of ranks 0, 1
[skip]
&amp;gt; [NID 02168] 2015-03-09 11:34:13 Apid 48079344: Huge page could not be allocated.  Process terminated via bus error.
&amp;gt; Application 48079344 exit codes: 139
&amp;gt; Application 48079344 resources: utime ~92s, stime ~9s, Rss ~123412, inblocks ~628, outblocks ~98
&amp;gt; 
&amp;gt; real    2m19.012s
&amp;gt; user    0m0.984s
&amp;gt; sys     0m0.172s
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In researching the slowness, I noticed that the lu_cache_shrink shrinker takes noticeably longer to execute than the other shrinkers.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Kernel tracing - second column is time
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085636: shrink_slab &amp;lt;-do_try_to_free_pages
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085636: down_read_trylock &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085636: shrink_dcache_memory &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085636: shrink_icache_memory &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085636: shrink_dqcache_memory &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085637: lu_cache_shrink &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085990: enc_pools_shrink &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085990: ldlm_pools_srv_shrink &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.085994: ldlm_pools_cli_shrink &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.086004: up_read &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.086004: _cond_resched &amp;lt;-shrink_slab
           &amp;lt;...&amp;gt;-35127 [013] 2965136.086006: shrink_slab &amp;lt;-do_try_to_free_pages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Most of the time in lu_cache_shrink is spent repeatedly acquiring a spinlock in lu_site_stats_get().&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt;            &amp;lt;...&amp;gt;-46696 [012] 4654974.718236: cfs_hash_hd_hhead_size &amp;lt;-lu_site_stats_get
&amp;gt;            &amp;lt;...&amp;gt;-46696 [012] 4654974.718237: _raw_spin_lock &amp;lt;-cfs_hash_spin_lock
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The lu_cache_shrink algorithm loops over the entries in lu_sites (8 in this customer&apos;s case), calling lu_site_stats_get for each to compute the number of freeable objects. lu_site_stats_get in turn loops over every bucket in ls_obj_hash (256 buckets), summing the lengths of the LRU lists in the buckets.&lt;/p&gt;
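
&lt;p&gt;As a rough illustration of the loop described above, here is a simplified Python model (not Lustre code) that contrasts the per-bucket scan with reading a running total maintained on each LRU insert and removal. The class and method names (Site, Bucket, lru_add, lru_del, count_by_scan, count_by_total, freeable) are hypothetical and chosen only for this sketch; locking is left out entirely.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
class Bucket:
    """One hash bucket with its own LRU list (models lsb_lru)."""

    def __init__(self):
        self.lru = []


class Site:
    """Simplified stand-in for lu_site: buckets plus a running LRU total."""

    def __init__(self, nr_buckets=256):
        self.buckets = [Bucket() for _ in range(nr_buckets)]
        self.lru_total = 0  # hypothetical aggregated counter

    def lru_add(self, obj):
        # Every insertion updates the per-site total alongside the bucket list.
        self.buckets[hash(obj) % len(self.buckets)].lru.append(obj)
        self.lru_total += 1

    def lru_del(self, obj):
        # Every removal decrements the same total, keeping it in sync.
        self.buckets[hash(obj) % len(self.buckets)].lru.remove(obj)
        self.lru_total -= 1

    def count_by_scan(self):
        # Scan approach: visit every bucket on each shrinker call.
        return sum(len(b.lru) for b in self.buckets)

    def count_by_total(self):
        # Running-total approach: constant-time read of the counter.
        return self.lru_total


def freeable(sites, use_total):
    # Models the shrinker's count of freeable objects across all sites.
    if use_total:
        return sum(s.count_by_total() for s in sites)
    return sum(s.count_by_scan() for s in sites)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With 8 sites of 256 buckets each, the scan visits 2048 buckets per call, while the running total needs only 8 reads. In the real code each bucket visit also takes a spinlock, which this model does not show.&lt;/p&gt;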

&lt;p&gt;The debug data suggests that most of the time taken by lu_cache_shrink is spent in lu_site_stats_get. This overhead can be eliminated simply by keeping an aggregated total of the lsb_lru_len from all buckets in the lu_site struct. In other words, keep a running count of total LRU objects rather than recomputing the total each time lu_cache_shrink is called.&lt;/p&gt;</description>
                <environment></environment>
        <key id="29095">LU-6365</key>
            <summary>Eliminate unnecessary loop in lu_cache_shrink to improve performance</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="amk">Ann Koehler</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 13 Mar 2015 20:45:04 +0000</created>
                <updated>Mon, 18 Jul 2016 19:54:46 +0000</updated>
                            <resolved>Mon, 10 Aug 2015 13:04:34 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                    <version>Lustre 2.5.0</version>
                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>9</watches>
                                                                            <comments>
                            <comment id="109683" author="gerrit" created="Fri, 13 Mar 2015 21:11:53 +0000"  >&lt;p&gt;Ann Koehler (amk@cray.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14066&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14066&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6365&quot; title=&quot;Eliminate unnecessary loop in lu_cache_shrink to improve performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6365&quot;&gt;&lt;del&gt;LU-6365&lt;/del&gt;&lt;/a&gt; obd: Eliminate hash bucket scans in lu_cache_shrink&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 70fae2a56157456c56bf63937c531735e206c166&lt;/p&gt;</comment>
                            <comment id="109685" author="amk" created="Fri, 13 Mar 2015 21:15:32 +0000"  >&lt;p&gt;Patch:  &lt;a href=&quot;http://review.whamcloud.com/14066&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14066&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="121294" author="amk" created="Tue, 14 Jul 2015 22:15:43 +0000"  >&lt;p&gt;The following performance stats were collected from a single client with 2 Lustre file systems mounted. The data is from the Ftrace kernel tracing facility. The traces are for calls to lu_cache_shrink while trying to allocate hugepages with the command: &lt;/p&gt;

&lt;p&gt;echo 50000 &amp;gt; /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Function profile output without patch:
  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  shrink_slab                         380928    112320774 us    294.860 us      29.819 us
  lu_cache_shrink                     380928    103623988 us    272.030 us      25.932 us
  lu_site_stats_get                   761856    102356451 us    134.351 us      11.891 us
  ldlm_pools_shrink                   761856    7272808 us      9.546 us        18.779 us
  ldlm_pools_cli_shrink               380928    5328679 us      13.988 us       1.275 us
  ldlm_pools_srv_shrink               380928    2113786 us      5.549 us        0.492 us
  enc_pools_shrink                    380928    428753.4 us     1.125 us        0.098 us
  shrink_icache_memory                380928    24950.44 us     0.065 us        0.006 us
  shrink_dcache_memory                380928    24717.95 us     0.064 us        0.004 us
  shrink_dqcache_memory               380928    23187.41 us     0.060 us        0.005 us

Function profile output with patch:
  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  shrink_slab                         421376    9721834 us      23.071 us       25.307 us
  ldlm_pools_shrink                   842752    7500651 us      8.900 us        14.925 us
  ldlm_pools_cli_shrink               421376    5375698 us      12.757 us       2.718 us
  ldlm_pools_srv_shrink               421376    2302504 us      5.464 us        0.514 us
  lu_cache_shrink                     421376    635577.7 us     1.508 us        0.127 us
  enc_pools_shrink                    421376    486304.1 us     1.154 us        0.099 us
  shrink_icache_memory                421378    30136.85 us     0.071 us        20.470 us
  shrink_dcache_memory                421376    26503.42 us     0.062 us        0.003 us
  shrink_dqcache_memory               421376    25776.27 us     0.061 us        0.004 us

Function graph output without patch:
  7)               |  shrink_slab() {
  7)   0.084 us    |    down_read_trylock();
  7)   0.063 us    |    shrink_dcache_memory();
  7)   0.063 us    |    shrink_icache_memory();
  7)   0.074 us    |    shrink_dqcache_memory();
  7)               |    lu_cache_shrink() {
  7) ! 1160.507 us |    }
  7)               |    enc_pools_shrink() {
  7)   2.748 us    |    }
  7)               |    ldlm_pools_srv_shrink() {
  7) + 13.010 us   |    }
  7)               |    ldlm_pools_cli_shrink() {
  7) + 27.887 us   |    }
  7) ! 1210.108 us |  }

Function graph output with patch:
  6)               |  shrink_slab() {
  6)   0.182 us    |    down_read_trylock();
  6)   0.061 us    |    shrink_dcache_memory();
  6)   0.071 us    |    shrink_icache_memory();
  6)   0.062 us    |    shrink_dqcache_memory();
  6)               |    lu_cache_shrink() {
  6)   5.886 us    |    }
  6)               |    enc_pools_shrink() {
  6)   2.905 us    |    }
  6)               |    ldlm_pools_srv_shrink() {
  6) + 15.619 us   |    }
  6)               |    ldlm_pools_cli_shrink() {
  6) + 35.413 us   |    }
  6) + 67.345 us   |  }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Multiple measurements were taken. Here&apos;s a brief summary:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;            Time in          Time in            % of shrink_slab time
            shrink_slab      lu_cache_shrink    in lu_cache_shrink
no patch    1210.108 us      1160.507 us        95%
no patch     623.078 us       576.420 us        92%
patch         67.345 us         5.886 us        8.7%
patch         54.808 us         2.604 us        4.75%
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="123668" author="gerrit" created="Sun, 9 Aug 2015 23:39:19 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/14066/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14066/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6365&quot; title=&quot;Eliminate unnecessary loop in lu_cache_shrink to improve performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6365&quot;&gt;&lt;del&gt;LU-6365&lt;/del&gt;&lt;/a&gt; obd: Eliminate hash bucket scans in lu_cache_shrink&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: fc2f0c39d8bed5060774e3b4cca7bf13faa3692a&lt;/p&gt;</comment>
                            <comment id="123698" author="pjones" created="Mon, 10 Aug 2015 13:04:34 +0000"  >&lt;p&gt;Landed for 2.8&lt;/p&gt;</comment>
                            <comment id="132204" author="jaylan" created="Fri, 30 Oct 2015 19:28:40 +0000"  >&lt;p&gt;When trying to back port the commit to 2.5.3-fe, I got a conflict in lu_object_put():&lt;/p&gt;

&lt;p&gt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; HEAD&lt;br/&gt;
        if (!lu_object_is_dying(top)) {&lt;br/&gt;
                LASSERT(cfs_list_empty(&amp;amp;top-&amp;gt;loh_lru));&lt;br/&gt;
                cfs_list_add_tail(&amp;amp;top-&amp;gt;loh_lru, &amp;amp;bkt-&amp;gt;lsb_lru);&lt;br/&gt;
=======&lt;br/&gt;
        if (!lu_object_is_dying(top) &amp;amp;&amp;amp;&lt;br/&gt;
            (lu_object_exists(orig) || lu_object_is_cl(orig))) {&lt;br/&gt;
                LASSERT(list_empty(&amp;amp;top-&amp;gt;loh_lru));&lt;br/&gt;
                list_add_tail(&amp;amp;top-&amp;gt;loh_lru, &amp;amp;bkt-&amp;gt;lsb_lru);&lt;br/&gt;
                bkt-&amp;gt;lsb_lru_len++;&lt;br/&gt;
                lprocfs_counter_incr(site-&amp;gt;ls_stats, LU_SS_LRU_LEN);&lt;br/&gt;
                CDEBUG(D_INODE, &quot;Add %p to site lru. hash: %p, bkt: %p, &quot;&lt;br/&gt;
                       &quot;lru_len: %ld\n&quot;,&lt;br/&gt;
                       o, site-&amp;gt;ls_obj_hash, bkt, bkt-&amp;gt;lsb_lru_len);&lt;br/&gt;
&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt; fc2f0c3... &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6365&quot; title=&quot;Eliminate unnecessary loop in lu_cache_shrink to improve performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6365&quot;&gt;&lt;del&gt;LU-6365&lt;/del&gt;&lt;/a&gt; obd: Eliminate hash bucket scans in lu_cache_shrink&lt;/p&gt;

&lt;p&gt;Could you provide me a back port? Or do I need to pick up another commit before this?&lt;br/&gt;
Thanks!&lt;/p&gt;
</comment>
                            <comment id="132215" author="amk" created="Fri, 30 Oct 2015 20:43:59 +0000"  >&lt;p&gt;Attached 2.5.3 patch file. If this doesn&apos;t work, give me a copy of your lustre/obdclass/lu_object.c and I&apos;ll make the necessary changes. &lt;/p&gt;</comment>
                            <comment id="132245" author="jaylan" created="Fri, 30 Oct 2015 22:15:07 +0000"  >&lt;p&gt;The conflict I had was caused by two LU&apos;s that I do not have: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6147&quot; title=&quot;sanity-lfsck test_4: &amp;#39;(7) unexpected status&amp;#39; &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6147&quot;&gt;&lt;del&gt;LU-6147&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5722&quot; title=&quot;memory allocation deadlock under lu_cache_shrink()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5722&quot;&gt;&lt;del&gt;LU-5722&lt;/del&gt;&lt;/a&gt;.&lt;br/&gt;
Are these prerequisites for this LU? If there is no dependency, I can resolve the cherry-pick conflicts.&lt;br/&gt;
Thank you for your help!&lt;/p&gt;</comment>
                            <comment id="133238" author="amk" created="Wed, 11 Nov 2015 15:18:18 +0000"  >&lt;p&gt;There&apos;s no dependency on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6147&quot; title=&quot;sanity-lfsck test_4: &amp;#39;(7) unexpected status&amp;#39; &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6147&quot;&gt;&lt;del&gt;LU-6147&lt;/del&gt;&lt;/a&gt; or &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5722&quot; title=&quot;memory allocation deadlock under lu_cache_shrink()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5722&quot;&gt;&lt;del&gt;LU-5722&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;(Sorry for the slow reply. I was out of the office and not reading e-mail.)&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="31645">LU-7038</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="19502" name="LU-6365_cray_2.5.3.diff" size="5374" author="amk" created="Fri, 30 Oct 2015 20:43:59 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx8in:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>