<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:10:27 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-14517] Decrease default lru_max_age value</title>
                <link>https://jira.whamcloud.com/browse/LU-14517</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;lru_max_age has been set to 65 minutes for ages. This is a very long time, and it makes clients keep lots of LDLM locks in cache, and therefore more data in cache. The only reason for this that I&apos;m aware of is to help login nodes avoid re-requesting the same locks too often.&lt;/p&gt;

&lt;p&gt;I know a lot of sites tune this value to something much smaller. Compute nodes usually use a more aggressive value, and the LRU cache is cleared between jobs anyway.&lt;/p&gt;

&lt;p&gt;I think it would be valuable to change the default to a more reasonable value, so that most users do not have to decrease it themselves. I think going down to 5 minutes would be a good move.&lt;/p&gt;

&lt;p&gt;If login nodes have to re-enqueue some locks every 5 minutes, I don&apos;t think that is a problem at all.&lt;/p&gt;

&lt;p&gt;What do you think?&lt;/p&gt;</description>
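<!-- Editor's note: the tuning discussed in this issue is done with lctl. A minimal
sketch, assuming a recent Lustre client where lru_max_age is expressed in
milliseconds (the default 3900000 ms = 65 min); parameter paths, units, and
persistent-setting support differ between releases, so check your version first.

```shell
# Inspect the current per-namespace lock LRU max age on a client
lctl get_param ldlm.namespaces.*.lru_max_age

# Lower it to 10 minutes (600000 ms) on this client only
lctl set_param ldlm.namespaces.*.lru_max_age=600000

# Or persistently for all clients, run from the MGS node
lctl set_param -P ldlm.namespaces.*.lru_max_age=600000
```
-->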
                <environment></environment>
        <key id="63324">LU-14517</key>
            <summary>Decrease default lru_max_age value</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="degremoa">Aurelien Degremont</reporter>
                        <labels>
                    </labels>
                <created>Fri, 12 Mar 2021 17:05:30 +0000</created>
                <updated>Sun, 21 Jan 2024 19:14:59 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="294864" author="simmonsja" created="Fri, 12 Mar 2021 17:59:36 +0000"  >&lt;p&gt;The reason this is done is that a common workload is to write a checkpoint every hour.&lt;/p&gt;</comment>
                            <comment id="294880" author="adilger" created="Fri, 12 Mar 2021 20:30:49 +0000"  >&lt;p&gt;Aurelien, I wouldn&apos;t be against this. Maybe 5 minutes is a bit too short, but 10 minutes would be better?  This is set via the &lt;tt&gt;LDLM_DEFAULT_MAX_ALIVE&lt;/tt&gt; value.&lt;/p&gt;

&lt;p&gt;See also &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6402&quot; title=&quot;reduce the value of LDLM_POOL_MAX_AGE&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6402&quot;&gt;LU-6402&lt;/a&gt; for the &lt;tt&gt;LDLM_POOL_MAX_AGE&lt;/tt&gt; value, which is still at the &lt;em&gt;very&lt;/em&gt; old 36000s/10h default and controls how (badly) the dynamic LRU pressure keeps locks cached on the client. It makes sense to reduce this to at least 3600s, and probably lower.&lt;/p&gt;
</comment>
                            <comment id="294883" author="adilger" created="Fri, 12 Mar 2021 21:03:59 +0000"  >&lt;p&gt;Aurelien, note that &lt;tt&gt;lru_max_age&lt;/tt&gt; should really only be a fallback upper limit for the dynamic LRU pool management.  Unfortunately, the dynamic LRU code has not been working well for a long time (see &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7266&quot; title=&quot;Fix LDLM pool to make LRUR working properly&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7266&quot;&gt;LU-7266&lt;/a&gt;), and often users disable it by setting &lt;tt&gt;lru_max_age&lt;/tt&gt; and &lt;tt&gt;lru_size=N&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;However, that is sub-optimal, since it means some clients may have too many locks while others have too few, and setting too high a limit causes memory pressure on the servers and/or clients.&lt;/p&gt;

&lt;p&gt;What is really needed here is some investigation into the LDLM pool &quot;Lock Volume&quot; calculations to see why they are not working.  The basic theory is that sum(age of locks) is a &quot;volume&quot; that the server distributes among clients, and each client can manage locks within that volume as it sees fit (many short-lived locks, or a few long-lived locks).  If the client&apos;s lock volume grows to exceed its assigned limit (due to aging of old locks and/or acquiring many new locks), it should cancel its oldest unused locks to reduce the volume again.  The client is really in the best position to judge which of its locks are most important, but as a workaround for memory pressure issues, &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6529&quot; title=&quot;Server side lock limits to avoid unnecessary memory exhaustion&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6529&quot;&gt;&lt;del&gt;LU-6529&lt;/del&gt;&lt;/a&gt; was implemented to give the server the ability to cancel locks more aggressively to avoid OOM.&lt;/p&gt;

&lt;p&gt;It &lt;em&gt;may&lt;/em&gt; be that &lt;tt&gt;LDLM_POOL_MAX_AGE&lt;/tt&gt; is simply set much too high, and/or the DLM server is allowing too much memory to be put toward locks (e.g. not considering multiple namespaces, or assigning too large a fraction of RAM to LDLM vs. filesystem cache), so the clients are not cancelling locks aggressively enough.  There may also be issues with the hooks into the kernel slab cache shrinkers not working properly (these should reduce the lock volume on the server to force clients to cancel locks, and on the client directly cancel locks).&lt;/p&gt;


&lt;p&gt;The other area that could benefit is replacing the strict LRU managing the locks on the client.  For clients doing things like filesystem scanning, strict LRU is not a very good algorithm, since it flushes out &quot;valuable&quot; locks too quickly (e.g. parent directory locks) and doesn&apos;t drop &quot;boring&quot; locks (e.g. the use-once locks for individual files).  Using a better caching algorithm (e.g. LFRU, 2Q/SLRU, ARC) would go a long way toward improving lock cache usage on the client.  ARC is probably the best choice, since it would be possible to keep the FIDs in the &quot;ghost&quot; lists without actually caching the lock/pages, and if some frequently-used lock had to be cancelled due to contention, it wouldn&apos;t immediately lose the &quot;value&quot; that had been built up for that lock.&lt;/p&gt;</comment>
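<!-- Editor's note: the segmented-LRU idea in the comment above can be sketched
outside Lustre. This is a toy 2Q/SLRU cache in Python (all names hypothetical,
not Lustre code): a lock touched twice is promoted to a protected segment, so
a scan of use-once locks cannot flush it the way a strict LRU would.

```python
from collections import OrderedDict

class SLRUCache:
    """Tiny segmented LRU: use-once entries live (and die) in probation."""

    def __init__(self, probation_size, protected_size):
        self.probation = OrderedDict()   # seen once: evicted first
        self.protected = OrderedDict()   # seen again: survives scans
        self.probation_size = probation_size
        self.protected_size = protected_size

    def access(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)        # refresh recency
        elif key in self.probation:
            del self.probation[key]                # second hit: promote
            self.protected[key] = True
            if len(self.protected) > self.protected_size:
                # demote the coldest protected entry back to probation
                old, _ = self.protected.popitem(last=False)
                self.probation[old] = True
        else:
            self.probation[key] = True             # first hit: probation
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)     # evict use-once entries

    def cached(self, key):
        return key in self.probation or key in self.protected

cache = SLRUCache(probation_size=4, protected_size=4)
cache.access("parent-dir")      # first access: probation
cache.access("parent-dir")      # reused: promoted to protected
for i in range(100):            # a scan of use-once file locks
    cache.access(f"file-{i}")
print(cache.cached("parent-dir"))   # the reused lock survives the scan
```

With a strict LRU of the same total size, the 100-entry scan would have
evicted "parent-dir" long before it was needed again.
-->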
                            <comment id="294984" author="degremoa" created="Mon, 15 Mar 2021 10:14:47 +0000"  >&lt;p&gt;I can see the bad effects of a fixed-size &lt;tt&gt;lru_size&lt;/tt&gt;. That&apos;s why I&apos;m not using it; nowadays this value makes more sense as a read-only one.&lt;/p&gt;

&lt;p&gt;However, I may have misunderstood the behavior of lru_max_age. I&apos;m using it as a way to dynamically limit the number of locks on the client side, based on access pattern. I understood it as the time since the resource was last accessed. An FS scan can indeed bring a lot of locks onto a client, but I thought that every time a directory is accessed again it is considered &quot;young&quot; again and not evicted from cache. I&apos;m also using this to force clients to flush dirty cache more aggressively.&lt;/p&gt;</comment>
                            <comment id="295032" author="adilger" created="Mon, 15 Mar 2021 18:36:49 +0000"  >&lt;p&gt;You are correct: &lt;tt&gt;lru_max_age&lt;/tt&gt; is based on the idle time of a lock, which is refreshed on lock usage.  However, consider the competing needs of keeping an often-used lock in cache vs. flushing many use-once locks from cache.  If &lt;tt&gt;lru_max_age&lt;/tt&gt; is high, it helps locks that are used repeatedly but not continuously, yet the client may accumulate a large number of use-once locks before they age out.  If &lt;tt&gt;lru_max_age&lt;/tt&gt; is low, to keep use-once locks out of cache, then locks reused many times may also be cancelled if there is any gap in their usage.  Currently there is no frequency counter on locks, so all locks that hit &lt;tt&gt;lru_max_age&lt;/tt&gt; (or &lt;tt&gt;lru_size&lt;/tt&gt;, if set) will be cancelled regardless of how useful they are to the client.  That is what &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11509&quot; title=&quot;LDLM: replace lock LRU with improved cache algorithm&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11509&quot;&gt;LU-11509&lt;/a&gt; is about: improving the algorithm for flushing locks from the cache instead of using strict LRU.&lt;/p&gt;</comment>
                            <comment id="295085" author="degremoa" created="Tue, 16 Mar 2021 09:36:22 +0000"  >&lt;p&gt;I&apos;m not arguing against replacing the LRU algorithm with something else like ARC; I think that makes sense.&lt;/p&gt;

&lt;p&gt;My point was only this: 65 minutes as the default value is too big in my opinion, since it makes clients keep unused locks for that long. If we reduce it to, let&apos;s say, 10 minutes, the lock volume will be smaller and the lock callback traffic will be smaller (fewer lock conflicts, fewer evictions due to lock callbacks), at the price of increased lock enqueue traffic, which I suspect will be small. I&apos;m curious to know the p90 lock age on most clients.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="32543">LU-7266</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="29248">LU-6402</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="29728">LU-6529</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="53583">LU-11509</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="37939" name="lru-resize-dld.lyx" size="65250" author="adilger" created="Fri, 12 Mar 2021 21:06:23 +0000"/>
                            <attachment id="37940" name="lru-resize-hld.lyx" size="23645" author="adilger" created="Fri, 12 Mar 2021 21:06:45 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i01pcf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>