<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:25:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-9313] Soft lockup in ldlm_prepare_lru_list when at lock LRU limit</title>
                <link>https://jira.whamcloud.com/browse/LU-9313</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When we&apos;ve hit the LDLM lock LRU limit and are going into lock reclaim/cancellation (either because we set an explicit lru_size or because the server is limiting the client lock count), we sometimes see soft lockups on the namespace lock (ns_lock) in ldlm_prepare_lru_list, called from the ELC (early lock cancel) code.&lt;/p&gt;

&lt;p&gt;For example:&lt;br/&gt;
[995914.635458]  [&amp;lt;ffffffffa0c3278a&amp;gt;] ldlm_prepare_lru_list+0x1aa/0x500 [ptlrpc]&lt;br/&gt;
[995914.643442]  [&amp;lt;ffffffffa0c367a5&amp;gt;] ldlm_cancel_lru_local+0x15/0x40 [ptlrpc]&lt;br/&gt;
[995914.651232]  [&amp;lt;ffffffffa0c369dc&amp;gt;] ldlm_prep_elc_req+0x20c/0x480 [ptlrpc]&lt;br/&gt;
[995914.658828]  [&amp;lt;ffffffffa0c36c74&amp;gt;] ldlm_prep_enqueue_req+0x24/0x30 [ptlrpc]&lt;br/&gt;
[995914.666606]  [&amp;lt;ffffffffa0f7abe1&amp;gt;] osc_enqueue_base+0x1c1/0x6e0 [osc]&lt;br/&gt;
[995914.673796]  [&amp;lt;ffffffffa0f84147&amp;gt;] osc_lock_enqueue+0x357/0xa00 [osc]&lt;br/&gt;
[995914.681002]  [&amp;lt;ffffffffa09d8813&amp;gt;] cl_lock_enqueue+0x63/0x120 [obdclass]&lt;br/&gt;
[995914.688511]  [&amp;lt;ffffffffa0dd6ecc&amp;gt;] lov_lock_enqueue+0x9c/0x170 [lov]&lt;br/&gt;
[995914.695616]  [&amp;lt;ffffffffa09d8813&amp;gt;] cl_lock_enqueue+0x63/0x120 [obdclass]&lt;br/&gt;
[995914.703133]  [&amp;lt;ffffffffa09d8d62&amp;gt;] cl_lock_request+0x62/0x1e0 [obdclass]&lt;br/&gt;
[995914.710649]  [&amp;lt;ffffffffa0edf587&amp;gt;] cl_glimpse_lock+0x337/0x3d0 [lustre]&lt;br/&gt;
[995914.718057]  [&amp;lt;ffffffffa0edf8e7&amp;gt;] cl_glimpse_size0+0x1b7/0x1c0 [lustre]&lt;br/&gt;
[995914.725562]  [&amp;lt;ffffffffa0edac65&amp;gt;] ll_agl_trigger+0x115/0x4a0 [lustre]&lt;br/&gt;
[995914.732871]  [&amp;lt;ffffffffa0edb14d&amp;gt;] ll_agl_thread+0x15d/0x4b0 [lustre]&lt;br/&gt;
[995914.740075]  [&amp;lt;ffffffff81077874&amp;gt;] kthread+0xb4/0xc0&lt;br/&gt;
[995914.745610]  [&amp;lt;ffffffff81523498&amp;gt;] ret_from_fork+0x58/0x90&lt;/p&gt;

&lt;p&gt;The contention here is easy to reproduce: create a few directories, each holding a large number of small files (~100,000 per directory worked for me), then start several ls processes, for example by running the following a few times:&lt;br/&gt;
ls -laR * &amp;gt; /dev/null &amp;amp;&lt;/p&gt;

&lt;p&gt;(It is helpful if all files are on the same OST.)&lt;/p&gt;

&lt;p&gt;When the LRU limit is hit (this is easiest to see by setting lru_size manually), contention on the namespace lock from the ELC code becomes severe.  Even if soft lockups do not occur, a quick perf record shows most of the time being spent on this lock.&lt;/p&gt;

&lt;p&gt;This badly impacts the performance of the ls processes as well.&lt;/p&gt;

&lt;p&gt;My proposed solution is to limit ELC to one process per namespace.  In Cray testing, this solves the problem nicely while still letting ELC function.&lt;/p&gt;</description>
                <environment></environment>
        <key id="45390">LU-9313</key>
            <summary>Soft lockup in ldlm_prepare_lru_list when at lock LRU limit</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="paf">Patrick Farrell</assignee>
                                    <reporter username="paf">Patrick Farrell</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Mon, 10 Apr 2017 19:09:18 +0000</created>
                <updated>Thu, 22 Nov 2018 18:02:35 +0000</updated>
                            <resolved>Thu, 22 Nov 2018 18:02:35 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="191415" author="gerrit" created="Mon, 10 Apr 2017 19:16:36 +0000"  >&lt;p&gt;Patrick Farrell (paf@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/26477&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26477&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9313&quot; title=&quot;Soft lockup in ldlm_prepare_lru_list when at lock LRU limit&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9313&quot;&gt;&lt;del&gt;LU-9313&lt;/del&gt;&lt;/a&gt; ldlm: Limit elc to one thread per namespace&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 32788dd7191935a3315afbc43865e3dfd2403c8e&lt;/p&gt;</comment>
                            <comment id="237395" author="adilger" created="Thu, 22 Nov 2018 18:02:35 +0000"  >&lt;p&gt;Patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9230&quot; title=&quot;soft lockup on v2.9 Lustre clients (ldlm?)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9230&quot;&gt;&lt;del&gt;LU-9230&lt;/del&gt;&lt;/a&gt; has resolved this issue.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="44868">LU-9230</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzz9r3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>