<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:58:35 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6250] slow down of processing, cache related</title>
                <link>https://jira.whamcloud.com/browse/LU-6250</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;On our compute cluster, a number of our standard processes are occasionally much slower than normal (taking &amp;gt;90s instead of the usual &amp;lt;10s). &lt;br/&gt;
The affected processes read and write files on Lustre; we have never seen them slow down this much on other file systems. While they are slow, all the CPU cycles are spent in system time.&lt;/p&gt;

&lt;p&gt;If the processes are slow on a particular node, they usually stay slow on that node until we intervene manually. Currently the manual intervention involves dropping all caches, which takes &amp;gt;90s on our nodes with 24GB of memory. Reducing the maximum amount that Lustre is allowed to cache, to ensure there is always free memory, doesn&apos;t improve the situation at all.&lt;/p&gt;

&lt;p&gt;We have noticed &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1784&quot; title=&quot;freeing cached clean pages is slow&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1784&quot;&gt;LU-1784&lt;/a&gt;, which might be related. The ticket seems to suggest that nothing has changed there recently; is this correct?&lt;/p&gt;

&lt;p&gt;We have had quite a bit of success in making the application slow by copying many files and many GB of data from Lustre to Lustre until the swap is (nearly) full (as seen in /proc/fs/lustre/llite/*/max_cached_mb), but even this isn&apos;t always an indicator.&lt;/p&gt;

&lt;p&gt;We are still working on a suitable reproducer, or at least a test case, that we can share. All we currently have involves custom software which we can&apos;t distribute.&lt;/p&gt;

&lt;p&gt;While we are looking for a reproducer, we&apos;d also be interested to hear whether there is anything else we should watch for on the Lustre side, any profiling we should do, or any other data we should gather.&lt;/p&gt;</description>
                <environment></environment>
        <key id="28712">LU-6250</key>
            <summary>slow down of processing, cache related</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="ferner">Frederik Ferner</reporter>
                        <labels>
                    </labels>
                <created>Mon, 16 Feb 2015 16:00:54 +0000</created>
                <updated>Tue, 7 Jun 2016 15:38:27 +0000</updated>
                                            <version>Lustre 2.5.2</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="107084" author="pjones" created="Mon, 16 Feb 2015 16:21:49 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please advise on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="107134" author="adilger" created="Tue, 17 Feb 2015 18:18:15 +0000"  >&lt;p&gt;Frederik, you wrote:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We have had quite a bit of success in making the application slow by copying many files and many GB of data from Lustre to Lustre until the swap is (nearly) full (as seen in /proc/fs/lustre/llite/*/max_cached_mb), but even this isn&apos;t always an indicator.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Do you actually mean &quot;&lt;em&gt;swap&lt;/em&gt; is (nearly) full&quot; or &quot;&lt;em&gt;cache&lt;/em&gt; is (nearly) full&quot;?  The kernel shouldn&apos;t swap out any pages from data files to the swap device, only pages with allocated variables from user executables.  Kernel-internal memory cannot be swapped out either.&lt;/p&gt;

&lt;p&gt;Could you please add the contents of &lt;tt&gt;/proc/slabinfo&lt;/tt&gt; and &lt;tt&gt;/proc/meminfo&lt;/tt&gt; to the bug so we can see where the memory is being used?&lt;/p&gt;</comment>
                            <comment id="107164" author="ferner" created="Tue, 17 Feb 2015 19:44:29 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;thanks for catching my mistake. I did mean &quot;&lt;b&gt;cache&lt;/b&gt; is (nearly) full&quot;. These nodes use very little swap, if any.&lt;/p&gt;

&lt;p&gt;I&apos;ll add &lt;tt&gt;/proc/slabinfo&lt;/tt&gt; and &lt;tt&gt;/proc/meminfo&lt;/tt&gt; from the slow state the next time we manage to reproduce it.&lt;/p&gt;
</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 17 Feb 2015 16:00:54 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx6ef:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>17503</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 16 Feb 2015 16:00:54 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>