<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:49:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5179] Reading files from lustre results in stuck anonymous memory when JOBID is enabled</title>
                <link>https://jira.whamcloud.com/browse/LU-5179</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We have been seeing our SLES11SP2 and SLES11SP3 clients have stuck anonymous memory that cannot be cleared up without a reboot.  We have three test cases which can replicate the problem reliably.  We have been able to replicate the problem on different clients on all of our lustre file systems.  We have not been able to reproduce the problem when using NFS, ext3, CXFS, or tmpfs.&lt;/p&gt;

&lt;p&gt;We have been working with SGI on tracking down this problem.  Unfortunately, they have been unable to reproduce the problem on their systems.  On our systems, they have simplified the test case to mmaping a file along with an equally sized anonymous region, and reading the contents of the mmaped file into the anonymous mmaped region.  This test case can be provided to see if you can reproduce this problem.&lt;/p&gt;

&lt;p&gt;To determine if the problem is occurring, reboot the system to ensure that memory is clean.  Check /proc/meminfo for the amount of Active(anon) memory being used.  Run the test case.  During the test case, the amount of anonymous memory will increase.  At the end of the test case, the amount would be expected to drop back to pre-test levels.&lt;/p&gt;

&lt;p&gt;To confirm that the anonymous memory is stuck, we have been using memhog to attempt to allocate memory.  If the node has 32GB of memory, with 2GB of anonymous memory used, we attempt to allocate 31GB of memory.  If memhog completes and you then have only 1GB of anonymous memory, you have not reproduced the problem.  If memhog is killed, you have reproduced it.&lt;/p&gt;

&lt;p&gt;SGI would like to get information about how to get debug information to track down this problem.&lt;/p&gt;</description>
                <environment>Clients:&lt;br/&gt;
Endeavour:  2.4.3, ldan: 2.4.1, Pleiades compute nodes: 2.1.5 or 2.4.1&lt;br/&gt;
Servers:&lt;br/&gt;
2.1.5, 2.4.1, 2.4.3</environment>
        <key id="25116">LU-5179</key>
            <summary>Reading files from lustre results in stuck anonymous memory when JOBID is enabled</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="hyeung">Herbert Yeung</reporter>
                        <labels>
                            <label>HB</label>
                    </labels>
                <created>Thu, 12 Jun 2014 01:57:19 +0000</created>
                <updated>Tue, 12 Aug 2014 19:46:40 +0000</updated>
                            <resolved>Fri, 20 Jun 2014 18:58:00 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                    <version>Lustre 2.4.3</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                    <fixVersion>Lustre 2.5.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="86401" author="green" created="Thu, 12 Jun 2014 02:38:12 +0000"  >&lt;p&gt;Please upload your testcase.&lt;/p&gt;

&lt;p&gt;You list multiple client node versions, is this observed in all of them?&lt;/p&gt;</comment>
                            <comment id="86445" author="hyeung" created="Thu, 12 Jun 2014 17:54:28 +0000"  >&lt;p&gt;Yes, we have been able to reproduce the problem on all of the clients.  Though I am not positive that we have tested every permutation, I believe the problem has been reproduced with all of the Lustre versions on all of the clients.&lt;/p&gt;</comment>
                            <comment id="86484" author="green" created="Thu, 12 Jun 2014 21:22:52 +0000"  >&lt;p&gt;From the call: This was also reproduced on a rhel6.4 kernel with the same testcase.&lt;br/&gt;
Unmounting the fs either fails (fs busy) or, when it succeeds, does not free the memory either.&lt;/p&gt;

&lt;p&gt;The reproducer only works on NASA systems and it&apos;s not 100% reliable, but it still reproduces at a high frequency.&lt;/p&gt;

&lt;p&gt;This originally was investigated from strange OOM issues.&lt;/p&gt;</comment>
                            <comment id="86514" author="hyeung" created="Fri, 13 Jun 2014 01:56:51 +0000"  >&lt;p&gt;Second test case that reproduces the problem.&lt;/p&gt;</comment>
                            <comment id="86516" author="hyeung" created="Fri, 13 Jun 2014 02:16:11 +0000"  >&lt;p&gt;Normally, we run the test script through PBS.  To do so, you can use My_run.ivy.  If you want to run interactively or without PBS, use My_run_I_ivy.  Sample output from PBS is at kdgordon.o328785, though some of the data has been sanitized.&lt;/p&gt;

&lt;p&gt;The script prints out a variety of information, including the amount of anonymous memory being used by the system.  The runit script is called which produces the stuck anonymous memory.  After that, the amount of used anonymous memory is checked. memhog is called to try to clear up the memory and the amount of anonymous memory is displayed again.&lt;/p&gt;

&lt;p&gt;You will need to tweak the amount of memory that memhog attempts to allocate based on how much memory your test system has.  Generally, about 800MB of memory gets stuck after running the job.  runit can be called several times to increase the amount of stuck memory.&lt;/p&gt;

&lt;p&gt;After running the test, the file checkpoint.mtcp is created.  On some of our systems, the file may need to be deleted before being able to reproduce the problem again.&lt;/p&gt;

&lt;p&gt;After running memhog, the memory can go to swap instead and remain in use there.  This still indicates that the memory is not being freed.&lt;/p&gt;</comment>
                            <comment id="86631" author="green" created="Fri, 13 Jun 2014 23:36:56 +0000"  >&lt;p&gt;I would just like to note that there&apos;s no source to the timeout and test_ckpt so it&apos;s kind of hard to see what they are doing.&lt;/p&gt;</comment>
                            <comment id="86720" author="hyeung" created="Mon, 16 Jun 2014 19:02:45 +0000"  >&lt;p&gt;Source for test_ckpt uploaded.  timeout comes from coreutils-8.22&lt;/p&gt;</comment>
                            <comment id="87066" author="jay" created="Thu, 19 Jun 2014 17:57:22 +0000"  >&lt;p&gt;can you show me the output of /proc/meminfo when you see this problem? &lt;/p&gt;</comment>
                            <comment id="87073" author="jaylan" created="Thu, 19 Jun 2014 18:19:55 +0000"  >&lt;p&gt;Below is an example of a system stuck with this memory.  The system has 4TB of memory, with 1.5TB stuck in Active(anon) that cannot be released.  There are 126 nodes in that system, and applications request a number of nodes for their testing.  After the memory leak, those nodes would not have enough memory for other jobs.  The application would fail, and resubmission of the job would then fail to start because the requested nodes did not have enough memory.&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;cat /proc/meminfo&lt;br/&gt;
MemTotal:       4036524872 kB&lt;br/&gt;
MemFree:        2399583516 kB&lt;br/&gt;
Buffers:          243504 kB&lt;br/&gt;
Cached:          5204560 kB&lt;br/&gt;
SwapCached:       678520 kB&lt;br/&gt;
Active:         1544908812 kB&lt;br/&gt;
Inactive:       56619188 kB&lt;br/&gt;
Active(anon):   1543105636 kB&lt;br/&gt;
Inactive(anon): 53018772 kB&lt;br/&gt;
Active(file):    1803176 kB&lt;br/&gt;
Inactive(file):  3600416 kB&lt;br/&gt;
Unevictable:           0 kB&lt;br/&gt;
Mlocked:               0 kB&lt;br/&gt;
SwapTotal:      10239996 kB&lt;br/&gt;
SwapFree:              0 kB&lt;br/&gt;
Dirty:            554504 kB&lt;br/&gt;
Writeback:         26128 kB&lt;br/&gt;
AnonPages:      1595359296 kB&lt;br/&gt;
Mapped:           143708 kB&lt;br/&gt;
Shmem:             98772 kB&lt;br/&gt;
Slab:           11485844 kB&lt;br/&gt;
SReclaimable:     161660 kB&lt;br/&gt;
SUnreclaim:     11324184 kB&lt;br/&gt;
KernelStack:       87016 kB&lt;br/&gt;
PageTables:      6747560 kB&lt;br/&gt;
NFS_Unstable:      31856 kB&lt;br/&gt;
Bounce:                0 kB&lt;br/&gt;
WritebackTmp:          0 kB&lt;br/&gt;
CommitLimit:    2027403680 kB&lt;br/&gt;
Committed_AS:   1262704572 kB&lt;br/&gt;
VmallocTotal:   34359738367 kB&lt;br/&gt;
VmallocUsed:    18824052 kB&lt;br/&gt;
VmallocChunk:   25954600480 kB&lt;br/&gt;
HardwareCorrupted:     0 kB&lt;br/&gt;
AnonHugePages:  1264787456 kB&lt;br/&gt;
HugePages_Total:    1073&lt;br/&gt;
HugePages_Free:      468&lt;br/&gt;
HugePages_Rsvd:      468&lt;br/&gt;
HugePages_Surp:     1073&lt;br/&gt;
Hugepagesize:       2048 kB&lt;br/&gt;
DirectMap4k:      335872 kB&lt;br/&gt;
DirectMap2M:    134963200 kB&lt;br/&gt;
DirectMap1G:    3958374400 kB&lt;/li&gt;
&lt;/ol&gt;
</comment>
                            <comment id="87082" author="jay" created="Thu, 19 Jun 2014 20:02:41 +0000"  >&lt;p&gt;just to narrow down the problem, will you please try it again with huge page disabled?&lt;/p&gt;

&lt;p&gt;I really meant to say transparent huge page.&lt;/p&gt;</comment>
                            <comment id="87087" author="yobbo" created="Thu, 19 Jun 2014 20:14:31 +0000"  >&lt;p&gt;ldan2 and ldan3 below are fairly ordinary self-contained systems running&lt;br/&gt;
Lustre 2.4.1-6nas_ofed154 client.  This problem has been reproduced on several versions of the NAS Lustre client and server software.&lt;/p&gt;

&lt;p&gt;Log into ldan2.   (I&apos;ve mostly used a qsub session, but I have reproduced the problem outside of PBS)&lt;/p&gt;

&lt;p&gt;Log into ldan3.  ( I have special permission to log into an ldan I don&apos;t&lt;br/&gt;
                have a PBS job running on)&lt;/p&gt;

&lt;p&gt;On both systems cd to the test (lustre) directory&lt;br/&gt;
In this directory exist the following:&lt;br/&gt;
1. a copy of hedi&apos;s test; I&apos;ve tested with the first one he wrote and a fairly&lt;br/&gt;
late version (mmap4.c).  The later version (attached) is more flexible.&lt;/p&gt;

&lt;p&gt;2. a 1g file created with:&lt;br/&gt;
dd if=/dev/zero of=1g bs=4096 count=262144&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;set up your favorite anonymous memory monitor on ldan2&lt;/li&gt;
	&lt;li&gt;I&apos;ve been using nodeinfo in a separate window&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;On ldan3 run:&lt;br/&gt;
dd count=1 bs=1 conv=notrunc of=1g if=/dev/zero&lt;/p&gt;

&lt;p&gt;On ldan2 run:&lt;br/&gt;
./mmap 1g&lt;/p&gt;

&lt;p&gt;After the mmap program terminates, notice that the anonymous memory&lt;br/&gt;
used by mmap remains in memory.   I&apos;ve never been able to force &lt;br/&gt;
persistent Anonymous memory out of the greater Linux virtual memory&lt;br/&gt;
system (memory + swap). Anonymous memory swapped to disk is not&lt;br/&gt;
accounted for as &quot;Anonymous Memory&quot;, but it is accounted for as swap.&lt;/p&gt;

&lt;p&gt;The problem does not reproduce if the dd &quot;interference&quot; is not run.&lt;br/&gt;
The problem does not reproduce if the dd &quot;interference&quot; is run and then&lt;br/&gt;
lflush is run, or the file is read (w/o mmap) from a third system.  The problem does not reproduce if the dd &quot;interference&quot; is run on ldan2, then the mmap test is run on ldan2.&lt;/p&gt;</comment>
                            <comment id="87090" author="jay" created="Thu, 19 Jun 2014 20:21:58 +0000"  >&lt;p&gt;we may have found the problem, Oleg will push a patch soon.&lt;/p&gt;</comment>
                            <comment id="87092" author="green" created="Thu, 19 Jun 2014 20:24:09 +0000"  >&lt;p&gt;I did an audit of all mmput calls in the code; there are only two functions that use it.&lt;br/&gt;
One of them has a mm_struct leak that would introduce symptoms very similar to what you see.&lt;/p&gt;

&lt;p&gt;My proposed patch is at &lt;a href=&quot;http://review.whamcloud.com/10759&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10759&lt;/a&gt;; please give it a try.&lt;/p&gt;</comment>
                            <comment id="87096" author="green" created="Thu, 19 Jun 2014 20:39:17 +0000"  >&lt;p&gt;Also, some additional info:&lt;br/&gt;
For the problem to manifest, you must have the jobid functionality enabled in the &quot;read env variable&quot; mode.  Once I enabled that, I could immediately reproduce the problem, so I tested the patch and it fixes the problem for me.&lt;/p&gt;</comment>
                            <comment id="87098" author="jaylan" created="Thu, 19 Jun 2014 20:45:00 +0000"  >&lt;p&gt;The patch affects both server and client. If I only update the client, would it solve the problem at the client side?&lt;/p&gt;</comment>
                            <comment id="87099" author="jay" created="Thu, 19 Jun 2014 20:47:46 +0000"  >&lt;p&gt;yes, applying it on the client side only will fix the problem.&lt;/p&gt;</comment>
                            <comment id="87105" author="yobbo" created="Thu, 19 Jun 2014 23:06:49 +0000"  >&lt;p&gt;Jay built a copy of the client for the ldan test system.  Initial testing is positive, this fixes the test cases reported in this LU on the type of system I am using to test.&lt;/p&gt;</comment>
                            <comment id="87198" author="jlevi" created="Fri, 20 Jun 2014 18:58:00 +0000"  >&lt;p&gt;Patch landed to Master. Please reopen ticket if more work is needed.&lt;/p&gt;</comment>
                            <comment id="87673" author="spimpale" created="Fri, 27 Jun 2014 10:38:24 +0000"  >&lt;p&gt;b2_4 backport: &lt;a href=&quot;http://review.whamcloud.com/10868&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10868&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="15148" name="T2.tgz" size="207" author="hyeung" created="Fri, 13 Jun 2014 01:56:51 +0000"/>
                            <attachment id="15145" name="mmap.c" size="1843" author="hyeung" created="Thu, 12 Jun 2014 17:53:28 +0000"/>
                            <attachment id="15206" name="mmap4.c" size="1846" author="yobbo" created="Thu, 19 Jun 2014 20:14:31 +0000"/>
                            <attachment id="15154" name="test_ckpt.cpp" size="1190" author="hyeung" created="Mon, 16 Jun 2014 19:02:12 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwo8n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14374</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>