<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:35:00 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17384] OOMkiller invoked on lustre OSS nodes under IOR</title>
                <link>https://jira.whamcloud.com/browse/LU-17384</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;During SWL for TOSS 4.6-6rc3 and also 4.7-2rc2, we found that an IOR run could trigger an OOM on an OSS node.&lt;/p&gt;

&lt;p&gt;We were able to reproduce this issue using IOR under srun.&lt;/p&gt;
&lt;h3&gt;The following srun/ior command was used:&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;srun -N 70 -n 7840 /g/g0/carbonne/ior/src/ior -a MPIIO -i 5 -b 256MB -t 128MB -v -g -F -C -w -W -r -o /p/lflood/carbonne/oomtest/ior_1532/ior
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Example at 2023-10-17 12:31:28 on garter5, see console log.&lt;/p&gt;

&lt;p&gt;Mem-info from one oom-killer console log message set is:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Mem-Info:
active_anon:22868 inactive_anon:69168 isolated_anon:0
&#160;active_file:357 inactive_file:770 isolated_file:250
&#160;unevictable:10785 dirty:0 writeback:0
&#160;slab_reclaimable:185039 slab_unreclaimable:2082954
&#160;mapped:12536 shmem:46663 pagetables:2485 bounce:0
&#160;free:134668 free_pcp:203 free_cma:0

Node 0 active_anon:75888kB inactive_anon:87304kB active_file:1840kB
 inactive_file:1464kB  unevictable:43080kB isolated(anon):0kB
 isolated(file):208kB mapped:19680kB dirty:0kB writeback:0kB
 shmem:127712kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 26624kB
 writeback_tmp:0kB kernel_stack:31416kB pagetables:3896kB
 all_unreclaimable? no

Node 0 DMA free:11264kB min:4kB low:16kB high:28kB active_anon:0kB
 inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
 writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB
 free_pcp:0kB local_pcp:0kB free_cma:0kB
 lowmem_reserve[]: 0 1183 94839 94839 94839

Node 0 DMA32 free:375156kB min:556kB low:1764kB high:2972kB
 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:4kB
 unevictable:0kB writepending:0kB present:1723228kB managed:1325704kB
 mlocked:0kB bounce:0kB free_pcp:260kB local_pcp:0kB free_cma:0kB
 lowmem_reserve[]: 0 0 93655 93655 93655

Node 0 Normal free:46072kB min:44044kB low:139944kB high:235844kB
 active_anon:75888kB inactive_anon:87304kB active_file:1860kB
 inactive_file:1584kB unevictable:43080kB writepending:0kB
 present:97517568kB managed:95912024kB mlocked:43080kB bounce:0kB
 free_pcp:372kB local_pcp:0kB free_cma:0kB
 lowmem_reserve[]: 0 0 0 0 0

Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB

Node 0 DMA32: 3*4kB (M) 66*8kB (UM) 202*16kB (UM) 152*32kB (UM)
 168*64kB (UM) 85*128kB (UM) 24*256kB (UM) 20*512kB (UM) 11*1024kB (UM)
 7*2048kB (UM) 74*4096kB (UM) = 375356kB

Node 0 Normal: 151*4kB (MEH) 853*8kB (UMEH) 640*16kB (MEH) 412*32kB (MEH)
 132*64kB (ME) 33*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43524kB

Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB

Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB

Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

53515 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap&#160; = 0kB
Total swap = 0kB
49980022 pages RAM
0 pages HighMem/MovableOnly
896433 pages reserved
0 pages hwpoisoned
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;=============================================================&lt;/p&gt;

&lt;p&gt;local Jira ticket:&#160; &lt;a href=&quot;https://lc.llnl.gov/jira/browse/TOSS-6158&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;TOSS-6158&lt;/a&gt;&lt;/p&gt;</description>
                <environment>Clients: Lustre 2.12&lt;br/&gt;
Servers: Lustre 2.14 and 2.15 (both tested and reproduced)&lt;br/&gt;
TOSS: 4.6-6 and 4.7-2 (both reproduced)</environment>
        <key id="79713">LU-17384</key>
            <summary>OOMkiller invoked on lustre OSS nodes under IOR</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="carbonneau">Eric Carbonneau</reporter>
                        <labels>
                            <label>llnl</label>
                            <label>topllnl</label>
                    </labels>
                <created>Fri, 22 Dec 2023 17:14:30 +0000</created>
                <updated>Fri, 9 Feb 2024 00:03:07 +0000</updated>
                                            <version>Lustre 2.14.0</version>
                    <version>Lustre 2.15.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="397937" author="pjones" created="Fri, 22 Dec 2023 19:24:42 +0000"  >&lt;p&gt;Eric&lt;/p&gt;

&lt;p&gt;There have been some recent changes merged to master for the upcoming 2.16 release that we think could well help address this problem. Could you please retry your reproducer against a master client? If that does indeed resolve the issue, then we can look at what would need to be backported to b2_15 in order to get the same benefit there.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="397953" author="ofaaland" created="Fri, 22 Dec 2023 20:06:31 +0000"  >&lt;p&gt;Thanks, Peter.  In our case, when we reproduced this, the client was 2.12 and the server was 2.14 or 2.15.&lt;/p&gt;

&lt;p&gt;Are you saying client patches might fix this?  We&apos;re happy to test master clients, but I would think the server should be managing its memory usage without depending on the client to behave in a certain way.&lt;/p&gt;</comment>
                            <comment id="397961" author="JIRAUSER18802" created="Fri, 22 Dec 2023 21:51:38 +0000"  >&lt;p&gt;I forgot to mention the issue occurs during read operations on the OSS. During the write operations the OSS memory was consnt.&lt;/p&gt;

</comment>
                            <comment id="397965" author="adilger" created="Sat, 23 Dec 2023 00:16:56 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=carbonneau&quot; class=&quot;user-hover&quot; rel=&quot;carbonneau&quot;&gt;carbonneau&lt;/a&gt;, &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=ofaaland&quot; class=&quot;user-hover&quot; rel=&quot;ofaaland&quot;&gt;ofaaland&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;it would be useful to include the actual stack traces from the OSS when the OOM is hit, not just the meminfo.&#160; Otherwise it is difficult to know what is actually allocating the memory.&#160; Sometimes it is just an innocent bystander process, but in many cases the actual offender is caught because it is the one allocating memory the most frequently...&lt;/p&gt;</comment>
                            <comment id="397966" author="adilger" created="Sat, 23 Dec 2023 00:35:55 +0000"  >&lt;p&gt;Originally I thought this was related to cgroups, which is a client side issue, but I didn&apos;t notice the &quot;OSS&quot; in the summary.&lt;/p&gt;

&lt;p&gt;The majority of memory usage looks to be in &quot;&lt;tt&gt;slab_reclaimable:185039 slab_unreclaimable:2082954&lt;/tt&gt;&quot; or at least I can&apos;t see anything else reported in the meminfo dump.&#160; Are you able to capture &lt;tt&gt;/proc/slabinfo&lt;/tt&gt; or &lt;tt&gt;slabtop&lt;/tt&gt; from the OSS while the IOR is running, and see what is using the majority of memory?&lt;/p&gt;
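
&lt;p&gt;One way to capture that on the OSS while the IOR is running (a minimal sketch; the 30-second interval and output path are arbitrary choices, not taken from this ticket):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# sample the largest slab caches every 30 seconds while the IOR job runs
while true; do
    date &gt;&gt; /tmp/slabtop.log
    # -o prints once and exits; -s c sorts by cache size
    slabtop -o -s c | head -n 20 &gt;&gt; /tmp/slabtop.log
    sleep 30
done
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;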

&lt;p&gt;This &lt;em&gt;might&lt;/em&gt; relate to the use of deferred fput on the server; deferred file descriptors can accumulate over time if the server has been running for a long time.&#160;There were two recent patches related to this that landed on master, but these may only be relevant for osd-ldiskfs and not osd-zfs (which I assume is the case here):&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;a href=&quot;https://review.whamcloud.com/51731&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/51731&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16973&quot; title=&quot;Busy device after successful umount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16973&quot;&gt;&lt;del&gt;LU-16973&lt;/del&gt;&lt;/a&gt; osd: adds SB_KERNMOUNT flag&lt;/tt&gt;&quot;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://review.whamcloud.com/51805&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/51805&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16973&quot; title=&quot;Busy device after successful umount&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16973&quot;&gt;&lt;del&gt;LU-16973&lt;/del&gt;&lt;/a&gt; ptlrpc: flush delayed file desc if idle&lt;/tt&gt;&quot;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="403316" author="JIRAUSER18802" created="Fri, 9 Feb 2024 00:03:07 +0000"  >&lt;p&gt;We&apos;ve done more testing and gathered more information for your review:&lt;/p&gt;

&lt;p&gt;To start, the versions of ZFS and Lustre required to reproduce the OOM:&lt;/p&gt;

&lt;p&gt;ZFS version: 2.1.14_1llnl-1&lt;br/&gt;
Lustre version: lustre-2.15.4_1.llnl&lt;br/&gt;
Total RAM available: 187 GiB&lt;/p&gt;

&lt;p&gt;FIRST RUN:&lt;/p&gt;

&lt;p&gt;zfs_arc_max was left at its default value of 0.&lt;/p&gt;

&lt;p&gt;I also booted the kernel with slab_nomerge to pinpoint the culprit slab, if any.&lt;br/&gt;
No culprit slab was found, as nothing jumped to the top of slabtop.&lt;br/&gt;
But we also checked arcstat to see what was going on in the ARC during testing.&lt;/p&gt;
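
&lt;p&gt;For reference, a sketch of one way to add slab_nomerge to the kernel command line (grubby is an assumption for a TOSS/RHEL-style system, not something stated in this ticket):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# disable slab cache merging so each cache is reported separately in slabtop
grubby --update-kernel=ALL --args=&quot;slab_nomerge&quot;
# reboot the OSS afterwards for the option to take effect
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;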

&lt;p&gt;Command used: &lt;tt&gt;arcstat 1&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt; time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  size     c  avail&lt;br/&gt;
11:29:21   19K  6.8K     34   569    4  6.3K   99     0    0   93G   93G    25G&lt;br/&gt;
11:29:22   19K  6.9K     34   550    4  6.3K   99     0    0   93G   93G    25G&lt;br/&gt;
11:29:23   20K  7.0K     34   546    3  6.5K   99     0    0   93G   93G    25G&lt;br/&gt;
11:29:24   20K  7.2K     34   600    4  6.6K   99     0    0   93G   93G    25G&lt;br/&gt;
11:29:25   21K  7.5K     35   633    4  6.9K   99     0    0   94G   93G    24G&lt;br/&gt;
11:29:26   21K  7.6K     34   602    4  7.0K   99     0    0   93G   93G    25G&lt;br/&gt;
11:29:27   21K  7.8K     36   633    4  7.1K   99     0    0   94G   93G    25G&lt;br/&gt;
11:29:28   22K  8.0K     35   616    4  7.4K   99     0    0   94G   93G    24G&lt;br/&gt;
11:29:29   23K  9.0K     38  1.2K    7  7.8K   99   503    3   94G   93G    24G&lt;br/&gt;
11:29:30   22K   10K     45  2.7K   17  7.8K   99  2.0K   13   95G   93G    23G&lt;br/&gt;
11:29:31   23K   10K     46  2.7K   17  8.2K  100  2.0K   13   97G   93G    21G&lt;br/&gt;
11:29:32   24K   11K     46  2.8K   17  8.3K  100  2.1K   13   99G   93G    20G&lt;br/&gt;
11:29:33   24K   11K     46  2.8K   17  8.8K  100  2.1K   13  101G   93G    18G&lt;br/&gt;
11:29:34   24K   11K     46  2.7K   17  8.7K  100  2.0K   13  103G   93G    16G&lt;br/&gt;
11:29:35   26K   12K     47  2.9K   17  9.5K  100  2.1K   13  105G   93G    13G&lt;br/&gt;
11:29:36   26K   12K     47  2.8K   16  9.6K  100  2.0K   12  108G   93G    11G&lt;br/&gt;
11:29:37   26K   12K     47  2.8K   16  9.7K  100  2.0K   12  110G   93G   8.7G&lt;br/&gt;
11:29:38   27K   13K     47  2.8K   16   10K  100  2.0K   12  113G   93G   5.9G&lt;br/&gt;
11:29:39   27K   13K     47  2.9K   16   10K  100  2.0K   12  116G   93G   3.1G&lt;br/&gt;
11:29:40   27K   13K     48  2.7K   16   10K  100  1.9K   12  118G   93G    10M&lt;br/&gt;
11:29:41   41K   15K     37  5.7K   17  9.9K  100  4.9K   15  121G   93G  -3.6G&lt;/p&gt;

&lt;p&gt;At that point we were OOMed.&lt;br/&gt;
--------------------------------------------------------------------------------------------------------------------------&lt;/p&gt;

&lt;p&gt;SECOND RUN:&lt;/p&gt;

&lt;p&gt;For the second run we set zfs_arc_max to 47 GiB.&lt;/p&gt;
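
&lt;p&gt;For reference, a sketch of one way to set that limit at runtime via the standard OpenZFS module parameter (how the value was actually set on this node is not recorded here):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cap the ARC at 47 GiB; zfs_arc_max takes a value in bytes
echo $((47 * 1024 * 1024 * 1024)) &gt; /sys/module/zfs/parameters/zfs_arc_max
# confirm the new cap
cat /sys/module/zfs/parameters/zfs_arc_max
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;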

&lt;p&gt;Continuing to monitor arcstat, we can see the ARC size go right through the 47 GiB limit:&lt;/p&gt;

&lt;p&gt;arcstat 1:&lt;/p&gt;

&lt;p&gt;    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  size     c  avail&lt;br/&gt;
11:57:19   19K  9.0K     47  3.1K   23  5.9K  100  2.5K   20   76G   47G    41G&lt;br/&gt;
11:57:20   19K  9.1K     47  3.1K   23  6.1K  100  2.5K   19   77G   47G    40G&lt;br/&gt;
11:57:21   20K  9.4K     47  3.0K   22  6.4K  100  2.4K   18   78G   47G    39G&lt;br/&gt;
11:57:22   20K  9.6K     47  3.1K   22  6.5K  100  2.5K   18   79G   47G    38G&lt;br/&gt;
11:57:23   21K  10.0K     47  3.2K   22  6.8K  100  2.5K   18   80G   47G    37G&lt;br/&gt;
11:57:24   21K  10.0K     47  3.1K   21  6.9K  100  2.4K   17   81G   47G    36G&lt;br/&gt;
11:57:25   21K   10K     47  3.2K   21  7.2K  100  2.5K   18   82G   47G    35G&lt;br/&gt;
11:57:26   22K   10K     47  3.2K   21  7.4K  100  2.5K   17   84G   47G    34G&lt;br/&gt;
11:57:27   22K   10K     47  3.0K   20  7.8K  100  2.4K   16   85G   47G    32G&lt;br/&gt;
11:57:28   23K   11K     47  3.2K   20  8.0K  100  2.5K   16   87G   47G    30G&lt;br/&gt;
11:57:29   23K   11K     47  3.1K   19  8.3K  100  2.4K   16   89G   47G    29G&lt;br/&gt;
11:57:30   24K   11K     48  3.2K   20  8.7K  100  2.5K   16   91G   47G    27G&lt;br/&gt;
11:57:31   24K   12K     48  3.2K   19  8.9K  100  2.5K   16   93G   47G    24G&lt;br/&gt;
11:57:32   25K   12K     48  3.1K   18  9.4K  100  2.4K   15   95G   47G    22G&lt;br/&gt;
11:57:33   26K   12K     48  3.2K   19  9.6K  100  2.4K   15   98G   47G    20G&lt;br/&gt;
11:57:34   26K   13K     49  3.2K   19  9.9K  100  2.5K   15  100G   47G    17G&lt;br/&gt;
11:57:35   27K   13K     49  3.3K   18   10K  100  2.5K   15  103G   47G    14G&lt;br/&gt;
11:57:36   28K   14K     49  3.4K   19   10K  100  2.6K   15  106G   47G    11G&lt;br/&gt;
11:57:37   28K   14K     49  3.3K   18   11K  100  2.5K   14  109G   47G   8.4G&lt;br/&gt;
11:57:38   29K   14K     50  3.3K   18   11K  100  2.6K   14  113G   47G   5.0G&lt;br/&gt;
11:57:39   30K   15K     50  3.3K   17   11K  100  2.5K   14  116G   47G   669M&lt;br/&gt;
11:57:40   44K   17K     39  6.1K   18   11K  100  5.3K   16  120G   47G  -3.9G&lt;/p&gt;

&lt;p&gt;------------------------------------------------------------------------------------------------------------&lt;/p&gt;

&lt;p&gt;I will look into ZFS with our ZFS developers and update the ticket.&lt;/p&gt;



</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>OSS</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i045pz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10023"><![CDATA[4]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>