<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:29:22 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16713] Writeback and commit pages under memory pressure to avoid OOM</title>
                <link>https://jira.whamcloud.com/browse/LU-16713</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We&apos;ve tried to solve this in the past by integrating NFS unstable page tracking into Lustre, but this is fraught: it treats our uncommitted pages as dirty, which means we get rate-limited on them.  The kernel&apos;s idea of an appropriate number of outstanding pages is based on local file systems and isn&apos;t enough for us, so this causes performance issues.  The SOFT_SYNC feature we created to work with unstable pages also just asks the OST nicely to do a commit, and includes no way for the client to be notified quickly.&lt;br/&gt;
This means it can&apos;t be responsive enough to avoid tasks getting OOM-killed.&lt;/p&gt;

&lt;p&gt;The Linux kernel already has a mature solution for OOM with cgroups.&lt;br/&gt;
The most relevant code is in balance_dirty_pages():&lt;br/&gt;
If the dirtied and uncommitted pages are over &quot;background_thresh&quot; for the global memory limit or a memory cgroup limit, the writeback threads are woken to perform some writeout.&lt;br/&gt;
In this ticket, we give a solution similar to NFS:&lt;br/&gt;
On completion of writeback for the dirtied pages (@brw_interpret), call __mark_inode_dirty(), which attaches the @bdi_writeback (each memory cgroup can have its own bdi_writeback) to the inode.&lt;br/&gt;
Once the writeback thread is woken up with @for_background set, it checks @wb_over_bg_thresh(): for background writeout, it stops once we are below the background dirty threshold.&lt;br/&gt;
So what we should do in the Lustre client is:&lt;br/&gt;
When the background writeback thread calls ll_writepages() to write out data: if the inode has dirty pending pages, flush them to the OST and sync them to commit the uncommitted pages. If all pages have cleared their dirty flags but are still in the unstable (uncommitted) state, we should send a dedicated sync RPC to the OST so that the uncommitted pages are finally released.&lt;/p&gt;
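&lt;p&gt;The client-side decision flow above can be sketched as follows (a minimal illustration only; the function and action names are hypothetical, not actual Lustre code):&lt;/p&gt;

```python
def background_writeback(dirty_pages, unstable_pages):
    """Sketch of the per-inode decision in background writeback.

    Hypothetical names for illustration; returns the list of actions
    the client would take for an inode in the dirty list.
    """
    actions = []
    if dirty_pages > 0:
        # Dirty pages pending: flush them to the OST, then sync so the
        # server commits them and the pages can be unpinned.
        actions.append("flush_dirty_to_ost")
        actions.append("sync_commit")
    elif unstable_pages > 0:
        # All pages clean but still unstable (uncommitted): a dedicated
        # sync RPC asks the OST to commit so the pages can be released.
        actions.append("send_sync_rpc")
    # Otherwise nothing is pinned and there is nothing to do.
    return actions
```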

&lt;p&gt;As unstable page accounting in the kernel may hurt performance, we need to optimize the unstable page accounting code in a next phase of work.&lt;/p&gt;</description>
                <environment></environment>
        <key id="75440">LU-16713</key>
            <summary>Writeback and commit pages under memory pressure to avoid OOM</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="qian_wc">Qian Yingjin</assignee>
                                    <reporter username="qian_wc">Qian Yingjin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 5 Apr 2023 14:13:15 +0000</created>
                <updated>Fri, 27 Oct 2023 21:44:28 +0000</updated>
                            <resolved>Tue, 26 Sep 2023 14:36:06 +0000</resolved>
                                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="368543" author="gerrit" created="Wed, 5 Apr 2023 14:22:26 +0000"  >&lt;p&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50544&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50544&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: writeback/commit pages under memory pressure&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 94a8579c83bacaad24866a8f62b5372189cc8241&lt;/p&gt;</comment>
                            <comment id="369223" author="qian_wc" created="Wed, 12 Apr 2023 09:45:50 +0000"  >&lt;p&gt;Some benchmark results:&lt;/p&gt;

&lt;p&gt;Total memory: 512G&lt;/p&gt;

&lt;p&gt;a. without memcg limits:&lt;/p&gt;

&lt;p&gt;stripe_count: 1&lt;/p&gt;

&lt;p&gt;cmd:&#160;&lt;/p&gt;

&lt;p&gt;dd if=/dev/zero of=test bs=1M count=$size&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;IO size&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;128G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;256G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;512G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;1024G&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;master&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.1 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.0 GB/s&#160;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;w/ patch&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.1 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.0 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;b. with memcg limits on the patched master:&lt;/p&gt;

&lt;p&gt;stripe_count: 1&lt;/p&gt;

&lt;p&gt;cmd:&#160;&lt;/p&gt;

&lt;p&gt;bash -c &quot;echo \$$ &amp;gt; $cgdir/tasks &amp;amp;&amp;amp; dd if=/dev/zero of=$DIR/$tfile bs=1M count=$((memlimit_mb * time))&quot;&lt;/p&gt;

&lt;p&gt;io_size = $time X $memlimit_mb ==&amp;gt; $time = {2, 1, 0.5}&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;memcg limits&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;1G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;2G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;4G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;8G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;16G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;32G&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;64G&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2 X memlimit&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.7 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.7 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.6 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.7 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.8 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.8 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.7 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1 X memlimit&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.9 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.9 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.9 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;1.9 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;0.5 X memlimit&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.3 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.3 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.3 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.2 GB/s&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;2.3 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;Performance shows no obvious degradation with memcg limits.&lt;/p&gt;

</comment>
                            <comment id="369263" author="qian_wc" created="Wed, 12 Apr 2023 14:59:15 +0000"  >&lt;p&gt;Multiple-cgroup test results (dd write performance):&lt;/p&gt;

&lt;p&gt;The test scripts:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
error() {
        echo &lt;span class=&quot;code-quote&quot;&gt;&quot;$@&quot;&lt;/span&gt;
        exit 1
}

DIR=&lt;span class=&quot;code-quote&quot;&gt;&quot;/exafs&quot;&lt;/span&gt;
tdir=&lt;span class=&quot;code-quote&quot;&gt;&quot;milti&quot;&lt;/span&gt;
tfile=&lt;span class=&quot;code-quote&quot;&gt;&quot;test&quot;&lt;/span&gt;
dir=$DIR/$tdir
file=$dir/$tfile
cg_basedir=/sys/fs/cgroup/memory
cgdir=$cg_basedir/$tfile
memlimit_mb=$1
cnt=$2
declare -a pids

rm -rf $dir
sleep 2
mkdir $dir || error &lt;span class=&quot;code-quote&quot;&gt;&quot;failed to mkdir $dir&quot;&lt;/span&gt;

&lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; i in $(seq 1 $cnt); &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt;
        cgdir=$cg_basedir/${tfile}.$i
        mkdir $cgdir || error &lt;span class=&quot;code-quote&quot;&gt;&quot;failed to mkdir $cgdir&quot;&lt;/span&gt;
        echo $((memlimit_mb * 1024 * 1024)) &amp;gt; $cgdir/memory.limit_in_bytes
        cat $cgdir/memory.limit_in_bytes
done

echo 3 &amp;gt; /proc/sys/vm/drop_caches
&lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; i in $(seq 1 $cnt); &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt;
        cgdir=$cg_basedir/$tfile.$i
        (
        bash -c &lt;span class=&quot;code-quote&quot;&gt;&quot;echo \$$ &amp;gt; $cgdir/tasks &amp;amp;&amp;amp; dd &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt;=/dev/zero of=$dir/${tfile}.$i bs=1M count=$((memlimit_mb * 2))&quot;&lt;/span&gt;
        )&amp;amp;
        pids[i]=$!
done

&lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; i in $(seq 1 $cnt); &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt;
        wait ${pids[$i]}
        cgdir=$cg_basedir/$tfile.$i
        rmdir $cg_basedir/${tfile}.$i || error &lt;span class=&quot;code-quote&quot;&gt;&quot;failed to rm $cgdir&quot;&lt;/span&gt;
done

wait
sleep 3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Results:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
CMD: ./tmult.sh $memlimit_mb $cgcnt
==== 4 cgroups ====

[root@ice01 scripts]# ./tmult.sh 1024 4
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.47427 s, 1.5 GB/s
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.49274 s, 1.4 GB/s
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.49886 s, 1.4 GB/s
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.52199 s, 1.4 GB/s

[root@ice01 scripts]# ./tmult.sh 2048 4
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 2.93491 s, 1.5 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 2.94163 s, 1.5 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 2.94337 s, 1.5 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 2.97721 s, 1.4 GB/s

[root@ice01 scripts]# ./tmult.sh 4096 4
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.7354 s, 1.5 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.87343 s, 1.5 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.95922 s, 1.4 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.99732 s, 1.4 GB/s

[root@ice01 scripts]# ./tmult.sh 8192 4
17179869184 bytes (17 GB, 16 GiB) copied, 11.7261 s, 1.5 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 11.8024 s, 1.5 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 11.8868 s, 1.4 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 11.9072 s, 1.4 GB/s

===== 8 cgroups ====

[root@ice01 scripts]# ./tmult.sh 1024 8
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.68561 s, 1.3 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.69721 s, 1.3 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.70013 s, 1.3 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.71561 s, 1.3 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.71978 s, 1.2 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.74053 s, 1.2 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.76275 s, 1.2 GB/s
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.87241 s, 1.1 GB/s

[root@ice01 scripts]# ./tmult.sh 2048 8
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.40484 s, 1.3 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.46257 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.47629 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.4952 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.50229 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.52185 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.53337 s, 1.2 GB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.60111 s, 1.2 GB/s

[root@ice01 scripts]# ./tmult.sh 4096 8
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.5593 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.60015 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.721 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.75103 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.77716 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.85576 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.85757 s, 1.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.89447 s, 1.2 GB/s

[root@ice01 scripts]# ./tmult.sh 8192 8
17179869184 bytes (17 GB, 16 GiB) copied, 12.7842 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 12.7889 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 12.9504 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 12.9577 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 13.4066 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 13.5397 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 13.5769 s, 1.3 GB/s
17179869184 bytes (17 GB, 16 GiB) copied, 13.6605 s, 1.3 GB/s

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</comment>
                            <comment id="369344" author="qian_wc" created="Thu, 13 Apr 2023 08:28:21 +0000"  >&lt;p&gt;Two processes:&lt;br/&gt;
One runs under memcg control, with the memory limit varying from 1G to 128G;&lt;br/&gt;
The other runs without memcg control;&lt;br/&gt;
Each writes 128G of data in total.&lt;br/&gt;
Test scripts:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
error() {
        echo &lt;span class=&quot;code-quote&quot;&gt;&quot;$@&quot;&lt;/span&gt;
        exit 1
}

DIR=&lt;span class=&quot;code-quote&quot;&gt;&quot;/exafs&quot;&lt;/span&gt;
tdir=&lt;span class=&quot;code-quote&quot;&gt;&quot;milti&quot;&lt;/span&gt;
tfile=&lt;span class=&quot;code-quote&quot;&gt;&quot;test&quot;&lt;/span&gt;
dir=$DIR/$tdir
file=$dir/$tfile
cgfile=$dir/${tfile}.cg
cg_basedir=/sys/fs/cgroup/memory
cgdir=$cg_basedir/$tfile
memlimit_mb=$1

rm -rf $dir
sleep 2
mkdir $dir || error &lt;span class=&quot;code-quote&quot;&gt;&quot;failed to mkdir $dir&quot;&lt;/span&gt;

cgdir=$cg_basedir/${tfile}
mkdir $cgdir || error &lt;span class=&quot;code-quote&quot;&gt;&quot;failed to mkdir $cgdir&quot;&lt;/span&gt;
echo $((memlimit_mb * 1024 * 1024)) &amp;gt; $cgdir/memory.limit_in_bytes
cat $cgdir/memory.limit_in_bytes

echo 3 &amp;gt; /proc/sys/vm/drop_caches
(
bash -c &lt;span class=&quot;code-quote&quot;&gt;&quot;echo \$$ &amp;gt; $cgdir/tasks &amp;amp;&amp;amp; dd &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt;=/dev/zero of=${cgfile} bs=1M count=128000&quot;&lt;/span&gt;
)&amp;amp;
cgpid=$!
dd &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt;=/dev/zero of=$file bs=1M count=128000 &amp;amp;
pid=$!

wait $cgpid
wait $pid
rmdir $cgdir

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The results are as follows:&lt;/p&gt;

&lt;p&gt;./t2p.sh $memlimit_mb&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
[root@ice01 scripts]# ./t2p.sh 1024
134217728000 bytes (134 GB, 125 GiB) copied, 61.5799 s, 2.2 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 103.386 s, 1.3 GB/s

[root@ice01 scripts]# ./t2p.sh 4096
134217728000 bytes (134 GB, 125 GiB) copied, 62.1537 s, 2.2 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 101.473 s, 1.3 GB/s

[root@ice01 scripts]# ./t2p.sh 16384
134217728000 bytes (134 GB, 125 GiB) copied, 60.7237 s, 2.2 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 93.3043 s, 1.4 GB/s

[root@ice01 scripts]# ./t2p.sh 32768
134217728000 bytes (134 GB, 125 GiB) copied, 61.221 s, 2.2 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 88.7582 s, 1.5 GB/s

[root@ice01 scripts]# ./t2p.sh 65536
134217728000 bytes (134 GB, 125 GiB) copied, 62.0085 s, 2.2 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 75.543 s, 1.8 GB/s

[root@ice01 scripts]# ./t2p.sh 131072
134217728000 bytes (134 GB, 125 GiB) copied, 63.2751 s, 2.1 GB/s
134217728000 bytes (134 GB, 125 GiB) copied, 64.2502 s, 2.1 GB/s
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The results demonstrate that the process with memcg limits has almost no impact on the performance of the process without memcg limits.&lt;/p&gt;</comment>
                            <comment id="369357" author="gerrit" created="Thu, 13 Apr 2023 12:32:40 +0000"  >&lt;p&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50625&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50625&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: add __GFP_NORETRY for read-ahead page&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0eeec25cb304a178258ce2fceaf2fa854ac491b7&lt;/p&gt;</comment>
                            <comment id="371489" author="JIRAUSER17900" created="Mon, 8 May 2023 04:41:57 +0000"  >&lt;p&gt;2023-05-13: Two patches for this ticket; one has landed to master, the other is being worked on.&lt;/p&gt;</comment>
                            <comment id="371596" author="gerrit" created="Tue, 9 May 2023 05:47:09 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50625/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50625/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: add __GFP_NORETRY for read-ahead page&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8db5d39f669f03aa6d8ad4962f82453b3cc11b42&lt;/p&gt;</comment>
                            <comment id="380182" author="JIRAUSER17900" created="Wed, 26 Jul 2023 12:53:21 +0000"  >&lt;p&gt;2023-07-26: Two patches for this ticket; one has landed to master, and the other patch&apos;s dependency is being worked on.&lt;/p&gt;</comment>
                            <comment id="382849" author="qian_wc" created="Thu, 17 Aug 2023 16:50:45 +0000"  >&lt;p&gt;These expensive, frequent fsync() calls would lead to much more frequent journal commits on the storage server, and the journaling overhead becomes significant, causing performance drops.&lt;/p&gt;

&lt;p&gt;We have designed a mechanism called &lt;b&gt;soft sync&lt;/b&gt;. The client accounts the number of unstable pages between each client/server pair. Upon completion of a write I/O request, the client adds the corresponding inode, which has pinned uncommitted pages, to the dirty list of the super block or cgroup, and then increases the unstable page count accordingly. Any reply from the server piggybacks the last committed transno (&lt;b&gt;last_committed&lt;/b&gt;) on that server; the client commits write I/O requests with a transno smaller than &lt;b&gt;last_committed&lt;/b&gt;, unpins the uncommitted pages, and decreases the unstable page count accordingly. When the system is under memory pressure, the kernel writeback thread is woken up and starts to write out data of the inodes in the dirty list to reclaim pages. If the purpose of the writeback is to commit the pinned pages, the client first flushes any dirty pages to the servers. If the unstable page count for this client/server pair is zero, all unstable pages have already been committed and the client returns immediately. Otherwise, the client sends a soft sync request to the server with a factor indicating the urgency of its memory pressure. The intention of this operation is to commit pages belonging to a client that has too many outstanding unstable pages in its cache. The server decides whether to begin an asynchronous journal commit based on the number of soft sync requests from clients and the time since its last commit. The server has a tunable global limit across all clients (named &lt;b&gt;soft_sync_thrsh&lt;/b&gt;); it defines how many soft sync requests are allowed before an asynchronous journal commit is triggered, and its value is 16 by default. Every soft sync request from a client contributes to the accumulated soft sync value on the server.
The soft sync factor is calculated from the memory usage on a client by the formula (1 - &lt;b&gt;free_memory&lt;/b&gt; / &lt;b&gt;tot_memory&lt;/b&gt;) &amp;#42; &lt;b&gt;soft_sync_thrsh&lt;/b&gt;, where &lt;b&gt;free_memory&lt;/b&gt; is the free memory size and &lt;b&gt;tot_memory&lt;/b&gt; is the total memory size; both can be measured for the entire system or per cgroup.&lt;/p&gt;
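&lt;p&gt;As a quick illustration of the factor formula above (a hypothetical helper for clarity, not the actual implementation):&lt;/p&gt;

```python
def soft_sync_factor(free_memory, tot_memory, soft_sync_thrsh=16):
    """Urgency factor a client attaches to a soft sync request.

    Implements (1 - free_memory / tot_memory) * soft_sync_thrsh from the
    comment above; the default threshold of 16 matches the server default.
    free_memory and tot_memory may be system-wide or per-cgroup sizes.
    """
    return (1 - free_memory / tot_memory) * soft_sync_thrsh
```

A client with no free memory contributes the full threshold in one request, immediately triggering a server-side commit, while an idle client contributes nothing.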

&lt;p&gt;Once the accumulated soft sync value is larger than the predefined threshold, an asynchronous sync() is called on the server to start a journal commit. The soft sync mechanism makes a tradeoff between the urgency of reducing memory pressure and server throughput: it dynamically shortens the journal commit interval on the server to avoid pinning pages for longer when a client is under memory pressure.&lt;/p&gt;</comment>
                            <comment id="387027" author="gerrit" created="Sat, 23 Sep 2023 06:08:48 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52485&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52485&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: remove unused ccc_unstable_waitq&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ac78bfbf610e6d524f024b1b263f23046cabcfcb&lt;/p&gt;</comment>
                            <comment id="387275" author="gerrit" created="Tue, 26 Sep 2023 14:33:42 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50544/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50544/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: writeback/commit pages under memory pressure&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8aa231a994683a9224d42c0e7ae48aaebe2f583c&lt;/p&gt;</comment>
                            <comment id="387276" author="gerrit" created="Tue, 26 Sep 2023 14:33:55 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52485/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52485/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16713&quot; title=&quot;Writeback and commit pages under memory pressure to avoid OOM&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16713&quot;&gt;&lt;del&gt;LU-16713&lt;/del&gt;&lt;/a&gt; llite: remove unused ccc_unstable_waitq&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 03a795efa44253e06d6feef0ad613f2da0269c5b&lt;/p&gt;</comment>
                            <comment id="387278" author="pjones" created="Tue, 26 Sep 2023 14:36:06 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="75386">LU-16697</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="78156">LU-17151</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="78349">LU-17183</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="75385">LU-16696</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03i5b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>