<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:18:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15413] WBC: endless loop in balance_dirty_pages</title>
                <link>https://jira.whamcloud.com/browse/LU-15413</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When writing a large file to Lustre with WBC enabled (aging_keep flush mode), the writing process gets trapped in an endless loop:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
dd &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt;=/dev/zero of=/mnt/lustre/tdir/tfile bs=1M count=4096

cat /proc/22735/stack
[&amp;lt;0&amp;gt;] balance_dirty_pages+0x426/0xcd0
[&amp;lt;0&amp;gt;] balance_dirty_pages_ratelimited+0x2af/0x3b0
[&amp;lt;0&amp;gt;] generic_perform_write+0x16a/0x1b0
[&amp;lt;0&amp;gt;] __generic_file_write_iter+0xfa/0x1c0
[&amp;lt;0&amp;gt;] generic_file_write_iter+0xab/0x150
[&amp;lt;0&amp;gt;] memfs_file_write_iter+0xd7/0x180 [lustre]
[&amp;lt;0&amp;gt;] new_sync_write+0x124/0x170
[&amp;lt;0&amp;gt;] vfs_write+0xa5/0x1a0
[&amp;lt;0&amp;gt;] ksys_write+0x4f/0xb0
[&amp;lt;0&amp;gt;] do_syscall_64+0x5b/0x1b0
[&amp;lt;0&amp;gt;] entry_SYSCALL_64_after_hwframe+0x65/0xca
[&amp;lt;0&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The cause is that, due to the rate limit mechanism in the Linux kernel, the writing process tries to write out some dirty pages in @balance_dirty_pages(), but the pages are pinned in MemFS and are not reclaimable, so the loop cannot make progress.&lt;br/&gt;
We found that on a client with 96G of memory, the process gets trapped in the endless loop when the write size is larger than 8G.&lt;/p&gt;

&lt;p&gt;There are two possible solutions.&lt;br/&gt;
The first is to disable dirty page accounting for the BDI:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
sb-&amp;gt;s_bdi-&amp;gt;capabilities |= BDI_CAP_NO_ACCT_DIRTY;

void balance_dirty_pages_ratelimited(struct address_space *mapping)
{
	struct inode *inode = mapping-&amp;gt;host;
	struct backing_dev_info *bdi = inode_to_bdi(inode);
	struct bdi_writeback *wb = NULL;
	&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ratelimit;
	&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; *p;

	&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!bdi_cap_account_dirty(bdi))
		&lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt;;
	...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;This way, balance_dirty_pages_ratelimited() returns early and balance_dirty_pages is never called. The client can write as many cache pages as possible before reaching the page cache limits in MemFS.&lt;/p&gt;

&lt;p&gt;The second solution:&lt;br/&gt;
when writing out an inode in @balance_dirty_pages-&amp;gt;wb_start_background_writeback(), the client assimilates the cache pages from MemFS into Lustre. The assimilated pages are then reclaimable, and the dirty pages can be written out to the Lustre backend.&lt;/p&gt;
</description>
                <environment></environment>
        <key id="67825">LU-15413</key>
            <summary>WBC: endless loop in balance_dirty_pages</summary>
                <type id="7" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/task_agile.png">Technical task</type>
                            <parent id="51932">LU-10938</parent>
                                    <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="qian_wc">Qian Yingjin</assignee>
                                    <reporter username="qian_wc">Qian Yingjin</reporter>
                        <labels>
                    </labels>
                <created>Thu, 6 Jan 2022 09:17:19 +0000</created>
                <updated>Mon, 6 Jun 2022 08:33:53 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="321870" author="gerrit" created="Thu, 6 Jan 2022 09:49:32 +0000"  >&lt;p&gt;&quot;Yingjin Qian &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45988&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45988&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15413&quot; title=&quot;WBC: endless loop in balance_dirty_pages&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15413&quot;&gt;LU-15413&lt;/a&gt; wbc: disable accounting for dirty pages&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a42813460f149769ced6cad4bda715a1e17e58b6&lt;/p&gt;</comment>
                            <comment id="322042" author="gerrit" created="Fri, 7 Jan 2022 09:50:55 +0000"  >&lt;p&gt;&quot;Yingjin Qian &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45997&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45997&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15413&quot; title=&quot;WBC: endless loop in balance_dirty_pages&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15413&quot;&gt;LU-15413&lt;/a&gt; wbc: assimilate inode cache pages for large write&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e31ea2a117e7c7078d112cb99d4cc4643f101b44&lt;/p&gt;</comment>
                            <comment id="336800" author="gerrit" created="Mon, 6 Jun 2022 08:33:53 +0000"  >&lt;p&gt;&quot;Yingjin Qian &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47541&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47541&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15413&quot; title=&quot;WBC: endless loop in balance_dirty_pages&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15413&quot;&gt;LU-15413&lt;/a&gt; wbc: assimilation for data under @wbci_rw_sem&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e13359e5878781643eae4a4f325681a8860c25bd&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02dsn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>