<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:03:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6768] Data corruption when writing and truncating in parallel in an almost-full file system</title>
                <link>https://jira.whamcloud.com/browse/LU-6768</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;To test the stability of the Lustre file system under continuous workload and extreme resource usage, I wrote a tool that writes data to files continuously until &apos;ENOSPC&apos; occurs; the tool then truncates and deletes some old files to free space and continues writing. Before a file is truncated, its content is verified. The problem is that, after running the tool for a while, the content of some files is wrong.&lt;/p&gt;

&lt;p&gt;If the data is corrupted, the tool fails with:&lt;br/&gt;
yaft: main.cpp:81: void check_file_content(const std::string&amp;amp;): Assertion `rbuf.checkAt(pos)&apos; failed.&lt;br/&gt;
Aborted&lt;/p&gt;

&lt;p&gt;You can get the tool at:&lt;br/&gt;
&lt;a href=&quot;https://github.com/zhang-jingwang/yaft.git&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/zhang-jingwang/yaft.git&lt;/a&gt;&lt;/p&gt;</description>
                <environment>Reproduced in a virtual machine using loop device as OSD-ldiskfs disk.</environment>
        <key id="30836">LU-6768</key>
            <summary>Data corruption when writing and truncating in parallel in an almost-full file system</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="jingwang">Jingwang Zhang</reporter>
                        <labels>
                    </labels>
                <created>Fri, 26 Jun 2015 02:00:22 +0000</created>
                <updated>Mon, 14 Sep 2015 14:37:27 +0000</updated>
                            <resolved>Mon, 3 Aug 2015 15:09:42 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="119678" author="jingwang" created="Fri, 26 Jun 2015 02:03:28 +0000"  >&lt;p&gt;My analysis of this problem, from an email:&lt;br/&gt;
During our tests on Lustre, we found data corruption when writing to files while truncating/deleting other files at the same time. Usually one block in the file contains wrong data that looks very much like a metadata block (ext4 extents).&lt;/p&gt;

&lt;p&gt;After a long investigation, we finally found the root cause of the problem. In short, it is a race condition between data IO (which Lustre performs directly to the block device via submit_bio()) and metadata IO (which ldiskfs performs to the block device, and which might be cached in the block device&#8217;s page cache).&lt;/p&gt;

&lt;p&gt;Lustre commits data IO from clients via the osd_submit_bio() function in osd-ldiskfs/osd_io.c:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-style: solid;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;border-bottom-style: solid;&quot;&gt;&lt;b&gt;osd_io.c&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; void osd_submit_bio(&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; rw, struct bio *bio)
{
        LASSERTF(rw == 0 || rw == 1, &lt;span class=&quot;code-quote&quot;&gt;&quot;%x\n&quot;&lt;/span&gt;, rw);
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rw == 0)
                submit_bio(READ, bio);
        &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
                submit_bio(WRITE, bio);
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, there might be dirty data in the block device&#8217;s page cache. It is rare, but it can happen when things occur in the following order:&lt;br/&gt;
1.	A file is truncated, and its extent blocks are updated to complete the truncation, so those blocks become dirty.&lt;br/&gt;
2.	The file is deleted, so all of its metadata blocks are now free.&lt;br/&gt;
3.	One of the metadata blocks is reused as a data block to hold a client&#8217;s data. It will be updated by osd_submit_bio().&lt;br/&gt;
4.	The kernel then decides to flush the dirty pages; the data block is overwritten by the stale dirty metadata, and the data is corrupted.&lt;/p&gt;

&lt;p&gt;So I think the right thing to do is to invalidate the corresponding pages in the block device&#8217;s page cache before we issue the bio, to make sure there aren&#8217;t any dirty pages that might overwrite our data later. We therefore propose the following change to osd_submit_bio() to invalidate the page cache.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-style: solid;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;border-bottom-style: solid;&quot;&gt;&lt;b&gt;osd_io.c&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; void osd_submit_bio(&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; rw, struct bio *bio)
{
        struct inode *bdinode = bio-&amp;gt;bi_bdev-&amp;gt;bd_inode;

        LASSERTF(rw == 0 || rw == 1, &lt;span class=&quot;code-quote&quot;&gt;&quot;%x\n&quot;&lt;/span&gt;, rw);
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rw == 0) {
                submit_bio(READ, bio);
        } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; {
                loff_t start = bio-&amp;gt;bi_sector &amp;lt;&amp;lt; 9;
                loff_t endbyte = start + bio-&amp;gt;bi_size - 1;

                /* Invalidate the page cache in the block device, otherwise
                 * the dirty data in block device&apos;s page cache might corrupt
                 * the data we are going to write. */
                truncate_pagecache_range(bdinode, start, endbyte);
                submit_bio(WRITE, bio);
        }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="120300" author="pjones" created="Fri, 3 Jul 2015 18:11:33 +0000"  >&lt;p&gt;Alex&lt;/p&gt;

&lt;p&gt;Could you please advise on this issue?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="121130" author="bzzz" created="Mon, 13 Jul 2015 13:34:30 +0000"  >&lt;p&gt;&amp;gt; 1.	A file is truncated, and its extent blocks are updated to complete the truncation, so those blocks become dirty.&lt;br/&gt;
&amp;gt; 2.	The file is deleted, so all of its metadata blocks are now free.&lt;br/&gt;
&amp;gt; 3.	One of the metadata blocks is reused as a data block to hold a client&#8217;s data. It will be updated by osd_submit_bio().&lt;/p&gt;

&lt;p&gt;(2) is not quite correct: metadata blocks aren&apos;t freed immediately; instead, they are scheduled for release upon commit.&lt;br/&gt;
See ldiskfs_mb_free_blocks() for the details (the case when metadata != 0).&lt;/p&gt;

&lt;p&gt;Also, we already have calls to unmap_underlying_metadata(), which is supposed to do what you suggested.&lt;br/&gt;
I&apos;m still looking at the code.&lt;/p&gt;

</comment>
                            <comment id="121137" author="bzzz" created="Mon, 13 Jul 2015 13:58:35 +0000"  >&lt;p&gt;What kernel version do you use?&lt;/p&gt;</comment>
                            <comment id="121198" author="gerrit" created="Mon, 13 Jul 2015 20:39:32 +0000"  >&lt;p&gt;Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15593&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15593&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6768&quot; title=&quot;Data corruption when writing and truncating in parallel in an almost-full file system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6768&quot;&gt;&lt;del&gt;LU-6768&lt;/del&gt;&lt;/a&gt; osd: unmap reallocated blocks&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: facf7f47ced7debdbaf0d814158661857176b4bf&lt;/p&gt;</comment>
                            <comment id="121200" author="bzzz" created="Mon, 13 Jul 2015 20:40:14 +0000"  >&lt;p&gt;Jingwang, would you mind trying &lt;a href=&quot;http://review.whamcloud.com/15593&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15593&lt;/a&gt;, please?&lt;/p&gt;</comment>
                            <comment id="121217" author="jingwang" created="Tue, 14 Jul 2015 02:14:16 +0000"  >&lt;p&gt;Thanks for looking into this.&lt;/p&gt;

&lt;p&gt;I&apos;m using CentOS 6.5 with kernel version 2.6.32-431.29.2. I will try the fix and get back to you later.&lt;/p&gt;</comment>
                            <comment id="121230" author="jingwang" created="Tue, 14 Jul 2015 09:41:09 +0000"  >&lt;p&gt;I ran the reproducer for 7 hours after applying the patch and it didn&apos;t fail, whereas it would fail within minutes without the patch, so I believe the problem is fixed.&lt;/p&gt;</comment>
                            <comment id="121231" author="bzzz" created="Tue, 14 Jul 2015 09:55:33 +0000"  >&lt;p&gt;Thanks for the report and testing. Please inspect the patch and help move it forward.&lt;/p&gt;</comment>
                            <comment id="122965" author="gerrit" created="Mon, 3 Aug 2015 01:54:38 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/15593/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15593/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6768&quot; title=&quot;Data corruption when writing and truncating in parallel in an almost-full file system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6768&quot;&gt;&lt;del&gt;LU-6768&lt;/del&gt;&lt;/a&gt; osd: unmap reallocated blocks&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: bcef61a80ab4fa6cee847722184738ba4deeb971&lt;/p&gt;</comment>
                            <comment id="123011" author="pjones" created="Mon, 3 Aug 2015 15:09:42 +0000"  >&lt;p&gt;Landed for 2.8&lt;/p&gt;</comment>
                            <comment id="123271" author="jaylan" created="Tue, 4 Aug 2015 22:51:43 +0000"  >&lt;p&gt;This may help us on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6925&quot; title=&quot;oss buffer cache corruption&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6925&quot;&gt;&lt;del&gt;LU-6925&lt;/del&gt;&lt;/a&gt;. Can we get a b2_5 backport? Thanks!&lt;/p&gt;</comment>
                            <comment id="123528" author="jaylan" created="Thu, 6 Aug 2015 21:14:19 +0000"  >&lt;p&gt;Is the patch needed by the server, the client, or both?&lt;br/&gt;
It looks like a server patch.&lt;/p&gt;</comment>
                            <comment id="123529" author="gerrit" created="Thu, 6 Aug 2015 21:21:56 +0000"  >&lt;p&gt;Jian Yu (jian.yu@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15904&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15904&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6768&quot; title=&quot;Data corruption when writing and truncating in parallel in an almost-full file system&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6768&quot;&gt;&lt;del&gt;LU-6768&lt;/del&gt;&lt;/a&gt; lvfs: unmap reallocated blocks&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_5&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: c8448bce0ad13aeb65c48905e080ddb0c536fc91&lt;/p&gt;</comment>
                            <comment id="123530" author="yujian" created="Thu, 6 Aug 2015 21:28:04 +0000"  >&lt;p&gt;Hi Jay,&lt;br/&gt;
The patch is only needed by the server.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="31268">LU-6925</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxgpj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>