<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:42:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4388] fsync on client does not cause OST_SYNCs to be issued</title>
                <link>https://jira.whamcloud.com/browse/LU-4388</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Issuing an fsync() only causes MDS_SYNC&apos;s to be issued.&lt;/p&gt;

&lt;p&gt;Testing:&lt;br/&gt;
dd with conv=fsync vs. dd with oflag=sync&lt;/p&gt;</description>
                <environment></environment>
        <key id="22493">LU-4388</key>
            <summary>fsync on client does not cause OST_SYNCs to be issued</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="utopiabound">Nathaniel Clark</reporter>
                        <labels>
                            <label>MB</label>
                    </labels>
                <created>Tue, 17 Dec 2013 15:55:28 +0000</created>
                <updated>Thu, 14 Jun 2018 21:41:37 +0000</updated>
                            <resolved>Thu, 24 Apr 2014 13:24:14 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>13</watches>
                                                                            <comments>
                            <comment id="73882" author="adilger" created="Thu, 19 Dec 2013 20:28:08 +0000"  >&lt;p&gt;Looking at the ll_fsync() code path I see a number of gaps in the handling.  This should be passing through the file range for fsync, since it is possible to sync only a range of the file.  The Lustre OST protocol has always supported this, and it makes sense to implement it correctly.&lt;/p&gt;

&lt;p&gt;On the OST side, it looks like ofd_sync() passes start/end correctly, but the new ofd_sync_hdl() does not.  The whole callchain for tgt_sync-&amp;gt;dt_object_sync-&amp;gt;osd_object_sync-&amp;gt;do_sync() needs to take start and end arguments, and then pass them down to f_op-&amp;gt;fsync() for those newer variants of fsync() that allow specifying a range of the file.&lt;/p&gt;

&lt;p&gt;I pushed &lt;a href=&quot;http://review.whamcloud.com/8626&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8626&lt;/a&gt; to fix handling of {start, end} limits for fsync, but this will not fix the core problem of OST_SYNC not being sent at all.  Looking at the twisty maze of CLIO that the fsync call passes through, it &lt;em&gt;seems&lt;/em&gt; like the fsync() call should generate an OST_SYNC RPC, but I didn&apos;t look at actual debug logs to trace the callpath to see where it actually goes wrong.&lt;/p&gt;

&lt;p&gt;It is probably worthwhile to create a test case that writes 900kB (less than one RPC, so it isn&apos;t sent immediately), fsync (via &quot;&lt;tt&gt;multiop -y&lt;/tt&gt;&quot;), immediately fail the OST (mark it read-only) and abort recovery when fsync() returns, then remount the client and verify that the data actually made it to disk.  There are sanity.sh test_24{a,b}() that test fsync() behaviour in terms of returning an error to the caller on failure, but there isn&apos;t a test that verifies fsync() is actually working correctly.  Since we&apos;ve had problems in this area several times (it is not a problem that can be seen easily, since the data is normally written to disk within a few seconds by itself anyway), it probably makes sense to start by writing this test first, and then using it to debug the problem.&lt;/p&gt;</comment>
                            <comment id="73924" author="adilger" created="Fri, 20 Dec 2013 11:14:35 +0000"  >&lt;p&gt;Note that my patch is not addressing the core issue here, just a problem that I found when I was looking at this bug originally.  Someone with more CLIO skill or time to debug needs to look at this problem.&lt;/p&gt;</comment>
                            <comment id="74059" author="pjones" created="Tue, 24 Dec 2013 14:40:52 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="74060" author="bobijam" created="Tue, 24 Dec 2013 15:14:58 +0000"  >&lt;p&gt;Nathaniel,&lt;/p&gt;

&lt;p&gt;Can you describe the problem in detail? What are the full commands and their respective outputs?&lt;/p&gt;</comment>
                            <comment id="74103" author="bobijam" created="Fri, 27 Dec 2013 06:00:13 +0000"  >&lt;p&gt;dd ... conv=fsync issues several ll_writepages() with wbc-&amp;gt;sync_mode == WB_SYNC_ALL and it calls &lt;br/&gt;
cl_sync_file_range(inode, start, end, CL_FSYNC_LOCAL, ignore_layout);&lt;/p&gt;

&lt;p&gt;After the write finishes, it calls ll_fsync() with datasync==0:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000080:00000001:0.0:1388123592.960635:0:20741:0:(rw.c:1112:ll_writepages()) Process entered
...
00000080:00200000:0.0:1388123592.960646:0:20741:0:(file.c:2742:cl_sync_file_range()) VFS Op:inode=[0x200000400:0x1:0x0](ffff8800006da138), sync_mode=1
...
00000080:00000001:1.0:1388123592.973789:0:20741:0:(file.c:2778:cl_sync_file_range()) Process leaving (rc=0 : 0 : 0)
00000080:00000001:1.0:1388123592.973791:0:20741:0:(rw.c:1147:ll_writepages()) Process leaving (rc=0 : 0 : 0)
00000080:00000001:1.0:1388123592.973794:0:20741:0:(file.c:2804:ll_fsync()) Process entered
00000080:00200000:1.0:1388123592.973794:0:20741:0:(file.c:2807:ll_fsync()) VFS Op:inode=[0x200000400:0x1:0x0](ffff8800006da138), datasync=0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="74104" author="bobijam" created="Fri, 27 Dec 2013 06:17:25 +0000"  >&lt;p&gt;Nathaniel,&lt;/p&gt;

&lt;p&gt;Using conv=fdatasync will issue the OST_SYNC.&lt;/p&gt;

&lt;p&gt;dd conv=fsync does not set the datasync parameter for fsync(); conv=fdatasync sets it.&lt;/p&gt;</comment>
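Editorial sketch (not part of the original thread): the two dd modes compared above end in different system calls, which is where the datasync value seen by ll_fsync() comes from. The Python snippet below is a minimal illustration under that assumption; the helper name write_and_sync and the temporary file are hypothetical, and the snippet only exercises the local system calls, not any Lustre RPC path.

```python
import os
import tempfile


def write_and_sync(path, payload, datasync):
    """Write payload to path, then flush it with fdatasync() or fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
        if datasync:
            # What dd conv=fdatasync does: the kernel sees datasync=1.
            # Fall back to fsync() where fdatasync() is unavailable.
            getattr(os, "fdatasync", os.fsync)(fd)
        else:
            # What dd conv=fsync does: the kernel sees datasync=0.
            os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for datasync in (False, True):
            size = write_and_sync(os.path.join(tmp, "f"), b"x" * 4096, datasync)
            print("datasync=%d wrote %d bytes" % (datasync, size))
```

Running this against a file on a Lustre mount while collecting client debug logs would be one way to compare the two paths, though that setup is not shown here.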
                            <comment id="74132" author="adilger" created="Fri, 27 Dec 2013 21:10:46 +0000"  >&lt;p&gt;Does this mean there is a bug in our code, in that it doesn&apos;t send an OST_SYNC in the conv=fsync case?&lt;/p&gt;</comment>
                            <comment id="74134" author="bobijam" created="Fri, 27 Dec 2013 23:26:49 +0000"  >&lt;p&gt;I am a little confused about the datasync parameter of fsync(): does datasync=1 mean sync only data, while datasync=0 means sync data and metadata?&lt;/p&gt;</comment>
                            <comment id="74167" author="adilger" created="Mon, 30 Dec 2013 18:04:27 +0000"  >&lt;p&gt;Looking at the comments around vfs_fsync() and vfs_fsync_range()&lt;br/&gt;
&lt;a href=&quot;http://lxr.free-electrons.com/source/fs/sync.c#L178&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://lxr.free-electrons.com/source/fs/sync.c#L178&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt; * vfs_fsync - perform a fsync or fdatasync on a file
 * @file:               file to sync
 * @datasync:           only perform a fdatasync operation
 *
 * Write back data and metadata for @file to disk.  If @datasync is
 * set only metadata needed to access modified file data is written.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It does appear that datasync=1 means sync data and enough metadata to get the data back (e.g. i_size and block allocations) while datasync=0 means sync everything.  See also &lt;a href=&quot;http://man7.org/linux/man-pages/man2/fsync.2.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://man7.org/linux/man-pages/man2/fsync.2.html&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;       fdatasync() is similar to fsync(), but does not flush modified&lt;br/&gt;
       metadata unless that metadata is needed in order to allow a&lt;br/&gt;
       subsequent data retrieval to be correctly handled.  For example,&lt;br/&gt;
       changes to st_atime or st_mtime (respectively, time of last access&lt;br/&gt;
       and time of last modification; see stat(2)) do not require flushing&lt;br/&gt;
       because they are not necessary for a subsequent data read to be&lt;br/&gt;
       handled correctly.  On the other hand, a change to the file size&lt;br/&gt;
       (st_size, as made by say ftruncate(2)), would require a metadata&lt;br/&gt;
       flush.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;So if datasync=0 then we should always be sending OST_SYNC RPCs, but if datasync=1 then that is only needed if the size is changed or if blocks were allocated.  That is a difficult/impossible decision for a client to make, so the OST_SYNC RPC should be sent to the OST with a DATASYNC flag and let the OSD decide.  For older branches it probably makes sense to always send OST_SYNC. &lt;/p&gt;</comment>
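Editorial sketch (not part of the original thread): the policy proposed in the comment above can be modelled in a few lines. The client always sends OST_SYNC and forwards the datasync value as a flag, and the server side decides whether a metadata flush is needed. Everything here is hypothetical illustration in Python: SyncRequest, plan_ost_sync and osd_must_flush_metadata are invented names, not Lustre structures or functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncRequest:
    """Hypothetical stand-in for an OST_SYNC RPC; not a Lustre structure."""
    opcode: str
    start: int
    end: int
    datasync: bool


def plan_ost_sync(start, end, datasync):
    """Always issue OST_SYNC; pass datasync along so the server can decide."""
    return SyncRequest("OST_SYNC", start, end, bool(datasync))


def osd_must_flush_metadata(req, size_changed, blocks_allocated):
    """Server-side decision: fdatasync only needs metadata that data reads rely on."""
    if not req.datasync:
        return True  # plain fsync(): sync everything
    return size_changed or blocks_allocated
```

The point of the split is that the client never has to guess whether the size changed or blocks were allocated; it only reports which kind of sync the application asked for.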
                            <comment id="74183" author="jay" created="Tue, 31 Dec 2013 02:18:55 +0000"  >&lt;p&gt;I see. So this is a misunderstanding of the @datasync parameter. This bug may exist in all versions of the Lustre client implementation. Thanks for the explanation.&lt;/p&gt;</comment>
                            <comment id="74186" author="bobijam" created="Tue, 31 Dec 2013 08:18:05 +0000"  >&lt;p&gt;patch tracking at &lt;a href=&quot;http://review.whamcloud.com/8684&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8684&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="79109" author="jlevi" created="Wed, 12 Mar 2014 12:29:39 +0000"  >&lt;p&gt;Patch has landed to Master. Please reopen if more work is needed in this ticket.&lt;/p&gt;</comment>
                            <comment id="79974" author="adilger" created="Fri, 21 Mar 2014 10:17:43 +0000"  >&lt;p&gt;The patch &lt;a href=&quot;http://review.whamcloud.com/8626&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8626&lt;/a&gt; has not landed yet.&lt;/p&gt;</comment>
                            <comment id="82378" author="jlevi" created="Thu, 24 Apr 2014 13:24:14 +0000"  >&lt;p&gt;Patch landed to Master.&lt;/p&gt;</comment>
                            <comment id="115551" author="happe" created="Fri, 15 May 2015 21:32:46 +0000"  >&lt;p&gt;Should this have been fixed after 2.5.0 or is it only 2.6.0? I see the same problem with b2_5 on zfs (fdatasync okay and slow, fsync not touching the disk). I think it is important that fsync works as expected.&lt;/p&gt;</comment>
                            <comment id="115581" author="bobijam" created="Sat, 16 May 2015 03:33:41 +0000"  >&lt;p&gt;2.6.0 has the patches and 2.5.0 does not.&lt;/p&gt;</comment>
                            <comment id="115607" author="happe" created="Sun, 17 May 2015 20:00:19 +0000"  >&lt;p&gt;Given the severity (data loss, inconsistent databases, etc.), shouldn&apos;t it be included in the maintenance release?&lt;/p&gt;</comment>
                            <comment id="115616" author="gerrit" created="Mon, 18 May 2015 03:09:35 +0000"  >&lt;p&gt;Bobi Jam (bobijam@hotmail.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/14840&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/14840&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4388&quot; title=&quot;fsync on client does not cause OST_SYNCs to be issued&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4388&quot;&gt;&lt;del&gt;LU-4388&lt;/del&gt;&lt;/a&gt; llite: issue OST_SYNC for fsync()&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_5&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3951a7c7aed155c206043838e4e800c94ac1c692&lt;/p&gt;</comment>
                            <comment id="119566" author="gerrit" created="Thu, 25 Jun 2015 02:47:42 +0000"  >&lt;p&gt;Bobi Jam (bobijam@hotmail.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15390&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15390&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4388&quot; title=&quot;fsync on client does not cause OST_SYNCs to be issued&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4388&quot;&gt;&lt;del&gt;LU-4388&lt;/del&gt;&lt;/a&gt; test: add sanity test case&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3112c90215374573c987a800c192300817c62200&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwbiv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12048</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>