<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:26:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2580] cp with FIEMAP support creates completely sparse file</title>
                <link>https://jira.whamcloud.com/browse/LU-2580</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We are seeing an issue at KIT where cp will occasionally use the FIEMAP extension to create a completely sparse file instead of actually copying the file. It seems to occur under a workload that creates and deletes many files at once. It involves only a single client, though; it is not a parallel workload.&lt;/p&gt;

&lt;p&gt;Relevant strace from &apos;bad&apos; cp:&lt;br/&gt;
ioctl(3, 0xc020660b, 0x7fff392c0950)    = 0&lt;br/&gt;
ftruncate(4, 12853)                     = 0&lt;/p&gt;

&lt;p&gt;strace from &apos;good&apos; cp:&lt;br/&gt;
read(3, &quot;#!/bin/bash -u\n\n#localisation\nex&quot;..., 2097152) = 12853&lt;br/&gt;
write(4, &quot;#!/bin/bash -u\n\n#localisation\nex&quot;..., 12853) = 12853&lt;br/&gt;
read(3, &quot;&quot;, 2097152)                    = 0&lt;/p&gt;

&lt;p&gt;The strace didn&apos;t print the stat block information, but I&apos;m assuming st_blocks == 0 in the bad one. I will ask the customer to get a full strace -v to confirm, but it appears to be something similar to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-417&quot; title=&quot;block usage is reported as zero by stat call for tens of seconds after creating a file&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-417&quot;&gt;&lt;del&gt;LU-417&lt;/del&gt;&lt;/a&gt;?&lt;/p&gt;</description>
                <environment>SLES 11 SP2 (client), Lustre 2.1.2 RHEL6 (server)</environment>
        <key id="17095">LU-2580</key>
            <summary>cp with FIEMAP support creates completely sparse file</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="kitwestneat">Kit Westneat</reporter>
                        <labels>
                            <label>LB</label>
                    </labels>
                <created>Mon, 7 Jan 2013 10:24:38 +0000</created>
                <updated>Wed, 24 Apr 2013 17:55:19 +0000</updated>
                            <resolved>Tue, 5 Mar 2013 11:29:27 +0000</resolved>
                                    <version>Lustre 2.3.0</version>
                    <version>Lustre 2.4.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="50058" author="kitwestneat" created="Mon, 7 Jan 2013 10:25:41 +0000"  >&lt;p&gt;There is a 41MB (800MB uncompressed) client debug log from the event. Would it be useful to upload it somewhere?&lt;/p&gt;</comment>
                            <comment id="50061" author="pjones" created="Mon, 7 Jan 2013 11:01:05 +0000"  >&lt;p&gt;Thanks for the ticket Kit.&lt;/p&gt;</comment>
                            <comment id="50196" author="adilger" created="Wed, 9 Jan 2013 05:28:00 +0000"  >&lt;p&gt;Kit, is this only happening when trying to copy a 1-stripe file that was recently created?  Is the file created on the same node or a remote node?  What does &quot;stat&quot; report on the file before it is copied?&lt;/p&gt;</comment>
                            <comment id="50232" author="kitwestneat" created="Wed, 9 Jan 2013 16:29:38 +0000"  >&lt;p&gt;Here is the stat output; stracing the cp reports the same:&lt;br/&gt;
  File: `/work/ws/jl95/simulation/OSC/OSC_traps_temperatur2_kopie/calc_series_OSC_multi_neu&apos;&lt;br/&gt;
  Size: 12899           Blocks: 1          IO Block: 2097152 regular file&lt;br/&gt;
Device: 24ce0ea6h/617483942d    Inode: 144115253070874816  Links: 1&lt;br/&gt;
Access: (0755/-rwxr-xr-x)  Uid: (28181/    jl95)   Gid: (11700/    jl00)&lt;br/&gt;
Access: 2013-01-09 13:28:34.000000000 +0100&lt;br/&gt;
Modify: 2013-01-09 13:28:34.000000000 +0100&lt;br/&gt;
Change: 2013-01-09 13:28:34.000000000 +0100&lt;br/&gt;
 Birth: -&lt;/p&gt;

&lt;p&gt;Response from customer:&lt;/p&gt;

&lt;p&gt;&amp;gt; is this only happening when trying to copy a 1-stripe file that was recently created? &lt;br/&gt;
Yes, it seems. The file was recently created and has stripe count 1. (The file system  &lt;br/&gt;
has a default stripe count of 4, but since the user creates so many files I advised him to  &lt;br/&gt;
use stripe count 1.)  &lt;/p&gt;

&lt;p&gt;&amp;gt;  Is the file created on the same node or a remote node?  &lt;br/&gt;
The file is created on a local disk of the same node.  &lt;/p&gt;

&lt;p&gt;&amp;gt; What does &quot;stat&quot; report on the file before it is copied?  &lt;br/&gt;
I just asked the user to run his reproducer again and create that data. (It is possible that  &lt;br/&gt;
running stat keeps the problem from appearing. With a &quot;sleep 5&quot; the problem does not appear most  &lt;br/&gt;
of the time. After the copy we check the source and the copied file with md5sum, and when the  &lt;br/&gt;
problem appears the md5sums differ.) &lt;/p&gt;
</comment>
                            <comment id="50684" author="kitwestneat" created="Thu, 17 Jan 2013 12:16:31 +0000"  >&lt;p&gt;Attached are the stats and md5sums of both good and bad cps. In the bad cp, the file reports only 1 used block. In the good one, the file reports 32 blocks. The straces confirm that this is what cp sees.&lt;/p&gt;</comment>
                            <comment id="50743" author="adilger" created="Thu, 17 Jan 2013 19:38:59 +0000"  >&lt;p&gt;What version of fileutils is in use here?  Was it part of the distro, or upgraded afterward?&lt;/p&gt;</comment>
                            <comment id="50970" author="kitwestneat" created="Tue, 22 Jan 2013 08:52:19 +0000"  >&lt;p&gt;From KIT: They are using normal fileutils (which are part of coreutils) of the SLES11 SP2 distribution. coreutils version is 8.12 and release is 6.23.1. (Source RPM is coreutils-8.12-6.23.1.src.rpm)&lt;/p&gt;</comment>
                            <comment id="52766" author="kalpak" created="Wed, 20 Feb 2013 16:03:28 +0000"  >&lt;p&gt;I don&apos;t think this issue is related to FIEMAP. stat reported st_blocks=1 for the file and a size of 12899 bytes. So cp correctly called the FIEMAP ioctl. &lt;/p&gt;

&lt;p&gt;The problem seems to be Lustre reporting the wrong number of blocks on a recently created/written file. This fix leads stat to report st_blocks=1 instead of 0 - &lt;a href=&quot;http://git.whamcloud.com/?p=fs/lustre-release.git;a=commitdiff;h=829845ac9ddbdfd170de215742c033ea1102db3e;hp=fc4b46df111bbf9d2207265d18b3f0d72f49502c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://git.whamcloud.com/?p=fs/lustre-release.git;a=commitdiff;h=829845ac9ddbdfd170de215742c033ea1102db3e;hp=fc4b46df111bbf9d2207265d18b3f0d72f49502c&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52788" author="kalpak" created="Thu, 21 Feb 2013 02:53:36 +0000"  >&lt;p&gt;Regarding the ftruncate that we see in the strace (instead of the fseek that I was expecting): even though Lustre reports st_blocks=1, the FIEMAP ioctl reports that no blocks are allocated, which leads to the ftruncate call with the size of the file.&lt;/p&gt;

&lt;p&gt;On SLES11 SP2 with coreutils-8.12-6.19.1, it looks like cp always sets the FIEMAP_FLAG_SYNC flag as well.&lt;/p&gt;</comment>
                            <comment id="52822" author="adilger" created="Thu, 21 Feb 2013 14:10:07 +0000"  >&lt;p&gt;Kalpak, AFAIK the st_blocks value is only used to determine whether the file is sparse (st_blocks &amp;lt; st_size / 512) or dense (st_blocks &amp;gt;= st_size / 512).  Dense files are copied via &quot;while (read() &amp;gt; 0) write()&quot;, and for sparse files newer &quot;cp&quot; copies only the list of extents returned by FIEMAP.  In both cases, my understanding is that st_blocks is not used for determining how much data is copied.&lt;/p&gt;

&lt;p&gt;The problem, as I see it, is that Lustre FIEMAP (which only returns something useful to &quot;cp&quot; for single-striped files) does not return FIEMAP_EXTENT_DELALLOC extents for pages that are only in the client cache and not on the OST yet.  &quot;cp&quot; should be using FIEMAP_FLAG_SYNC and causing all of the cached extents to be flushed, but somehow this isn&apos;t happening.&lt;/p&gt;</comment>
                            <comment id="53134" author="pjones" created="Wed, 27 Feb 2013 16:11:36 +0000"  >&lt;p&gt;Could you please clarify which versions of Lustre (and any patches) are being used here? You mention Lustre 2.1.2 servers, but what version of Lustre is being used on the client?&lt;/p&gt;</comment>
                            <comment id="53279" author="kalpak" created="Mon, 4 Mar 2013 13:39:27 +0000"  >&lt;p&gt;Peter, the clients are running Lustre 2.3 on SLES11 SP2.&lt;/p&gt;</comment>
                            <comment id="53283" author="pjones" created="Mon, 4 Mar 2013 14:14:59 +0000"  >&lt;p&gt;Thanks Kalpak. With any patches applied?&lt;/p&gt;</comment>
                            <comment id="53321" author="kalpak" created="Tue, 5 Mar 2013 04:45:41 +0000"  >&lt;p&gt;Update from KIT: With Lustre 2.3.0 on the client and patches 4477 and 4659 from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2367&quot; title=&quot;Hang in osc_extent_wait under fsync&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2367&quot;&gt;&lt;del&gt;LU-2367&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2286&quot; title=&quot;Test failure on test suite parallel-scale, subtest test_write_append_truncate&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2286&quot;&gt;&lt;del&gt;LU-2286&lt;/del&gt;&lt;/a&gt;, the issue cannot be reproduced.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2367&quot; title=&quot;Hang in osc_extent_wait under fsync&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2367&quot;&gt;&lt;del&gt;LU-2367&lt;/del&gt;&lt;/a&gt; fixes a race in unplugging the IO queue which can affect flush and fsync - and &quot;cp&quot; always calls FIEMAP with the SYNC flag set causing the cached extents to be flushed. &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2286&quot; title=&quot;Test failure on test suite parallel-scale, subtest test_write_append_truncate&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2286&quot;&gt;&lt;del&gt;LU-2286&lt;/del&gt;&lt;/a&gt; fixes a bug where an extent does not get flushed to disk until the next write to the file occurs. So it does seem logical that the issue is not being reproduced with these 2 patches applied.&lt;/p&gt;</comment>
                            <comment id="53334" author="pjones" created="Tue, 5 Mar 2013 11:29:27 +0000"  >&lt;p&gt;OK, thanks for the update. That explains why we have been unable to reproduce this issue on the latest 2.4 code. I will close out this ticket.&lt;/p&gt;</comment>
                            <comment id="54984" author="spitzcor" created="Thu, 28 Mar 2013 04:07:49 +0000"  >&lt;p&gt;This bug is still applicable to 2.1 when using cp built from coreutils 8.12, right?  [I can&apos;t confirm that, but I think we&apos;re seeing this on 2.2.]  If so, and since b2_1 is still the current maintenance branch, do we want to land a fix there?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="11172">LU-417</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="18519">LU-3219</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="12169" name="stats_and_md5sums.txt" size="3025" author="kitwestneat" created="Thu, 17 Jan 2013 12:16:31 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzveon:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6020</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>