<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:49:43 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5239] Recovery of small files with corrupt objects</title>
                <link>https://jira.whamcloud.com/browse/LU-5239</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We had a backend storage issue on 5/30 that corrupted a number of blocks on the filesystem across different OSTs. Since then we were able to recover all filesystem structures with e2fsck and identify what we thought were all the files. Just recently, we discovered a new scenario where inodes were corrupted, and so were cleared by e2fsck. We have identified 665 such files, and an ls or stat on them returns &quot;Cannot allocate memory&quot;. Syslog has the error:&lt;/p&gt;

&lt;p&gt;Jun 20 20:53:15 f1-oss1d5 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;853846.084587&amp;#93;&lt;/span&gt; LustreError: 14391:0:(ldlm_resource.c:1165:ldlm_resource_get()) f1-OST00bc: lvbo_init failed for resource 0xd4805:0x0: rc = -2&lt;/p&gt;

&lt;p&gt;This is expected because object 0xd4805 on f1-OST00bc is invalid (its inode on f1-OST00bc was cleared by e2fsck).&lt;br/&gt;
gaea9:/tmp # lfs getstripe file.F90&lt;br/&gt;
lmm_stripe_count:   4&lt;br/&gt;
lmm_stripe_size:    1048576&lt;br/&gt;
lmm_stripe_offset:  186&lt;br/&gt;
        obdidx           objid          objid            group&lt;br/&gt;
           186          871222        0xd4b36                0&lt;br/&gt;
           187          870647        0xd48f7                0&lt;br/&gt;
           188          870405        0xd4805                0&lt;br/&gt;
           189          869971        0xd4653                0&lt;/p&gt;

&lt;p&gt;We would like to attempt recovery of small files &amp;lt;3MB (stripe count 4), where the layout might position the missing object after EOF. We thought a dd if=bad.file of=good.file would return success if EOF was reached before the missing object. However, this method fails with &quot;Cannot allocate memory&quot; even for small files, where dd reports copying only some number of kB.&lt;/p&gt;

&lt;p&gt;What is causing dd to fail to read even files less than 1MB where the bad object is in the 3rd object?&lt;br/&gt;
gaea9:/tmp # dd if=file.F90 of=/tmp/good.out &lt;br/&gt;
dd: reading `file.F90&apos;: Cannot allocate memory&lt;br/&gt;
33+0 records in&lt;br/&gt;
33+0 records out&lt;br/&gt;
16896 bytes (17 kB) copied, 0.0531134 s, 318 kB/s&lt;/p&gt;

&lt;p&gt;When opening good.out, it is not complete.&lt;/p&gt;

&lt;p&gt;Is there an alternative method to successfully read to EOF for small files?&lt;/p&gt;

&lt;p&gt;This is not causing a downtime, but it is desirable to recover these files as quickly as reasonably possible.&lt;/p&gt;</description>
                <environment>RHEL6.4/distro IB kernel 2.6.32-358.18.1.el6</environment>
        <key id="25255">LU-5239</key>
            <summary>Recovery of small files with corrupt objects</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="blakecaldwell">Blake Caldwell</reporter>
                        <labels>
                    </labels>
                <created>Fri, 20 Jun 2014 20:59:32 +0000</created>
                <updated>Thu, 18 Sep 2014 19:28:24 +0000</updated>
                            <resolved>Thu, 18 Sep 2014 19:28:24 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="87222" author="pjones" created="Sat, 21 Jun 2014 02:27:45 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please advise with this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="87260" author="bobijam" created="Mon, 23 Jun 2014 03:19:08 +0000"  >&lt;p&gt;How can you be sure that the file reaches EOF before 3M? If you can be sure of that, would &quot;dd if=file.F90 of=/tmp/good.out bs=1M count=2&quot; work for it?&lt;/p&gt;</comment>
                            <comment id="87273" author="blakecaldwell" created="Mon, 23 Jun 2014 13:57:57 +0000"  >&lt;p&gt;That causes the same error &quot;Cannot allocate memory&quot;. By setting bs=1, the whole file (17333 bytes) can be recovered to EOF. With a block size of 4, I could reproduce the error without reading the whole file. Is there an optimization where the client tries to read the next object even if EOF is reached on the first object?&lt;/p&gt;

&lt;p&gt;gaea9:/tmp # dd if=file.F90 of=/tmp/good.out bs=1&lt;br/&gt;
17333+0 records in&lt;br/&gt;
17333+0 records out&lt;br/&gt;
17333 bytes (17 kB) copied, 0.0607574 s, 285 kB/s&lt;/p&gt;


&lt;p&gt;gaea9:/tmp # dd if=file.F90 of=/tmp/good.out bs=4 count=4333&lt;br/&gt;
4333+0 records in&lt;br/&gt;
4333+0 records out&lt;br/&gt;
17332 bytes (17 kB) copied, 0.0237968 s, 728 kB/s&lt;br/&gt;
gaea9:/tmp # dd if=file.F90 of=/tmp/good.out bs=4 count=4334&lt;br/&gt;
dd: reading `file.F90&apos;: Cannot allocate memory&lt;br/&gt;
4333+0 records in&lt;br/&gt;
4333+0 records out&lt;br/&gt;
17332 bytes (17 kB) copied, 0.0223031 s, 777 kB/s&lt;/p&gt;
</comment>
                            <comment id="87276" author="bobijam" created="Mon, 23 Jun 2014 14:34:48 +0000"  >&lt;p&gt;From what you described, file.F90 only has 17333 bytes available to be recovered. When the block size is set to 4, dd tries to read 4 bytes at a time, so it can only read successfully 4333 times, which covers 4333 * 4 = 17332 bytes; the last read reaches the missing object on OST00bc and fails. That also explains the dd command without the bs parameter, whose default value is 512: in that case it reads 512 * 33 = 16896 bytes, and fails to read another 512 bytes, which reaches the missing object on OST00bc as well.&lt;/p&gt;</comment>
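The block arithmetic in this comment can be checked with a short sketch. It is only a restatement of the numbers reported in the ticket (17333 recoverable bytes, block sizes 512, 4 and 1), not a model of how Lustre or dd behaves internally. The helper name full_blocks is hypothetical and introduced here for illustration.

```python
# Recoverable size reported in the ticket when reading with bs=1.
FILE_SIZE = 17333


def full_blocks(file_size, block_size):
    """Return (complete blocks, bytes they cover, bytes left in the partial tail)."""
    count, remainder = divmod(file_size, block_size)
    return count, count * block_size, remainder


# Block sizes tried in the ticket: dd default (512), bs=4 and bs=1.
for bs in (512, 4, 1):
    count, covered, remainder = full_blocks(FILE_SIZE, bs)
    print(f"bs={bs}: {count} full reads, {covered} bytes, {remainder} bytes in partial tail")
```

The counts line up with the dd outputs quoted in the ticket: 33 records (16896 bytes) at the default block size, 4333 records (17332 bytes) at bs=4, and 17333 records at bs=1. Only bs=1 leaves no partial tail, which may be why it was the one setting that read to EOF without an error.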
                            <comment id="88124" author="blakecaldwell" created="Thu, 3 Jul 2014 18:03:28 +0000"  >&lt;p&gt;While we were able to complete recovery of the files with bs=1, we weren&apos;t completely clear why reading 17336 bytes (4*4334) would return an error when 17332 bytes is fine. Lustre would have to know that the first object is 17332 bytes and that it needs to read 4 more bytes from the 2nd object.&lt;/p&gt;

&lt;p&gt;Why would it prefetch the 2nd object in the 17336 bytes case and not the 17332 bytes case?&lt;/p&gt;</comment>
                            <comment id="88167" author="bobijam" created="Fri, 4 Jul 2014 01:42:16 +0000"  >&lt;p&gt;I suspect it could involve the dd implementation. I don&apos;t know the details, but I guess dd does not try to work out whether EOF falls within the last 4-byte read request; it just asks for 4 bytes, and Lustre reaches the unavailable region and returns ENOENT for the request.&lt;/p&gt;</comment>
                            <comment id="94430" author="blakecaldwell" created="Thu, 18 Sep 2014 18:18:05 +0000"  >&lt;p&gt;This can be resolved. There is no practical reason to investigate this further; the cause could well be in the dd implementation, and using conv=sync could have helped with the investigation. We were able to recover about half of the files using this technique (with bs=1) because the cleared block was after EOF.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwpmn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14611</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>