<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:44:52 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4675] LFSCK should mark files with holes so clients do not access them</title>
                <link>https://jira.whamcloud.com/browse/LU-4675</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;If an orphan object with stripe_index != 0 is linked to a recreated MDS inode in &lt;a href=&quot;http://review.whamcloud.com/7810&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7810&lt;/a&gt;, but not all of the objects are present (e.g. some of the stripes of that file were lost, but a non-zero stripe_index orphan remained) the client will crash if the file is accessed (e.g. &quot;ls -l&quot;):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 19393:0:(ldlm_resource.c:1077:ldlm_resource_get()) ASSERTION( name-&amp;gt;name[0] != 0 ) failed: 
LustreError: 19393:0:(ldlm_resource.c:1077:ldlm_resource_get()) LBUG
Pid: 19393, comm: ls

Call Trace:
 [&amp;lt;ffffffffa0ef9895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa0ef9e97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
 [&amp;lt;ffffffffa07c4f20&amp;gt;] ldlm_resource_get+0x700/0x900 [ptlrpc]
 [&amp;lt;ffffffffa07bf1b9&amp;gt;] ldlm_lock_create+0x59/0xcc0 [ptlrpc]
 [&amp;lt;ffffffffa07d8314&amp;gt;] ldlm_cli_enqueue+0xa4/0x790 [ptlrpc]
 [&amp;lt;ffffffffa09ebd44&amp;gt;] osc_enqueue_base+0x1e4/0x5b0 [osc]
 [&amp;lt;ffffffffa0a082fd&amp;gt;] osc_lock_enqueue+0x1ed/0x8c0 [osc]
 [&amp;lt;ffffffffa105be7c&amp;gt;] cl_enqueue_try+0xfc/0x300 [obdclass]
 [&amp;lt;ffffffffa0a5d42a&amp;gt;] lov_lock_enqueue+0x22a/0x850 [lov]
 [&amp;lt;ffffffffa105be7c&amp;gt;] cl_enqueue_try+0xfc/0x300 [obdclass]
 [&amp;lt;ffffffffa105d0cf&amp;gt;] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
 [&amp;lt;ffffffffa105dd1e&amp;gt;] cl_lock_request+0x7e/0x270 [obdclass]
 [&amp;lt;ffffffffa123dba0&amp;gt;] cl_glimpse_lock+0x180/0x490 [lustre]
 [&amp;lt;ffffffffa123e415&amp;gt;] cl_glimpse_size0+0x1a5/0x1d0 [lustre]
 [&amp;lt;ffffffffa11eb55d&amp;gt;] ll_inode_revalidate_it+0x1cd/0x660 [lustre]
 [&amp;lt;ffffffffa11eba3a&amp;gt;] ll_getattr_it+0x4a/0x1b0 [lustre]
 [&amp;lt;ffffffffa11ebbd7&amp;gt;] ll_getattr+0x37/0x40 [lustre]
 [&amp;lt;ffffffff81186db1&amp;gt;] vfs_getattr+0x51/0x80
 [&amp;lt;ffffffff81186e40&amp;gt;] vfs_fstatat+0x60/0x80
 [&amp;lt;ffffffff81186ece&amp;gt;] vfs_lstat+0x1e/0x20
 [&amp;lt;ffffffff81186ef4&amp;gt;] sys_newlstat+0x24/0x50
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I&apos;m not sure what the right way to handle this is, since this would affect all old clients trying to access files in &lt;tt&gt;.lustre/lost+found&lt;/tt&gt; so fixing just the 2.6 client is not enough.  Either we need to backport the fix to 2.5.2 and 2.4.3 and 2.1.7 clients (not very good, since we aren&apos;t sure if the client has the fix), or use some other lmm_magic or lmm_pattern to ensure that unpatched clients will not understand it.&lt;/p&gt;

&lt;p&gt;In the second case (using a different lmm_magic or lmm_pattern, maybe LOV_PATTERN_F_SPARSE?) the lfsck_layout_extend_lovea() code would need to decide as stripes are added if the layout is sparse (set the flag, old clients cannot access) or if it is full (clear the flag, old clients can access).&lt;/p&gt;</description>
                <environment></environment>
        <key id="23329">LU-4675</key>
            <summary>LFSCK should mark files with holes so clients do not access them</summary>
                <type id="7" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/task_agile.png">Technical task</type>
                            <parent id="23439">LU-4701</parent>
                                    <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yong.fan">nasf</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Wed, 26 Feb 2014 23:45:08 +0000</created>
                <updated>Thu, 29 May 2014 03:41:59 +0000</updated>
                            <resolved>Thu, 29 May 2014 03:41:59 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="77972" author="adilger" created="Wed, 26 Feb 2014 23:51:14 +0000"  >&lt;p&gt;To be clear, I don&apos;t think that the LASSERT() should be removed, since object {0,0} should never be accessed.  Rather, the LOV code should just skip such objects entirely, and return -EIO in such a case.  Please include a test case that creates such a file, and runs a number of different operations on it (stat, read, write, touch, chown, unlink) to make sure the different paths are covered.&lt;/p&gt;</comment>
                            <comment id="82120" author="yong.fan" created="Tue, 22 Apr 2014 04:26:58 +0000"  >&lt;p&gt;Here is the patch:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/10042&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10042&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="82121" author="yong.fan" created="Tue, 22 Apr 2014 04:28:37 +0000"  >&lt;p&gt;When the layout LFSCK repairs an orphan OST-object whose parent&lt;br/&gt;
MDT-object was lost, it re-creates the MDT-object, regenerates&lt;br/&gt;
the LOV EA, fills the target LOV EA slot with the orphan&lt;br/&gt;
information, and fills the other slots with zero (LOV holes).&lt;br/&gt;
If the related LOV EA slot is invalid or a hole, it refills&lt;br/&gt;
the target LOV EA slot; if the target slot is beyond the current LOV&lt;br/&gt;
EA tail, it extends the LOV EA and fills the gaps with zero.&lt;/p&gt;

&lt;p&gt;Some of the LOV EA holes may never be refilled because the&lt;br/&gt;
corresponding OST-objects were lost. Even for holes that can be&lt;br/&gt;
refilled, a client may race and access the file before the&lt;br/&gt;
refill happens. If a client accesses a LOV EA with hole(s), it&lt;br/&gt;
may behave unexpectedly, for example by triggering LBUG()/LASSERT()&lt;br/&gt;
on the client.&lt;/p&gt;

&lt;p&gt;So we make the client aware that the LOV EA is incomplete.&lt;br/&gt;
We introduce a new LOV EA pattern flag, LOV_PATTERN_F_HOLE, for that:&lt;br/&gt;
whenever the LFSCK repairs a LOV EA and leaves hole(s) in it, the LOV EA&lt;br/&gt;
is marked with LOV_PATTERN_F_HOLE; when all the holes in the LOV&lt;br/&gt;
EA have been refilled, LOV_PATTERN_F_HOLE is dropped.&lt;/p&gt;
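
&lt;p&gt;The slot-filling and flag handling described above can be sketched roughly as follows. This is an illustrative Python model, not the LFSCK code: the names LovPattern, HOLE and fill_slot are hypothetical, the bit values are assumptions chosen for the example, and handling of a slot that already holds a valid object is left out.&lt;/p&gt;

```python
from enum import IntFlag


class LovPattern(IntFlag):
    # Bit values are assumptions for this sketch, not copied from Lustre headers.
    RAID0 = 0x001
    F_HOLE = 0x40000000


HOLE = None  # stands in for a zero-filled LOV EA slot


def fill_slot(pattern, slots, index, ost_object):
    """Put an orphan OST-object into slot `index` and update the hole flag.

    Returns the new (pattern, slots) pair; the inputs are not modified.
    """
    slots = list(slots)
    if index >= len(slots):
        # Target slot is past the current tail: extend and fill the gap with holes.
        slots.extend([HOLE] * (index + 1 - len(slots)))
    # Conflicts with an already-valid slot are not handled in this sketch.
    slots[index] = ost_object
    if HOLE in slots:
        # At least one stripe is still missing: mark the layout.
        pattern = pattern | LovPattern.F_HOLE
    elif LovPattern.F_HOLE in pattern:
        # Every hole has been refilled: drop the flag.
        pattern = pattern ^ LovPattern.F_HOLE
    return pattern, slots
```

&lt;p&gt;In this model, filling slot 2 of an empty layout leaves slots 0 and 1 as holes and sets the flag; the flag is cleared only once both remaining slots have been filled.&lt;/p&gt;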

&lt;p&gt;A new client recognizes the pattern flag LOV_PATTERN_F_HOLE,&lt;br/&gt;
so it can permit or forbid operations on a file with LOV holes:&lt;/p&gt;

&lt;p&gt;1) getattr/getxattr operations are permitted, such as stat/ls -l, and&lt;br/&gt;
   so on. The file size is the sum over the known stripes, which gives&lt;br/&gt;
   the administrator a chance to see how much data has been recovered.&lt;/p&gt;

&lt;p&gt;2) A normal read of a file with LOV EA holes is not permitted, so that&lt;br/&gt;
   the holes are not silently hidden. Instead, the administrator&lt;br/&gt;
   can dump the data via the new &quot;lfs dump&quot; tool.&lt;/p&gt;

&lt;p&gt;3) If the modification only changes MDS-side metadata, such as chmod,&lt;br/&gt;
   then it is permitted.&lt;/p&gt;

&lt;p&gt;4) unlink/rm of a file that has LOV EA holes is permitted.&lt;/p&gt;

&lt;p&gt;5) Other modifications that change something on the OST side,&lt;br/&gt;
   such as write/touch/chown, are denied.&lt;/p&gt;
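
&lt;p&gt;The five rules above can be summarized as a small permission check. This is only a sketch: the operation names, the two sets and the function check_op are hypothetical labels for this example, and returning -EIO for denied operations is an assumption based on the earlier suggestion in this ticket rather than something the patch description states.&lt;/p&gt;

```python
import errno

# Operations that stay on the MDS side or remove the file: allowed with holes.
ALLOWED_WITH_HOLE = frozenset({"getattr", "getxattr", "chmod", "unlink"})
# Normal reads and operations that change OST-side state: denied with holes.
DENIED_WITH_HOLE = frozenset({"read", "write", "touch", "chown"})


def check_op(op, has_hole):
    """Return 0 if `op` may proceed, or a negative errno if it is denied."""
    if op not in ALLOWED_WITH_HOLE | DENIED_WITH_HOLE:
        raise ValueError(f"unknown operation: {op}")
    if has_hole and op in DENIED_WITH_HOLE:
        # Assumed error code; the comment above only says the operation is denied.
        return -errno.EIO
    return 0
```

&lt;p&gt;For a file without the hole flag every listed operation returns 0; with the flag set, only the operations in the first set do.&lt;/p&gt;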

&lt;p&gt;An old client does not recognize the new pattern flag&lt;br/&gt;
LOV_PATTERN_F_HOLE, so it discards a LOV EA with holes and the&lt;br/&gt;
access fails, but this does not crash the client.&lt;/p&gt;</comment>
                            <comment id="82144" author="jamesanunez" created="Tue, 22 Apr 2014 14:38:58 +0000"  >&lt;p&gt;I&apos;ve encountered this client crash while trying to track down what files were causing LFSCK to fail during phase 1 of the scan.&lt;/p&gt;</comment>
                            <comment id="82448" author="adilger" created="Fri, 25 Apr 2014 06:47:07 +0000"  >&lt;p&gt;To be clear, while &lt;a href=&quot;http://review.whamcloud.com/10042&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10042&lt;/a&gt; is fixing the problem of LFSCK creating layouts with holes in them, there is still the separate bug (&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4958&quot; title=&quot;do not crash accessing LOV object with FID {0, 0}&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4958&quot;&gt;&lt;del&gt;LU-4958&lt;/del&gt;&lt;/a&gt;) for clients crashing because of bad layout data from the network, regardless of whether that was caused by LFSCK or not. It still makes sense for the client to validate the layout (magic, pattern, objects) and return an error if they are bad.&lt;/p&gt;</comment>
                            <comment id="85099" author="yong.fan" created="Thu, 29 May 2014 03:41:59 +0000"  >&lt;p&gt;The master patch has landed. The patches for old clients will be done under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4958&quot; title=&quot;do not crash accessing LOV object with FID {0, 0}&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4958&quot;&gt;&lt;del&gt;LU-4958&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="24390">LU-4958</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwg1j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12832</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>