<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:16:25 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8310] Checksum/erasure code of the EAs for better recovery of Lustre</title>
                <link>https://jira.whamcloud.com/browse/LU-8310</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;To store private information about the files/objects belonging to a&lt;br/&gt;
Lustre file, Lustre saves a series of extended attributes (EAs) on inodes of the&lt;br/&gt;
lower file system. All of the EAs are important for the correct behavior of&lt;br/&gt;
Lustre functions and features, and some of them are so critical that if they&lt;br/&gt;
are lost or corrupted, all the data/metadata of the Lustre file is no longer&lt;br/&gt;
available. For example, if the &quot;trusted.lov&quot; EA has an incorrect value, the&lt;br/&gt;
data of the Lustre file might point to a non-existent object or, even worse, to&lt;br/&gt;
another file&apos;s data.&lt;/p&gt;

&lt;p&gt;Unfortunately, this situation can happen when a Lustre server or its storage&lt;br/&gt;
crashes. What makes the situation worse is that it is sometimes hard to&lt;br/&gt;
determine which component is the root cause of the inconsistency when&lt;br/&gt;
recovering the system. For example, a &quot;trusted.lov&quot; EA pointing to a non-existent&lt;br/&gt;
object could mean 1) the value of the EA is corrupted, or 2) the object on the OST&lt;br/&gt;
has been removed although it should not have been. When this happens, the LFSCK&lt;br/&gt;
mechanism, which is supposed to fix inconsistencies of the Lustre file system online,&lt;br/&gt;
might need to fix the problem based on wrong EA values. Such an attempt&lt;br/&gt;
obviously won&apos;t help.&lt;/p&gt;

&lt;p&gt;For these reasons, I am wondering whether a checksum/erasure code of the&lt;br/&gt;
Lustre EAs could be introduced to improve the situation. The idea is as follows:&lt;/p&gt;

&lt;p&gt;1) A checksum/erasure code of the Lustre EAs (e.g. trusted.lov + trusted.lma&lt;br/&gt;
+ ...) will be calculated and saved as a new EA (e.g. &quot;trusted.mdt_checksum&quot;&lt;br/&gt;
and &quot;trusted.ost_checksum&quot;) when the Lustre file is created. Since most (or&lt;br/&gt;
all) of the Lustre EAs are not updated by normal file system operations on&lt;br/&gt;
the file, the EAs are almost immutable, which means almost no performance&lt;br/&gt;
regression will be introduced (except maybe for file creation).&lt;/p&gt;

&lt;p&gt;2) When the OST/MDT objects of a Lustre file are accessed/repaired, the&lt;br/&gt;
checksum/erasure code could be used to check (and fix, if using an erasure code)&lt;br/&gt;
the EAs.&lt;/p&gt;

&lt;p&gt;3) When the Lustre EAs are updated, the checksum/erasure code will be updated.&lt;br/&gt;
As noted above, this won&apos;t happen frequently. If some Lustre EAs change&lt;br/&gt;
too frequently (e.g. trusted.hsm when HSM is under heavy use), we could&lt;br/&gt;
exclude those EAs from the checksum. Filter flags could therefore be specified so&lt;br/&gt;
that only a subset of the Lustre EAs is covered.&lt;/p&gt;

&lt;p&gt;4) The checksum/erasure code of the MDT EA (i.e. &quot;trusted.mdt_checksum&quot;) will&lt;br/&gt;
also be saved on the OST objects that belong to the same Lustre file. In this way,&lt;br/&gt;
LFSCK could use the checksum to check the consistency of the file between the OSTs&lt;br/&gt;
and the MDT. If the checksum/erasure code of the MDT EA is inconsistent between the MDT and&lt;br/&gt;
the OSTs, LFSCK needs to either smartly determine which one is broken or just&lt;br/&gt;
leave it alone for a manual decision. Ideally, the file should then become&lt;br/&gt;
read-only to prevent any further corruption.&lt;/p&gt;

&lt;p&gt;5) A series of utilities should be provided for better recovery of&lt;br/&gt;
Lustre files, including the checksum/erasure code of EAs. Given that&lt;br/&gt;
Lustre is so complex and is still evolving rapidly, it is ideal but not&lt;br/&gt;
currently true that LFSCK is able to fix all of the problems online without&lt;br/&gt;
any manual intervention. It is not rare that a Lustre file&lt;br/&gt;
system needs to be recovered offline, directly on the lower file system (i.e.&lt;br/&gt;
ldiskfs/zfs). The checksum/erasure code of EAs would make it harder to fix&lt;br/&gt;
a broken file offline, since any changed EA values need to be kept&lt;br/&gt;
consistent with the checksum/erasure code. A lot of tools and scripts should&lt;br/&gt;
be provided for this purpose even if LFSCK is doing well because, as has&lt;br/&gt;
been proven, userspace tools are much more flexible than online mechanisms when&lt;br/&gt;
recovering data. Also, for online recovery, LFSCK should provide interfaces&lt;br/&gt;
that let administrators make manual decisions on the recovery of the file&lt;br/&gt;
system.&lt;/p&gt;
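
&lt;p&gt;As a rough illustration of steps 1) to 4), the following Python sketch models&lt;br/&gt;
the EAs of an object as a dictionary of name/value pairs. It is only a sketch under&lt;br/&gt;
assumptions and not Lustre code: SHA-256, the length-prefixed encoding, the default&lt;br/&gt;
exclusion list and all function names are hypothetical choices made for the example,&lt;br/&gt;
and it covers the checksum case only, not the erasure code.&lt;/p&gt;

```python
import hashlib
import struct

# EA name suggested in the proposal; everything else here is hypothetical.
CHECKSUM_EA = "trusted.mdt_checksum"
# Assumed default filter (step 3): skip frequently changing EAs and the
# checksum EA itself.
EXCLUDED_EAS = frozenset({"trusted.hsm", CHECKSUM_EA})


def ea_checksum(eas, excluded=EXCLUDED_EAS):
    """Return a SHA-256 digest over the covered EAs of one object."""
    digest = hashlib.sha256()
    for name in sorted(eas):  # fixed order so the digest is reproducible
        if name in excluded:
            continue
        value = eas[name]
        encoded_name = name.encode("utf-8")
        # Length-prefix name and value so field boundaries are unambiguous.
        digest.update(struct.pack("!I", len(encoded_name)))
        digest.update(encoded_name)
        digest.update(struct.pack("!I", len(value)))
        digest.update(value)
    return digest.digest()


def store_checksum(eas):
    """Steps 1 and 3: (re)compute the checksum and save it as its own EA."""
    eas[CHECKSUM_EA] = ea_checksum(eas)


def verify_checksum(eas):
    """Step 2: True if the stored checksum matches the current EA values."""
    return eas.get(CHECKSUM_EA) == ea_checksum(eas)


def check_file(mdt_eas, ost_copies):
    """Step 4: compare the MDT checksum with the copies kept on OST objects."""
    if not verify_checksum(mdt_eas):
        return "mdt-ea-corrupted"
    stored = mdt_eas[CHECKSUM_EA]
    if any(copy != stored for copy in ost_copies):
        return "mdt-ost-mismatch"
    return "consistent"
```

&lt;p&gt;With this model, updating an excluded EA such as trusted.hsm leaves the stored&lt;br/&gt;
checksum valid, while any change to trusted.lov is detected by verify_checksum().&lt;br/&gt;
A mismatch reported by check_file() only says that the MDT and the OSTs disagree;&lt;br/&gt;
deciding which side is broken is still left to LFSCK or to the administrator.&lt;/p&gt;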

&lt;p&gt;We could use a similar mechanism from the lower file system, for example the&lt;br/&gt;
metadata checksum of ext4. However, a Lustre-level checksum of EAs still has&lt;br/&gt;
some advantages. First of all, the selected Lustre EAs are almost constant,&lt;br/&gt;
which means the performance regression is likely to be minimal. In addition, this&lt;br/&gt;
implementation won&apos;t depend on any internal feature of the lower file system,&lt;br/&gt;
and thus it can be used on both ZFS and ldiskfs.&lt;/p&gt;</description>
                <environment></environment>
        <key id="37692">LU-8310</key>
            <summary>Checksum/erasure code of the EAs for better recovery of Lustre</summary>
                <type id="2" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11311&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="pjones">Peter Jones</assignee>
                                    <reporter username="lixi">Li Xi</reporter>
                        <labels>
                    </labels>
                <created>Tue, 21 Jun 2016 06:53:08 +0000</created>
                <updated>Fri, 26 Aug 2016 18:30:26 +0000</updated>
                            <resolved>Fri, 26 Aug 2016 18:30:26 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="163281" author="pjones" created="Fri, 26 Aug 2016 17:12:15 +0000"  >&lt;p&gt;This seems to be a duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8309&quot; title=&quot; Checksum/erasure code of EAs to improve recovery of Lustre files&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8309&quot;&gt;&lt;del&gt;LU-8309&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzyf5b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>