<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:49:04 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12031] DoM/HSM: hsm_release fails after hsm_restore</title>
                <link>https://jira.whamcloud.com/browse/LU-12031</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;&#160;There is an issue when releasing a file striped with DoM after an hsm_restore.&lt;/p&gt;

&lt;p&gt;To reproduce:&lt;/p&gt;

&lt;p&gt;1) create a file with a 1st component on MDT:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs setstripe -E 1M -L mdt -E -1 -S 4M -c -1 /mnt/lustre/domfile&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;2) archive and release the file (requires HSM to be set up)&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs hsm_archive /mnt/lustre/domfile
# (wait for archive to complete)
lfs hsm_release /mnt/lustre/domfile&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;3) restore the file&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs hsm_restore /mnt/lustre/domfile
# or cat /mnt/lustre/domfile&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;4) release the file =&amp;gt; FAILS&#160;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lfs hsm_release /mnt/lustre/domfile

Cannot send HSM request (use of /mnt/lustre/domfile): Device or resource busy&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Something may be wrong with the data version stored in the HSM EA.&lt;/p&gt;</description>
                <environment></environment>
        <key id="55024">LU-12031</key>
            <summary>DoM/HSM: hsm_release fails after hsm_restore</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="cealustre">CEA</reporter>
                        <labels>
                            <label>CEA</label>
                            <label>DoM</label>
                            <label>HSM</label>
                    </labels>
                <created>Thu, 28 Feb 2019 13:03:30 +0000</created>
                <updated>Wed, 7 Feb 2024 23:28:55 +0000</updated>
                            <resolved>Wed, 19 Jul 2023 13:10:59 +0000</resolved>
                                    <version>Lustre 2.12.0</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>18</watches>
                                                                            <comments>
                            <comment id="243041" author="pjones" created="Thu, 28 Feb 2019 15:56:40 +0000"  >&lt;p&gt;Mike&lt;/p&gt;

&lt;p&gt;Can you please advise?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="243224" author="tappro" created="Sat, 2 Mar 2019 16:32:45 +0000"  >&lt;p&gt;Yes, there is a problem with a data version mismatch. I will investigate.&lt;/p&gt;</comment>
                            <comment id="269260" author="tappro" created="Mon, 4 May 2020 20:04:20 +0000"  >&lt;p&gt;This issue is bigger than it seems at first sight. The initial problem was a DoM data version mismatch after restore. When a file with a DoM stripe is restored, the HSM XATTR stores the data version, but the same &lt;tt&gt;setxattr&lt;/tt&gt; operation changes the inode version, which is also used as the data version for the MDT stripe. Because of that, it is impossible to store the current data version for a DoM file.&lt;br/&gt;
 The proposed solution is:&lt;br/&gt;
 1. If there is a next component after the DoM one, then don&apos;t restore the DoM stripe but delete it. The data will go to the next component. This makes sense because if the next component is already in use, the DoM stripe has lost most of its benefits and there is little sense in keeping it.&lt;br/&gt;
 2. If the DoM stripe is the only stripe in use, then it is worth keeping. In that case such a file can be considered as always non-released. In general that is OK because such files do not consume much space. The question is: would that be handled nicely by HSM software? I suppose that should be OK, at least because we can set the &lt;tt&gt;norelease&lt;/tt&gt; flag on selected files, which is technically the same.&lt;/p&gt;

&lt;p&gt;Meanwhile, while working on this issue I found another one related to VOLATILE file handling. Such files are usually used as temporary files to copy data into and then swap layouts with the original file, e.g. during HSM release and restore. The problem is that such files can be created with a DoM layout, which cannot be swapped because we swap layouts but not inode data. Such VOLATILE files can get their layout from the striping saved during the archive operation or from the default layout. In any case, it is unsafe to create a VOLATILE file with a DoM layout and that should be prohibited. &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13515&quot; title=&quot;prohibit DOM layout for VOLATILE file&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13515&quot;&gt;LU-13515&lt;/a&gt; was created for that.&lt;/p&gt;</comment>
                            <comment id="329729" author="beevans" created="Mon, 21 Mar 2022 14:37:53 +0000"  >&lt;p&gt;Would adding &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13384&quot; title=&quot;HSM copytool API for external coordinator&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13384&quot;&gt;LU-13384&lt;/a&gt; and performing Restore-&amp;gt;Archive (without data copy)-&amp;gt;Release fix most of this in userspace?&#160; We would regenerate the correct version during the fake archive step and be able to release cleanly.&lt;/p&gt;

&lt;p&gt;This probably won&apos;t address the DoM stripe issue; changing HSM to use file layouts would probably be the best way to handle that.&lt;/p&gt;</comment>
                            <comment id="330787" author="tappro" created="Fri, 1 Apr 2022 06:53:19 +0000"  >&lt;p&gt;Ben, yes, the version could be corrected by userspace tools, though I wouldn&apos;t call that a fix.&lt;/p&gt;

&lt;p&gt;While I proposed a possible solution above with a non-released DoM component and have a patch for it, I am not confident in it. All those layout manipulations just to avoid a wrong data version look &apos;hacky&apos;. The correct solution would be a separate &lt;tt&gt;data_version&lt;/tt&gt; maintained for DoM files in addition to &lt;tt&gt;inode_version&lt;/tt&gt;. I am figuring out how difficult that could be.&lt;/p&gt;</comment>
                            <comment id="331039" author="tappro" created="Tue, 5 Apr 2022 12:01:45 +0000"  >&lt;p&gt;Well, it looks like having a separate &lt;tt&gt;data_version&lt;/tt&gt; is not the solution here. The main problem is the restore process, which uses a volatile file for data transfer. When the transfer is completed, the file data version is calculated, including the DoM stripe version, and is stored in the HSM extended attribute. The next step is a layout swap between the volatile file and the original one, which cannot keep the same DoM stripe because it is inode data. So after the layout swap, the data versions of the DoM stripe and of the file itself always differ from the one stored in the HSM xattr, which is just copied to the original file from the volatile one. I can&apos;t find a way to do this correctly without compromising the whole process. We can&apos;t just allow the DoM stripe data version to be different because that would hide possible real problems. From that point on, any further attempt to avoid that looks no less &apos;hacky&apos; than the solution proposed initially.&lt;/p&gt;

&lt;p&gt;Another idea could be a different approach to calculating data_version for the DoM stripe, e.g. making it content-based rather than transaction-based, like a checksum of the DoM data, considering that the data is not big and that &lt;tt&gt;data_version&lt;/tt&gt; is used mostly by HSM, so the operation is too rare to affect performance. That also means we don&apos;t need to store it anywhere, which is another advantage, since a separate &apos;data_version&apos; can only be stored as a new XATTR.&lt;/p&gt;</comment>
                            <comment id="331087" author="adilger" created="Tue, 5 Apr 2022 16:26:09 +0000"  >&lt;p&gt;Mike, in case it is helpful to you, newer ext4 code has a &quot;swap data&quot; operation that is meant to allow swapping a &quot;volatile&quot; file into the boot loader inode. This could be used to swap data between two DoM files if needed. &lt;/p&gt;

&lt;p&gt;That said, your recent comments indicate that it isn&apos;t the DoM data swap that is the main obstacle, but the ordering problem of the data version. IMHO, a content-based hash is probably still too expensive if the data version is used regularly. That would make inode operations that need 1KB/inode into data operations that need (possibly) 1MB/inode, or at least 64KB/inode. There was some discussion recently on whether the data version should be used for NFS file modification tracking, so doing a DoM checksum on every file access would be punishing. Storing a separate xattr would be &lt;b&gt;much&lt;/b&gt; more efficient. &lt;/p&gt;

&lt;p&gt;Maybe I&apos;m missing something, but is it not possible to store the &quot;original&quot; object version in the swapped MDT inode?  This might mess with recovery, but if the volatile file is gone it would be pretty clear that the layout swap could not be replayed in any case.  We could also special-case the replay operation for layout swap to take this into consideration. &lt;/p&gt;</comment>
                            <comment id="331100" author="beevans" created="Tue, 5 Apr 2022 17:30:23 +0000"  >&lt;p&gt;I think it&apos;s much more sinister than that. In the non-DoM case, the data_version is calculated on each portion of the file (all on OSTs), then combined into a single data version and written to an XATTR on the MDT.&#160; For DoM, the act of writing the HSM data_version to an XATTR would cause the data_version on the MDT to change, unless we can predict what the &quot;next&quot; DoM data_version is, so that the HSM XATTR agrees with the calculated data_version after the XATTR is written.&#160; So for a restore-&amp;gt;release case it will &lt;b&gt;always&lt;/b&gt; be wrong.&lt;/p&gt;</comment>
                            <comment id="331105" author="tappro" created="Tue, 5 Apr 2022 18:20:20 +0000"  >&lt;p&gt;Ben, unlike &lt;tt&gt;inode_version&lt;/tt&gt;, the &lt;tt&gt;data_version&lt;/tt&gt; is not changed by an xattr set; that is why I was trying to introduce it. Like on an OST, it would change only on data changes: write, truncate and fallocate. So that solves the problem of metadata operations affecting &lt;tt&gt;data_version&lt;/tt&gt;, though it requires a separate xattr to store it.&lt;/p&gt;</comment>
                            <comment id="331108" author="tappro" created="Tue, 5 Apr 2022 18:40:43 +0000"  >&lt;p&gt;But what is actually bad about DoM release/restore is the fact that the DoM stripe is not actually archived and is not restored after all, though it &apos;looks&apos; so. On the archive operation it is read and stored in the archive, but unlike an OST object the data in the inode is not truncated and stays untouched. Upon restore, the DoM data is read from the archive and written to the volatile file inode. But on layout swap it is gone along with the volatile file, and the original data in the original inode just becomes visible again because the layout says it exists. So all that time the DoM data stays in the inode and its copy from the archive is just lost along with the volatile file. That means there is no sense in archiving what is always kept in the inode on disk. Therefore I tend to return to the first solution, where the DoM stripe is either non-released or just removed in favor of the first OST stripe if one exists.&lt;/p&gt;</comment>
                            <comment id="331112" author="beevans" created="Tue, 5 Apr 2022 19:03:47 +0000"  >&lt;p&gt;It would be nice if instead of all the playing around with temp files we could just restore to a stripe, and once completed mark it as primary.&#160; We should also be able to restore all the other layout information as well and mark them as secondary.&lt;/p&gt;</comment>
                            <comment id="331130" author="adilger" created="Tue, 5 Apr 2022 21:24:01 +0000"  >&lt;p&gt;Ben - see &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9961&quot; title=&quot;FLR2: Relocating objects to a new OST&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9961&quot;&gt;LU-9961&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="332942" author="gerrit" created="Tue, 26 Apr 2022 06:05:17 +0000"  >&lt;p&gt;&quot;Mike Pershin &amp;lt;mpershin@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47139&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47139&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12031&quot; title=&quot;DoM/HSM: hsm_release fails after hsm_restore&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12031&quot;&gt;&lt;del&gt;LU-12031&lt;/del&gt;&lt;/a&gt; mdt: explicit data version of DoM files&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5dbeabb034654f258162f05722477645e5f2fffe&lt;/p&gt;</comment>
                            <comment id="332948" author="adilger" created="Tue, 26 Apr 2022 07:21:53 +0000"  >&lt;p&gt;Rather than storing yet another xattr on the DoM inode in this case (which might have issues with backup/restore, etc.), what about just &lt;b&gt;not&lt;/b&gt; updating i_version on setxattr from HSM restore (or resetting it to the pre-update i_version)?&lt;/p&gt;</comment>
                            <comment id="336462" author="gerrit" created="Tue, 31 May 2022 18:05:43 +0000"  >&lt;p&gt;&quot;Sergey Cheremencev &amp;lt;sergey.cheremencev@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47497&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47497&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12031&quot; title=&quot;DoM/HSM: hsm_release fails after hsm_restore&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12031&quot;&gt;&lt;del&gt;LU-12031&lt;/del&gt;&lt;/a&gt; mdt: proof of concept&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b05ecdbcc682c1d2ef290110bd052bc3e0f2e61a&lt;/p&gt;</comment>
                            <comment id="376803" author="gerrit" created="Wed, 28 Jun 2023 21:47:14 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/47139/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/47139/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12031&quot; title=&quot;DoM/HSM: hsm_release fails after hsm_restore&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12031&quot;&gt;&lt;del&gt;LU-12031&lt;/del&gt;&lt;/a&gt; mdt: explicit data version of DoM files&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: aae3289adb2bbc192870f195b78044484f717e16&lt;/p&gt;</comment>
                            <comment id="379192" author="nangelinas" created="Tue, 18 Jul 2023 19:55:10 +0000"  >&lt;p&gt;Peter, is there any remaining work for this issue or can it be closed?&lt;/p&gt;</comment>
                            <comment id="379195" author="pjones" created="Tue, 18 Jul 2023 20:12:26 +0000"  >&lt;p&gt;There is still an unlanded patch tracked under this ticket - &lt;a href=&quot;https://review.whamcloud.com/#/c/fs/lustre-release/+/47497/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/fs/lustre-release/+/47497/&lt;/a&gt; - but I&apos;ll defer to &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=scherementsev&quot; class=&quot;user-hover&quot; rel=&quot;scherementsev&quot;&gt;scherementsev&lt;/a&gt; as to whether that is needed...&lt;/p&gt;</comment>
                            <comment id="379245" author="sergey" created="Wed, 19 Jul 2023 05:14:30 +0000"  >&lt;p&gt;I&apos;ve abandoned &lt;a href=&quot;https://review.whamcloud.com/#/c/fs/lustre-release/+/47497/.&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/fs/lustre-release/+/47497/&lt;/a&gt;. I agree to close this if there are no known issues after landing 47139.&lt;/p&gt;</comment>
                            <comment id="379251" author="tappro" created="Wed, 19 Jul 2023 06:50:26 +0000"  >&lt;p&gt;There is nothing left on my side either.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="59042">LU-13515</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00chz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>