<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:49:09 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12040] File lost during recovery</title>
                <link>https://jira.whamcloud.com/browse/LU-12040</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;In 2011, Johann introduced a wire protocol change.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;commit f90abfdc961debae069804307dcbc883b50c137c
Author: Johann Lombardi &amp;lt;johann@whamcloud.com&amp;gt;
Date:   Thu Dec 15 01:00:00 2011 +0100

   LU-169 lov: add generation number to LOV EA
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This commit replaced the unused field &apos;stripe_offset&apos; in the server reply with the layout generation,&lt;br/&gt;
so the same offset in this structure now serves a different purpose.&lt;br/&gt;
However, the author forgot about the replay case, in which the LOV EA is copied directly from the server reply into the client source buffer. Before this change the field held LOV_DEFAULT_OFFSET (aka -1), but after this change it is zero for a newly allocated file.&lt;br/&gt;
The PFL landing tightened the layout checks for &apos;SETSTRIPE&apos; requests, so the offset is now verified against the pool indexes.&lt;br/&gt;
&amp;#8211; &lt;br/&gt;
That&apos;s all. &lt;/p&gt;
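&lt;p&gt;The field reuse described above and its effect on replay can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Lustre code: LovEa, shared_slot, server_create, save_for_replay and the pool contents {1, 2} are hypothetical names and values invented for the illustration, and DEFAULT_OFFSET stands in for the LOV_DEFAULT_OFFSET value mentioned above. The reset_offset flag models the general idea of restoring the default offset before the saved buffer is replayed.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
from dataclasses import dataclass, replace

EINVAL = 22
DEFAULT_OFFSET = -1  # assumption: stands in for the ticket's LOV_DEFAULT_OFFSET


@dataclass(frozen=True)
class LovEa:
    """Toy LOV EA: shared_slot is a stripe offset in requests, a layout generation in replies."""
    pool: str
    shared_slot: int


def server_create(pool_indexes, request):
    """Toy server create: check the requested start index, return (rc, reply EA)."""
    offset = request.shared_slot
    if offset != DEFAULT_OFFSET and offset not in pool_indexes:
        return -EINVAL, None
    # The reply reuses the slot for the layout generation, which is 0 for a new file.
    return 0, replace(request, shared_slot=0)


def save_for_replay(reply, reset_offset):
    """Toy client step: copy the reply EA into the buffer kept for replay."""
    if reset_offset:
        return replace(reply, shared_slot=DEFAULT_OFFSET)
    return reply


pool_134 = {1, 2}  # hypothetical pool that does not contain index 0
request = LovEa(pool="pool_134", shared_slot=DEFAULT_OFFSET)
rc, reply = server_create(pool_134, request)

# Replay the saved buffer as-is, then with the slot reset to the default offset.
rc_replay_broken, _ = server_create(pool_134, save_for_replay(reply, reset_offset=False))
rc_replay_fixed, _ = server_create(pool_134, save_for_replay(reply, reset_offset=True))
print(rc, rc_replay_broken, rc_replay_fixed)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Under these assumptions the initial create returns 0, the replay of the unmodified buffer returns -22 (EINVAL) because the slot now reads as start index 0, and the replay after the slot is reset returns 0.&lt;/p&gt;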

&lt;p&gt;The client creates a file in a directory with a pool assigned, but the server fails. The client then resends the open+create call, which silently fails on replay with EINVAL in lod_verify_v1v3 because &apos;0&apos; is not among the lod pool indexes.&lt;/p&gt;</description>
                <environment></environment>
        <key id="55048">LU-12040</key>
            <summary>File lost during recovery</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="shadow">Alexey Lyashkov</assignee>
                                    <reporter username="shadow">Alexey Lyashkov</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Mon, 4 Mar 2019 08:02:18 +0000</created>
                <updated>Tue, 21 Jul 2020 11:15:59 +0000</updated>
                            <resolved>Tue, 10 Sep 2019 17:38:22 +0000</resolved>
                                    <version>Lustre 2.10.0</version>
                    <version>Lustre 2.11.0</version>
                    <version>Lustre 2.12.0</version>
                    <version>Lustre 2.13.0</version>
                                    <fixVersion>Lustre 2.13.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="243288" author="gerrit" created="Mon, 4 Mar 2019 14:41:45 +0000"  >&lt;p&gt;Vladimir Saveliev (c17830@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34369&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34369&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12040&quot; title=&quot;File lost during recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12040&quot;&gt;&lt;del&gt;LU-12040&lt;/del&gt;&lt;/a&gt; tests: test replay for creation of pooled file&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8d2cf3d0b8265eda1fa550b2fdfdba4d6a2b2ad7&lt;/p&gt;</comment>
                            <comment id="243289" author="gerrit" created="Mon, 4 Mar 2019 14:57:43 +0000"  >&lt;p&gt;Vladimir Saveliev (c17830@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34371&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34371&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12040&quot; title=&quot;File lost during recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12040&quot;&gt;&lt;del&gt;LU-12040&lt;/del&gt;&lt;/a&gt; mdc: reset lmm-&amp;gt;lmm_stripe_offset in mdc_save_lovea&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3f9442ac7685d7e017269edb1c80c1938c05a46b&lt;/p&gt;</comment>
                            <comment id="243290" author="vsaveliev" created="Mon, 4 Mar 2019 15:04:51 +0000"  >&lt;p&gt;&lt;a href=&quot;https://review.whamcloud.com/#/c/34370/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/34370/&lt;/a&gt;&#160;and&#160; &lt;a href=&quot;https://review.whamcloud.com/34371&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34371&lt;/a&gt;&#160;are two possible solutions for the problem.&#160; &lt;a href=&quot;https://review.whamcloud.com/34369&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34369&lt;/a&gt;&#160;is a test illustrating the issue.&lt;/p&gt;</comment>
                            <comment id="243297" author="adilger" created="Mon, 4 Mar 2019 18:50:26 +0000"  >&lt;p&gt;I would say that patch:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;commit 89693927f0b065d44fdc496f6b49539118570104
LU-8998 lod: accomodate to composite layout

Modify the LOD to make it support the composite layout:
    :
    :
    - Object allocation code is adjusted to not only check the used
      OSTs in this round of allocation, but also the used OSTs in
      the existing layout components..
    
Reviewed-on: https://review.whamcloud.com/24823
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;is what triggered this to be a problem in newer releases, which was landed as v2_9_55_0-14-g8969392. &lt;/p&gt;</comment>
                            <comment id="246293" author="spitzcor" created="Wed, 24 Apr 2019 14:27:18 +0000"  >&lt;p&gt;Is &lt;a href=&quot;https://review.whamcloud.com/34371&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34371&lt;/a&gt; ready to land?  It has Code-Review +1 from Andreas Dilger and Mike Pershin, and Verified +1 from Jenkins and Maloo.&lt;/p&gt;</comment>
                            <comment id="246333" author="adilger" created="Wed, 24 Apr 2019 21:04:53 +0000"  >&lt;p&gt;This patch and the prerequisite patch are in the master-next branch and should probably land by next week, depending on how integration testing goes. &lt;/p&gt;</comment>
                            <comment id="246479" author="gerrit" created="Tue, 30 Apr 2019 03:35:23 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/34371/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34371/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12040&quot; title=&quot;File lost during recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12040&quot;&gt;&lt;del&gt;LU-12040&lt;/del&gt;&lt;/a&gt; mdc: reset lmm-&amp;gt;lmm_stripe_offset in mdc_save_lovea&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: c872afa36ff5de5910f0f524f7a487982fa0c776&lt;/p&gt;</comment>
                            <comment id="246512" author="pjones" created="Tue, 30 Apr 2019 12:52:57 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                            <comment id="247467" author="gerrit" created="Tue, 21 May 2019 18:55:36 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34919&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34919&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12040&quot; title=&quot;File lost during recovery&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12040&quot;&gt;&lt;del&gt;LU-12040&lt;/del&gt;&lt;/a&gt; mdc: reset lmm-&amp;gt;lmm_stripe_offset in mdc_save_lovea&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 497c0fad736c9365e83b6ee3f07b52571fe70c6c&lt;/p&gt;</comment>
                            <comment id="251733" author="adilger" created="Fri, 19 Jul 2019 23:15:45 +0000"  >&lt;p&gt;We&apos;re seeing continuous failures on replay-single test_134 when this patch is backported to b2_12.  Shadow, do you know if there are any other patches that this one depends on to work?&lt;/p&gt;</comment>
                            <comment id="252171" author="shadow" created="Mon, 29 Jul 2019 15:37:51 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;I think something is wrong with the test environment:&lt;br/&gt;
&amp;gt;&amp;gt;&lt;br/&gt;
[ 8982.834142] LustreError: 17948:0:(lod_qos.c:1358:lod_alloc_specific()) Start index 0 not found in pool &apos;pool_134&apos;&lt;br/&gt;
&amp;gt;&amp;gt;&lt;br/&gt;
I have no idea why touch starts to allocate a file with an index that does not exist in the pool, but it is unlikely to be related to this patch.&lt;/p&gt;</comment>
                            <comment id="252172" author="pfarrell" created="Mon, 29 Jul 2019 16:08:47 +0000"  >&lt;p&gt;That error with stripe offset actually seems &lt;b&gt;really&lt;/b&gt; likely to be related to this patch, which is changing how the stripe offset is handled...?&lt;/p&gt;</comment>
                            <comment id="252193" author="shadow" created="Mon, 29 Jul 2019 18:42:06 +0000"  >&lt;p&gt;Patrick,&lt;/p&gt;

&lt;p&gt;I&apos;m sorry, but this patch has nothing to do with &quot;how stripe offset is handled&quot; in the create path. This patch affects the replay code path,&lt;br/&gt;
but if you open the console logs, you can see this error was returned &lt;em&gt;before&lt;/em&gt; recovery.&lt;br/&gt;
The one operation that creates an object before recovery is &apos;touch&apos;, which does not involve a setstripe operation and inherits its create info from the pool assigned to the parent dir.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
[ 8981.540879] Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
[ 8981.873832] Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
[ 8982.658245] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
[ 8982.825716] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
[ 8982.834142] LustreError: 17948:0:(lod_qos.c:1358:lod_alloc_specific()) Start index 0 not found in pool &lt;span class=&quot;code-quote&quot;&gt;&apos;pool_134&apos;&lt;/span&gt;
[ 8983.012511] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1&lt;span class=&quot;code-quote&quot;&gt;&apos; &apos;&lt;/span&gt; /proc/mounts || &lt;span class=&quot;code-keyword&quot;&gt;true&lt;/span&gt;
[ 8983.340608] Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
[ 8986.891724] LustreError: 137-5: lustre-MDT0000_UUID: not available &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; connect from 10.2.8.7@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
+
+       mkdir $DIR/$tdir
+       $LFS setstripe -p pool_134 $DIR/$tdir
+
+       replay_barrier mds1
+
+       touch $DIR/$tdir/$tfile &amp;lt;&amp;lt;&amp;lt; create
+
+       fail mds1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As I said before, if someone can reproduce this with full debug info and attach it to the ticket, I can help with understanding the problem, but in my view something is wrong with creating an object with a pool.&lt;/p&gt;</comment>
                            <comment id="254471" author="pjones" created="Tue, 10 Sep 2019 17:38:22 +0000"  >&lt;p&gt;This is fixed on master so this ticket should be marked RESOLVED. We should track any efforts to address this issue on 2.12.x separately.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="60052">LU-13809</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00cnb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>