<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:04:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13832] &quot;lfs migrate -m&quot; leads to inconsistent ldiskfs directories</title>
                <link>https://jira.whamcloud.com/browse/LU-13832</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I created a test directory with striped DNE directories as follows:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# export MDSCOUNT=8
# export DIR=/mnt/testfs/allmdt
# lfs mkdir -c -1 $DIR
# for D in $(seq $MDSCOUNT); do
    lfs mkdir -c 2 $DIR/dirstr$D
    rsync -a --exclude &quot;policy.*&quot; /etc/ $DIR/dirstr$D/
done
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This created the test directories with a variety of files that can be verified.  Then, migrate each directory and verify the contents have not changed (the rsync should not report any files that need to be updated):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# for D in $(seq $MDSCOUNT); do
    echo $DIR/dirstr$D
    lfs migrate -m $((RANDOM % MDSCOUNT)) -c2 $DIR/dirstr$D
    rsync -av --exclude &quot;policy.*&quot; --dry-run /etc/ $DIR/dirstr$D/
done
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I ran this a couple of times, then ran e2fsck on the MDTs, and all of them showed the same problem on a lot of remote directories:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;e2fsck 1.45.2.wc1 (27-May-2019)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory entry for &apos;.&apos; in ... (25191) is big. Split? yes
Missing &apos;..&apos; in directory inode 25191. Fix? yes
Setting filetype for entry &apos;..&apos; in ... (25191) to 2.
:
Pass 3: Checking directory connectivity   [[[ WHEN NOT FIXING ]]]
&apos;..&apos; in /REMOTE_PARENT_DIR/0x200000407:0x6f5:0x0 (26203) is &amp;lt;The NULL inode&amp;gt; (0), should be /REMOTE_PARENT_DIR (25001).
Fix? no
[[[ OR ]]]
Pass 3: Checking directory connectivity  [[[ WHEN FIXING ]]]
Unconnected directory inode 25191 (/???)
Connect to /lost+found? yes
:
Pass 4: Checking reference counts
Inode 2 ref count is 0, should be 11.  Fix? yes

Inode 25191 ref count is 3, should be 2.  Fix? yes
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Looking at the directories under &lt;tt&gt;REMOTE_PARENT_DIR&lt;/tt&gt; it appears that the &quot;&lt;tt&gt;..&lt;/tt&gt;&quot; entry is missing from the directory, so &quot;&lt;tt&gt;.&lt;/tt&gt;&quot; is a single 4096-byte entry that consumes the whole block.  It may be that this hasn&apos;t been noticed in the past because these directories are all small and do not need to be split for HTREE, which would add a &quot;&lt;tt&gt;..&lt;/tt&gt;&quot; as part of &lt;tt&gt;struct dx_info&lt;/tt&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="60187">LU-13832</key>
            <summary>&quot;lfs migrate -m&quot; leads to inconsistent ldiskfs directories</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Thu, 30 Jul 2020 08:37:04 +0000</created>
                <updated>Wed, 1 Jun 2022 15:34:36 +0000</updated>
                            <resolved>Wed, 1 Jun 2022 15:34:17 +0000</resolved>
                                    <version>Lustre 2.14.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="276363" author="adilger" created="Thu, 30 Jul 2020 08:43:49 +0000"  >&lt;p&gt;This is at least the most common issue that I saw.  It was present on all of the MDTs.&lt;/p&gt;

&lt;p&gt;I definitely saw a &lt;b&gt;LOT&lt;/b&gt; of problems when MDT0000 ran out of space (blocks) during migration, but that does not need to be the first problem fixed.&lt;/p&gt;

&lt;p&gt;When I didn&apos;t run out of space on MDT0000 the migrations worked mostly OK, but rsync was complaining about differences on many directories, even though I couldn&apos;t see what they were.  I suspect that it was the missing &quot;&lt;tt&gt;..&lt;/tt&gt;&quot; entry causing the difference, but I&apos;m not sure.&lt;/p&gt;</comment>
                            <comment id="276932" author="spitzcor" created="Fri, 7 Aug 2020 13:44:13 +0000"  >&lt;p&gt;This bug can be reproduced.  Also, is this a regression? (I mean, since the introduction of the `lfs migrate -m` functionality.)&lt;/p&gt;

&lt;p&gt;Why is a back-end consistency issue only &quot;Minor&quot;?  If there isn&apos;t a good reason then it seems that we should raise the priority and target 2.14.0.  &lt;/p&gt;</comment>
                            <comment id="276977" author="adilger" created="Fri, 7 Aug 2020 19:57:57 +0000"  >&lt;p&gt;Cory, I can&apos;t say whether this is a newer regression or not. Whether it is a 2.14 blocker depends on whether it was introduced in 2.13.5x patches, or if it has existed for a long time already.&lt;/p&gt;</comment>
                            <comment id="276978" author="adilger" created="Fri, 7 Aug 2020 20:00:20 +0000"  >&lt;p&gt;PS: so far this is not a data loss scenario, though the on-disk consistency is affected. From my brief testing, it appears that e2fsck fixes this issue. &lt;/p&gt;</comment>
                            <comment id="276980" author="spitzcor" created="Fri, 7 Aug 2020 20:43:57 +0000"  >&lt;p&gt;I can&apos;t answer the regression question yet either.  I&apos;m sure that we&apos;ll get an answer as we zero in on the root cause. FWIW, we&apos;ve seen this condition on a 2.12 LTS filesystem (albeit with some patches and backports from 2.13.5x).&lt;/p&gt;</comment>
                            <comment id="277584" author="panda" created="Mon, 17 Aug 2020 12:08:20 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=laisiyao&quot; class=&quot;user-hover&quot; rel=&quot;laisiyao&quot;&gt;laisiyao&lt;/a&gt;, do I understand it correctly that your reproducer does not contain any failover or parallelism of any sort? The test looks linear with respect to mkdir/migrate.&lt;/p&gt;</comment>
                            <comment id="279601" author="panda" created="Tue, 15 Sep 2020 09:43:13 +0000"  >&lt;p&gt;I wonder if &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=laisiyao&quot; class=&quot;user-hover&quot; rel=&quot;laisiyao&quot;&gt;laisiyao&lt;/a&gt; reproduced this issue with some old code. We were able to get the test that led to corruption in our case. It was simply&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
lfs setdirstripe -i 0 -c 2 /mnt/lustre/d
lfs migrate -m 0 /mnt/lustre/d
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Apparently, the issue was related to the fact that an empty dir did not receive LMV_HASH_FLAG_MIGRATION as part of migration. Eventually, mdt_dir_layout_shrink() was not able to complete migration and returned -EALREADY.&lt;/p&gt;

&lt;p&gt;This issue was silently fixed by&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
commit 3f608461b387df056c9563d4c2879b05fb54a5a5
Author: Lai Siyao &amp;lt;lai.siyao@whamcloud.com&amp;gt;
Date:   Sat Feb 15 21:26:36 2020 +0800

    LU-11025 dne: refactor dir migration
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="279663" author="laisiyao" created="Wed, 16 Sep 2020 02:04:35 +0000"  >&lt;p&gt;Commit 3f608461b387df056c9563d4c2879b05fb54a5a5 does remove the optimization for empty directory migration; this was done to simplify the code, since empty directories should be rare.&lt;/p&gt;

&lt;p&gt;I haven&apos;t been able to reproduce this yet. Andreas, are you testing with the master branch?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="64446">LU-14719</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i016in:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>