<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:29:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2901] Duplicate filename on the same ldiskfs directory on MDS</title>
                <link>https://jira.whamcloud.com/browse/LU-2901</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We have encountered, three times on three different Lustre MDS ldiskfs filesystems,&lt;br/&gt;
a duplicate filename in a directory, for example:&lt;/p&gt;

&lt;p&gt;2012 Dec  4 10:59:22 bigfoot2 :&lt;/p&gt;

&lt;p&gt;cdep4-MDT0000    Starting fsck&lt;br/&gt;
cdep4-MDT0000    Pass 1: Checking inodes, blocks, and sizes&lt;br/&gt;
cdep4-MDT0000    Pass 2: Checking directory structure&lt;br/&gt;
cdep4-MDT0000    Duplicate entry &apos;BDE_UCD.00000-00001&apos; found.&lt;br/&gt;
cdep4-MDT0000        Marking /ROOT/yack/group/USER/test9 (836767265) to be rebuilt.&lt;br/&gt;
cdep4-MDT0000&lt;br/&gt;
cdep4-MDT0000    Duplicate entry &apos;BDE_UCD.00000-00001&apos; found.&lt;br/&gt;
cdep4-MDT0000        Marking /ROOT/yack/group/USER/test9_pscc (838872002) to be rebuilt.&lt;br/&gt;
cdep4-MDT0000&lt;br/&gt;
cdep4-MDT0000    Pass 3: Checking directory connectivity&lt;br/&gt;
cdep4-MDT0000    Pass 3A: optimizing directories&lt;br/&gt;
cdep4-MDT0000    Entry &apos;BDE_UCD.00000-00001&apos; in /ROOT/yack/group/USER/test9 (836767265) has a non-unique filename.&lt;br/&gt;
cdep4-MDT0000    Rename to BDE_UCD.00000-0000~0? yes&lt;br/&gt;
cdep4-MDT0000&lt;br/&gt;
cdep4-MDT0000    Entry &apos;BDE_UCD.00000-00001&apos; in /ROOT/yack/group/USER/test9_pscc (838872002) has a non-unique filename.&lt;br/&gt;
cdep4-MDT0000    Rename to BDE_UCD.00000-0000~0? yes&lt;br/&gt;
cdep4-MDT0000&lt;br/&gt;
cdep4-MDT0000    Pass 4: Checking reference counts &lt;br/&gt;
cdep4-MDT0000    Pass 5: Checking group summary information&lt;br/&gt;
cdep4-MDT0000&lt;br/&gt;
cdep4-MDT0000: ***** FILE SYSTEM WAS MODIFIED *****&lt;br/&gt;
cdep4-MDT0000: 2872950/878051328 files (0.8% non-contiguous), 111956436/878047232 blocks&lt;/p&gt;

&lt;p&gt;In this case, running the ls command we can see the issue:&lt;/p&gt;

&lt;p&gt;total 0&lt;br/&gt;
Fri Nov 30 16:20:22 +     0.00  ###############################################################################&lt;br/&gt;
Fri Nov 30 16:20:22 +     0.00  ##   Contenu du repertoire cache_dep /cea/cache_dep/yack/group/USER/test9_ ##&lt;br/&gt;
Fri Nov 30 16:20:22 +     0.00  ##  pscc                                                                     ##&lt;br/&gt;
Fri Nov 30 16:20:22 +     0.00  ###############################################################################&lt;br/&gt;
total 331844&lt;br/&gt;
-rw-r--r-- 1 USER f7     10240 Nov 30 16:19 BDE_DIVERS&lt;br/&gt;
-rw-r----- 1 USER f7 140615680 Nov 30 16:09 BDE_MAILLAGE&lt;br/&gt;
-rw-r--r-- 1 USER f7     10240 Nov 30 16:20 BDE_POST1D&lt;br/&gt;
-rw-r--r-- 1 USER f7  99563008 Nov 30 16:16 BDE_UCD.00000-00001&lt;br/&gt;
-rw-r--r-- 1 USER f7  99563008 Nov 30 16:16 BDE_UCD.00000-00001&lt;/p&gt;

&lt;p&gt;Then, running debugfs after the fsck, we can see that the two files&lt;br/&gt;
exist on two different inodes:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@bigfoot2 ~&amp;#93;&lt;/span&gt;# /usr/lib/lustre/debugfs /dev/mapper/da1vg0_mdt&lt;br/&gt;
debugfs 1.42.3.wc3 (15-Aug-2012)&lt;br/&gt;
debugfs:  ls&lt;br/&gt;
 2  (12) .    2  (12) ..    11  (20) lost+found    589299713  (16) CONFIGS&lt;br/&gt;
 637534209  (16) OBJECTS    12  (20) lov_objid    13  (16) oi.16&lt;br/&gt;
 14  (12) fld    15  (16) seq_srv    16  (16) seq_ctl    17  (20) capa_keys&lt;br/&gt;
 627048449  (16) PENDING    643825665  (12) ROOT    18  (20) last_rcvd&lt;br/&gt;
 700448769  (20) REM_OBJ_DIR    19  (3852) CATALOGS&lt;br/&gt;
debugfs:  cd ROOT/yack/group/USER&lt;br/&gt;
debugfs:  cd test9_pscc&lt;br/&gt;
debugfs:  ls&lt;br/&gt;
 838872002  (28) .    643826507  (28) ..&lt;br/&gt;
 838873362  (28) BDE_UCD.00000-0000~0    838875198  (48) BDE_UCD.00000-00001&lt;br/&gt;
 838875615  (36) BDE_POST1D    838877439  (40) BDE_MAILLAGE&lt;br/&gt;
 838877482  (36) BDE_DIVERS    838877492  (3852) BDE_PROT_LAG.00001-00001&lt;br/&gt;
debugfs: &lt;br/&gt;
debugfs:  stat BDE_UCD.00000-0000~0&lt;br/&gt;
Inode: 838873362   Type: regular    Mode:  0644   Flags: 0x0&lt;br/&gt;
Generation: 2475899703    Version: 0x0000002b:1ecd9699&lt;br/&gt;
User:  3083   Group:  5214   Size: 0&lt;br/&gt;
File ACL: 0    Directory ACL: 0&lt;br/&gt;
Links: 1   Blockcount: 0&lt;br/&gt;
Fragment:  Address: 0    Number: 0    Size: 0&lt;br/&gt;
 ctime: 0x50b8cde4:00000000 &amp;#8211; Fri Nov 30 16:16:52 2012&lt;br/&gt;
 atime: 0x50bf189a:00000000 &amp;#8211; Wed Dec  5 10:49:14 2012&lt;br/&gt;
 mtime: 0x50b8cde4:00000000 &amp;#8211; Fri Nov 30 16:16:52 2012&lt;br/&gt;
crtime: 0x50b8cc66:aa4a5734 &amp;#8211; Fri Nov 30 16:10:30 2012&lt;br/&gt;
Size of extra inode fields: 28&lt;br/&gt;
Extended attributes stored in inode body:&lt;br/&gt;
  lma = &quot;00 00 00 00 00 00 00 00 71 be 71 17 02 00 00 00 4b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00&lt;br/&gt;
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0&lt;br/&gt;
0 00 00 00 00 &quot; (64)&lt;br/&gt;
  lma: fid=&lt;span class=&quot;error&quot;&gt;&amp;#91;0x21771be71:0x4b:0x0&amp;#93;&lt;/span&gt;&lt;br/&gt;
  lov = &quot;d0 0b d1 0b 01 00 00 00 4b 00 00 00 00 00 00 00 71 be 71 17 02 00 00 00 00 00 40 00 03 00 00 00 99 9f&lt;br/&gt;
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 13 00 00 00 18 ad 00 0&lt;br/&gt;
0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0d 02 00 00 c9 af 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00&lt;br/&gt;
00 00 53 02 00 00 &quot; (104)&lt;br/&gt;
  link = &quot;df f1 ea 11 01 00 00 00 3d 00 00 00 00 00 00 00 39 41 6c 65 70 68 6f 4d 00 25 00 00 00 02 17 59 a0 d1&lt;br/&gt;
00 00 34 fb 00 00 00 00 42 44 45 5f 55 43 44 2e 30 30 30 30 30 2d 30 30 30&lt;br/&gt;
30 31 &quot; (61)&lt;br/&gt;
BLOCKS:&lt;/p&gt;

&lt;p&gt;debugfs:  stat BDE_UCD.00000-00001&lt;br/&gt;
Inode: 838875198   Type: regular    Mode:  0644   Flags: 0x0&lt;br/&gt;
Generation: 2475897856    Version: 0x0000002b:1ec71356&lt;br/&gt;
User:  3083   Group:  5214   Size: 0&lt;br/&gt;
File ACL: 0    Directory ACL: 0&lt;br/&gt;
Links: 1   Blockcount: 0&lt;br/&gt;
Fragment:  Address: 0    Number: 0    Size: 0&lt;br/&gt;
 ctime: 0x50b8c8fd:00000000 &amp;#8211; Fri Nov 30 15:55:57 2012&lt;br/&gt;
 atime: 0x50bf18a0:00000000 &amp;#8211; Wed Dec  5 10:49:20 2012&lt;br/&gt;
 mtime: 0x50b8c8fd:00000000 &amp;#8211; Fri Nov 30 15:55:57 2012&lt;br/&gt;
crtime: 0x50b8c6c2:977f5a84 &amp;#8211; Fri Nov 30 15:46:26 2012&lt;br/&gt;
Size of extra inode fields: 28&lt;br/&gt;
Extended attributes stored in inode body:&lt;br/&gt;
  lma = &quot;00 00 00 00 00 00 00 00 ef 0a 4f 17 02 00 00 00 52 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00&lt;br/&gt;
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0&lt;br/&gt;
0 00 00 00 00 &quot; (64)&lt;br/&gt;
  lma: fid=&lt;span class=&quot;error&quot;&gt;&amp;#91;0x2174f0aef:0x52:0x0&amp;#93;&lt;/span&gt;&lt;br/&gt;
  lov = &quot;d0 0b d1 0b 01 00 00 00 52 00 00 00 00 00 00 00 ef 0a 4f 17 02 00 00 00 00 00 40 00 03 00 00 00 59 27&lt;br/&gt;
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c2 00 00 00 9b a6 00 0&lt;br/&gt;
0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b9 01 00 00 08 ac 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00&lt;br/&gt;
00 00 db 02 00 00 &quot; (104)&lt;br/&gt;
  link = &quot;df f1 ea 11 01 00 00 00 3d 00 00 00 00 00 00 00 39 41 6c 65 70 68 6f 4d 00 25 00 00 00 02 17 59 a0 d1&lt;br/&gt;
00 00 34 fb 00 00 00 00 42 44 45 5f 55 43 44 2e 30 30 30 30 30 2d 30 30 30&lt;br/&gt;
30 31 &quot; (61)&lt;br/&gt;
BLOCKS:&lt;/p&gt;

&lt;p&gt;Another trace can be found attached to this ticket.&lt;/p&gt;

&lt;p&gt;Is this a known issue?&lt;/p&gt;</description>
                <environment></environment>
        <key id="17752">LU-2901</key>
            <summary>Duplicate filename on the same ldiskfs directory on MDS</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="dmoreno">Diego Moreno</reporter>
                        <labels>
                    </labels>
                <created>Mon, 4 Mar 2013 11:51:06 +0000</created>
                <updated>Tue, 20 Aug 2013 05:59:06 +0000</updated>
                            <resolved>Tue, 20 Aug 2013 05:59:06 +0000</resolved>
                                    <version>Lustre 2.3.0</version>
                    <version>Lustre 2.1.3</version>
                    <version>Lustre 2.1.6</version>
                    <version>Lustre 2.4.1</version>
                                    <fixVersion>Lustre 2.1.6</fixVersion>
                    <fixVersion>Lustre 2.4.1</fixVersion>
                    <fixVersion>Lustre 2.5.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>21</watches>
                                                                            <comments>
                            <comment id="53304" author="pjones" created="Mon, 4 Mar 2013 19:36:35 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please comment on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="53922" author="laisiyao" created="Wed, 13 Mar 2013 12:09:11 +0000"  >&lt;p&gt;My understanding of this issue is:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;fsck found two dir entries with the same name, and renamed the second to a unique name.&lt;/li&gt;
	&lt;li&gt;however `ls &amp;lt;dir&amp;gt;` still lists two files with the same name, and the file with the unique name is not shown. Could you confirm whether this was done on the MDS (mounting the device as ldiskfs) or on a client? Also, could you run `ls -i ...` to list the inode numbers for all the files?&lt;/li&gt;
	&lt;li&gt;while running debugfs against the MDS device shows the correct result (fixed by fsck).&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Please correct me if I misunderstood, thanks.&lt;/p&gt;</comment>
                            <comment id="53989" author="laisiyao" created="Thu, 14 Mar 2013 03:54:37 +0000"  >&lt;p&gt;Could you list the version of e2fsprogs on your system?&lt;/p&gt;</comment>
                            <comment id="53996" author="laisiyao" created="Thu, 14 Mar 2013 05:01:56 +0000"  >&lt;p&gt;Do you know how the file &apos;BDE_UCD.00000-00001&apos; was created? What operations have been done on this file? Was it created multiple times?&lt;/p&gt;</comment>
                            <comment id="54091" author="dmoreno" created="Fri, 15 Mar 2013 04:23:15 +0000"  >&lt;p&gt;The e2fsprogs version is: 1.42.3.wc3. This issue is difficult to reproduce/trace, as it arises from time to time and is only detected when lfsck is run once a month. It comes from an internal I/O library doing lots of I/O.&lt;/p&gt;

&lt;p&gt;From my understanding lfsck is fine and the issue is as follows:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;On a periodic lfsck run we see there is a duplicate filename entry. We cancel the lfsck doing nothing.&lt;/li&gt;
	&lt;li&gt;Then from a Lustre client, doing &apos;ls&apos;, we obviously see the same filename twice (lfsck is right!).&lt;/li&gt;
	&lt;li&gt;We re-run lfsck renaming one of the duplicate filenames.&lt;/li&gt;
	&lt;li&gt;Everything is fine again as there&apos;s no more duplicate filenames.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The question is: how could a duplicate filename appear?&lt;/p&gt;</comment>
                            <comment id="54103" author="dmoreno" created="Fri, 15 Mar 2013 08:36:39 +0000"  >&lt;p&gt;I forgot to mention that both files have the same name and size but different attributes (ctime, mtime) and fid.&lt;/p&gt;</comment>
                            <comment id="54115" author="laisiyao" created="Fri, 15 Mar 2013 12:09:15 +0000"  >&lt;p&gt;One possible cause is that on the second create, it fails to look up by filename, so that a file with the same name is created twice. But it&apos;s weird that this happens with the same name &apos;BDE_UCD.00000-00001&apos; in two different directories: &apos;test9&apos; and &apos;test9_pscc&apos;. I&apos;ve tested with the same directory structure, but nothing went wrong.&lt;/p&gt;

&lt;p&gt;I&apos;ll see whether it is possible to make a debug patch to catch such a case, and print a message when it happens.&lt;/p&gt;</comment>
                            <comment id="57645" author="laisiyao" created="Fri, 3 May 2013 15:57:24 +0000"  >&lt;p&gt;I&apos;m afraid this is not a Lustre-specific bug, because even if the Lustre code forgets to check for a duplicate dir entry before inserting, ldiskfs (ext4) will complain and refuse the insertion. And I can&apos;t think of a way to catch such an error in the Lustre code.&lt;/p&gt;

&lt;p&gt;BTW in the attached trace file, I see such log messages:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013/01/18 15:16:56 rbh-sherpa@yack40[8625/11]: SHERPA | GC of dep files disabled: not removing /cea/cache_dep/rep1/rep2/rep3/rep4/HDep-n=Temps_u=s.v=f0000000000000000-v=f3fcc4db02b7a7355
2013/01/18 23:52:58 rbh-sherpa@yack40[8625/9]: SHERPA | GC of dep files disabled: not removing /cea/cache_dep/rep1/rep2/rep3/rep4/HDep-n=Temps_u=s.v=f0000000000000000-v=f3fcc4db02b7a7355
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Could you explain the meaning of this message?&lt;/p&gt;</comment>
                            <comment id="59649" author="kitwestneat" created="Thu, 30 May 2013 16:08:34 +0000"  >&lt;p&gt;Hi Lai,&lt;/p&gt;

&lt;p&gt;One of our customers recently hit this bug as well. Unfortunately it is a secure site, so getting information is difficult. I thought perhaps it could have been a problem with the journal getting incorrectly replayed, but they have not had any reboots. We thought that perhaps the file had been created with an unprintable character in the name, but that is not the case. Hexdumps of the names are identical. &lt;/p&gt;

&lt;p&gt;ls -i shows both files, but with the same inode number. I think this is because it does an lstat by name. We printed out the dirents, and those showed both files with different inode numbers. &lt;/p&gt;
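For reference, the two checks described above (hexdumping the names to rule out unprintable characters, and seeing what lstat-by-name reports for each entry) can be sketched from a shell. This is a hypothetical stand-in run against a scratch directory; on the affected system you would point it at the directory holding the duplicates:

```shell
dir=$(mktemp -d)
touch "$dir/test_file"

# hexdump each entry name to rule out unprintable characters
for f in "$dir"/*; do
    basename "$f" | od -An -c
done

# the inode number reported by lstat-by-name for each entry
ls -i "$dir"
```

If the raw dirents hold two different inode numbers while ls -i shows the same number for both names, that matches the lstat-by-name explanation above.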

&lt;p&gt;The duplicate files still exist. Is there anything else we can look at before we delete the hidden one? I have asked for more details as to how the file is created, or if they can reproduce it, but I&apos;m not sure that it will be possible. &lt;/p&gt;

&lt;p&gt;Do you have any suggestions as to how to proceed? We are running Lustre 2.1.2.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</comment>
                            <comment id="60019" author="laisiyao" created="Wed, 5 Jun 2013 09:27:03 +0000"  >&lt;p&gt;Andreas, could you take a look and give some suggestion?&lt;/p&gt;</comment>
                            <comment id="60040" author="kitwestneat" created="Wed, 5 Jun 2013 16:51:18 +0000"  >&lt;p&gt;I was able to reproduce it on a single client (which was also the MDT) using a modified version of the reproducer in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3439&quot; title=&quot;User code creating multiple lockfiles with same name&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3439&quot;&gt;&lt;del&gt;LU-3439&lt;/del&gt;&lt;/a&gt;. I used the debug daemon to get full debugging information and have attached it to the ticket, along with the reproducer and debugfs stats of the two files with the same filename.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</comment>
                            <comment id="60042" author="kitwestneat" created="Wed, 5 Jun 2013 16:51:56 +0000"  >&lt;p&gt;err I can&apos;t attach it, it&apos;s 20MB, but here is a link:&lt;br/&gt;
&lt;a href=&quot;https://woscloud.corp.ddn.com/v2/files/L2YwMWU5NWI5ZGViNGRjOThiMmUxMWZmZGMyMzFiYjE0NQ==/content/inline/lu-2901.tgz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://woscloud.corp.ddn.com/v2/files/L2YwMWU5NWI5ZGViNGRjOThiMmUxMWZmZGMyMzFiYjE0NQ==/content/inline/lu-2901.tgz&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="60105" author="kitwestneat" created="Thu, 6 Jun 2013 17:39:59 +0000"  >&lt;p&gt;Hi Lai,&lt;/p&gt;

&lt;p&gt;I think you meant to respond to this LU, so I am posting here. &lt;/p&gt;

&lt;p&gt;Here is what my directory looks like:&lt;br/&gt;
---------- 1 root root      5 Jun  5 09:31 test_file&lt;br/&gt;
---------- 1 root root      5 Jun  5 09:31 test_file&lt;/p&gt;

&lt;p&gt;with -i:&lt;br/&gt;
144115205322835997 ---------- 1 root root 5 Jun  5 09:31 test_file&lt;br/&gt;
144115205322835997 ---------- 1 root root 5 Jun  5 09:31 test_file&lt;/p&gt;

&lt;p&gt;Using debugfs on the MDT I can see that there is one with inode 45 and one with inode 25, the stat for them should both be in the tarball. &lt;/p&gt;

&lt;p&gt;To reproduce it I ran:&lt;br/&gt;
rm * -f ; for x in `seq 1 8`; do ~/test &amp;amp; done&lt;/p&gt;

&lt;p&gt;until I saw two files named test_file (or it ran out of files), and then I ran killall test.&lt;/p&gt;
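The test program itself is attached to the other ticket, but the operation it races on can be sketched as follows (hypothetical stand-in names). On a correctly behaving filesystem, only one link() to a given target name can ever succeed, and every later attempt must fail with EEXIST:

```shell
dir=$(mktemp -d)
# stand-in for the uniquely named lockfile each test process creates
src=$(mktemp "$dir/test_file_XXXXXX")

ln "$src" "$dir/test_file"              # first link: succeeds
if ln "$src" "$dir/test_file" 2>/dev/null; then
    echo "duplicate entry created"      # what the bug effectively allows
else
    echo "second link refused with EEXIST"
fi
```

Running several copies of that sequence concurrently, as in the command above, is what exposes the race on the MDT.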

&lt;p&gt;I wasn&apos;t sure if the files needed to be kept open to reproduce it. I just now reproduced it after adding close(fd) to the main loop, so it seems as if it&apos;s not necessary. &lt;/p&gt;

&lt;p&gt;I also modified the main loop to end after a certain number of iterations. I could reproduce it if the loop was run more than 1x, but not if it was only run once.  &lt;/p&gt;</comment>
                            <comment id="60138" author="adilger" created="Fri, 7 Jun 2013 02:08:25 +0000"  >&lt;p&gt;Attached is an updated and improved version of the test, named duplicate_name.c.&lt;/p&gt;

&lt;p&gt;It seems that the race occurs only the first time this link is created, and all of the other loops are useless.&lt;/p&gt;

&lt;p&gt;I suspect the problem is some kind of locking problem (maybe pdirops?) during &lt;tt&gt;link()&lt;/tt&gt; on the MDT not grabbing the right lock on the &quot;test_file&quot; name/hash properly.  The original file can be created due to its unique name, then &lt;tt&gt;link()&lt;/tt&gt; succeeds in linking to &quot;test_file&quot; in a few cases if there was a racy lookup on the client?&lt;/p&gt;</comment>
                            <comment id="60144" author="adilger" created="Fri, 7 Jun 2013 04:53:27 +0000"  >&lt;p&gt;Like Kit, I&apos;m running this like:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;rm $DIR/test_file*; for X in $(seq 8); do ./duplicate_name $DIR &amp;amp; done; wait; ls $DIR
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The updated test exits immediately if it was able to create the link, and the remaining runs also exit after 10 loops.  I&apos;m able to reproduce this 100% of the time on master, sometimes getting as many as 4 links with the same name.  It also works if the &lt;tt&gt;open(), write(), fchmod(), close()&lt;/tt&gt; calls are removed, leaving only the &lt;tt&gt;link()&lt;/tt&gt; call, and can be reproduced with only 2 threads (though not 100% of the time).&lt;/p&gt;

&lt;p&gt;I&apos;m attaching a full debug log from a two-thread run, which has also been patched to change LDLM_DEBUG() to print the lock resources in a format similar to DFID so they are easier to find in the logs, and to also include the name hash.&lt;/p&gt;

&lt;p&gt;I suspect the problem to be in pdirops, or the MDT&apos;s usage of it in link, since normal ext4 wouldn&apos;t ever allow this I think.&lt;/p&gt;</comment>
                            <comment id="60145" author="adilger" created="Fri, 7 Jun 2013 04:55:09 +0000"  >&lt;p&gt;Full debug log with improved LDLM_DEBUG() messages.&lt;/p&gt;</comment>
                            <comment id="60164" author="laisiyao" created="Fri, 7 Jun 2013 12:49:53 +0000"  >&lt;p&gt;Liang has confirmed that pdirops was first included in 2.3, and since this can be reproduced on 2.1, pdirops is not the culprit.&lt;/p&gt;</comment>
                            <comment id="60204" author="adilger" created="Fri, 7 Jun 2013 23:24:17 +0000"  >&lt;p&gt;I think I have understood what is happening here.  The link handling in the MDT is fundamentally unsafe, because it is never checking whether the target name already exists or not. &lt;tt&gt;mdt_reint_link()&lt;/tt&gt; is correctly looking up and locking the parent object and name hash, looking up and locking the source object, and calling the &lt;tt&gt;mdd_link()-&amp;gt;...osd_index_ea_insert()-&amp;gt;...ldiskfs_add_entry()&lt;/tt&gt; path, but in ldiskfs there is no guarantee that the name being inserted is unique.&lt;/p&gt;

&lt;p&gt;The ldiskfs code depends on the original caller (normally the VFS) to have done a lookup and return &lt;tt&gt;-EEXIST&lt;/tt&gt; in case of an existing target.  It happens that &lt;tt&gt;add_dirent_to_buf()&lt;/tt&gt; does return &lt;tt&gt;-EEXIST&lt;/tt&gt; if it finds the name in the leaf block it is traversing to find an empty slot for the dirent, but if the name is &lt;em&gt;after&lt;/em&gt; an empty slot large enough to hold the dirent this will not happen.  In the current test case, the original name of the source file is &lt;tt&gt;test_file_XXXXXX&lt;/tt&gt; and this longer name is unlinked by the first thread doing the &lt;tt&gt;link(&quot;test_file&quot;)&lt;/tt&gt; operation.  That leaves an empty slot in the directory leaf block large enough for the second thread to also insert &lt;tt&gt;&quot;test_file&quot;&lt;/tt&gt; into the directory.  The locking of the directory is not enough, since the name is not revalidated under the lock.&lt;/p&gt;
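The single-threaded version of that sequence, which sets up the directory state the race depends on, can be sketched like this (hypothetical names; the race window is between one thread's unlink of the long name and a second thread's link() landing in the freed dirent slot):

```shell
dir=$(mktemp -d)
src=$(mktemp "$dir/test_file_XXXXXX")   # long unique source name
ln "$src" "$dir/test_file"              # insert the short target name
rm "$src"                               # frees a dirent slot large enough to hold "test_file" again
ls "$dir"                               # a correct run leaves exactly one "test_file"
```

A racing thread whose add_dirent_to_buf() reaches that freed slot before reaching the existing "test_file" entry will insert the duplicate without ever seeing the name.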

&lt;p&gt;Alex, I&apos;m not sure at which level the name to be inserted should be checked?  Is this something that should be handled at the MDD layer, or the OSD layer?  It looks like ZFS is already checking that the name being inserted into the directory ZAP is unique, and this is also common for e.g. databases to allow checking that the inserted key is unique, so it makes sense that this be handled inside osd-ldiskfs.  The &lt;tt&gt;add_dirent_to_buf()&lt;/tt&gt; code is already doing this for some of the cases, but I am reluctant to patch ext4 further in this area.  I&apos;m also aware that you don&apos;t like extra &quot;unnecessary&quot; checks at the OSD level for operations that should be verified at the higher layers, so please provide some input on where you think this should be checked.&lt;/p&gt;

&lt;p&gt;I&apos;ve pushed two patches for this bug.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/6591&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6591&lt;/a&gt;&lt;br/&gt;
A new sanity.sh test_31o case which can reproduce the problem.  It is just a shell script, no need for a separate program.  It reproduces 100% on ldiskfs with 8 threads, and not in several hundred runs on ZFS, which bears out the above analysis.  It should be used as the basis for any fix, though the test case needs to be excluded when run against older server versions, like below (I didn&apos;t include this into the current test case so that it would run when developing the fix):&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; [ $(lustre_version_code $SINGLEMDS) -lt $(version_code 2.1.6) ] ||
           [ $(lustre_version_code $SINGLEMDS) -ge $(version_code 2.2.0) -a
             $(lustre_version_code $SINGLEMDS) -lt $(version_code 2.4.1) ]; then
                skip &lt;span class=&quot;code-quote&quot;&gt;&quot;Need MDS version at least 2.1.6 or 2.4.1&quot;&lt;/span&gt;
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; 0
        fi
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/6592&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6592&lt;/a&gt;&lt;br/&gt;
The second patch cleans up the rest of the code to use the new DLDLMRES/PLDLMRES macros when printing out resource names, so that they appear in messages and logs in a consistent format.  This also exposed a bug in the ll_md_blocking_ast() callback, which has been around since &lt;a href=&quot;http://review.whamcloud.com/2271&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/2271&lt;/a&gt; landed (&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1198&quot; title=&quot;Change DLM lock encoding to put FID version into lock res[1]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1198&quot;&gt;&lt;del&gt;LU-1198&lt;/del&gt;&lt;/a&gt;), but is &quot;not a bug yet&quot; because it only affects unused fields in the FID and DLM resource so I didn&apos;t split it into a separate commit.&lt;/p&gt;</comment>
                            <comment id="60214" author="laisiyao" created="Sat, 8 Jun 2013 11:17:43 +0000"  >&lt;p&gt;The VFS actually checks the target for link: sys_linkat -&amp;gt; lookup_create. But the Lustre client&apos;s ll_lookup_nd(O_CREATE &amp;amp;&amp;amp; !O_OPEN) doesn&apos;t really look up on the server; it creates a temp dentry and returns, letting the real operation (e.g. link/mknod) check for errors like -EEXIST later. Unfortunately the MDS doesn&apos;t perform this check for link.&lt;/p&gt;</comment>
                            <comment id="60313" author="adilger" created="Tue, 11 Jun 2013 00:41:59 +0000"  >&lt;p&gt;Alex, I&apos;m not sure at which level the name to be inserted should be checked? Is this something that should be handled at the MDD layer, or the OSD layer? It looks like ZFS is already checking that the name being inserted into the directory ZAP is unique, and this is also common for e.g. databases to allow checking that the inserted key is unique, so it makes sense that this be handled inside osd-ldiskfs. The add_dirent_to_buf() code is already doing this for some of the cases, but I am reluctant to patch ext4 further in this area. I&apos;m also aware that you don&apos;t like extra &quot;unnecessary&quot; checks at the OSD level for operations that should be verified at the higher layers, so please provide some input on where you think this should be checked.&lt;/p&gt;</comment>
                            <comment id="60319" author="bzzz" created="Tue, 11 Jun 2013 04:47:03 +0000"  >&lt;p&gt;yes, I agree it&apos;s better to be done in the OSD, especially given that ldiskfs already does this partially. We were going to introduce -&amp;gt;update(), which would make -&amp;gt;insert() returning -EEXIST well justified.&lt;/p&gt;</comment>
                            <comment id="60421" author="bzzz" created="Wed, 12 Jun 2013 05:48:31 +0000"  >&lt;p&gt;hmm, if we just add yet another lookup in -&amp;gt;insert(), then mdd_create() will be doing two lookups; won&apos;t that affect performance?&lt;/p&gt;</comment>
                            <comment id="60534" author="adilger" created="Thu, 13 Jun 2013 11:45:54 +0000"  >&lt;p&gt;Yes, I was wondering about this also.  At one point I thought it was just a dcache lookup, but it seems the dentry being used is totally fake.&lt;/p&gt;

&lt;p&gt;The most efficient way to do this would be to just scan the whole ldiskfs leaf block when doing the insert, since add_dirent_to_buf() is already scanning it looking for free space.  It is most likely going to scan most of the used part of the block anyway, so this would add relatively little overhead.  It would be good to structure this change so it is only done if being called by Lustre (e.g. if ldiskfs_dentry_param is passed).  That gives us some hope of having it accepted upstream.&lt;/p&gt;</comment>
                            <comment id="60535" author="bzzz" created="Thu, 13 Jun 2013 11:49:30 +0000"  >&lt;p&gt;probably we&apos;d have to scan more than one block if entries share the same hash value? That should happen too rarely to affect performance, though.&lt;/p&gt;</comment>
                            <comment id="60562" author="laisiyao" created="Thu, 13 Jun 2013 15:59:25 +0000"  >&lt;p&gt;IMHO a one-line check before mdo_link() in mdt_reint_link() could fix this: check that mdt_lookup_version_check() for the target returns -ENOENT, and otherwise return -EEXIST (the same as mdt_md_create() does).&lt;/p&gt;

&lt;p&gt;Currently this check is only missing for the link operation; if such checks were instead to be done only in ldiskfs, the other operations would have to follow that rule too.&lt;/p&gt;</comment>
                            <comment id="60620" author="laisiyao" created="Fri, 14 Jun 2013 02:26:38 +0000"  >&lt;p&gt;I updated the patch of &lt;a href=&quot;http://review.whamcloud.com/#change,6591&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,6591&lt;/a&gt; with the way I described above.&lt;/p&gt;</comment>
                            <comment id="60781" author="bogl" created="Mon, 17 Jun 2013 23:41:36 +0000"  >&lt;p&gt;back port to b2_1:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/6678&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/6678&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This back port leaves out the first 2 files from the original patch as advised by Oleg.&lt;/p&gt;</comment>
                            <comment id="60847" author="sebastien.buisson" created="Wed, 19 Jun 2013 07:29:28 +0000"  >&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;I think we need clarification regarding the various patches proposed in here. If I understand correctly, we have:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;a href=&quot;http://review.whamcloud.com/6591&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6591&lt;/a&gt; : patch for master, adds a new test in sanity;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://review.whamcloud.com/6592&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6592&lt;/a&gt; : patch for master, code cleanup;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://review.whamcloud.com/6678&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6678&lt;/a&gt; : patch for b2_1, port of &lt;a href=&quot;http://review.whamcloud.com/6591&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6591&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Does it mean that the patch that really fixes the issue is missing in b2_1?&lt;/p&gt;

&lt;p&gt;Thanks in advance,&lt;br/&gt;
Sebastien.&lt;/p&gt;</comment>
                            <comment id="60870" author="bogl" created="Wed, 19 Jun 2013 15:00:30 +0000"  >&lt;p&gt;Sebastien,&lt;br/&gt;
The second patch on master, #6592, only changes log and error messages. It uses new macros that were defined in the master version of the first patch but were left out of the b2_1 backport, which makes the second patch hard to backport.&lt;/p&gt;

&lt;p&gt;Only the first patch, #6591, contains actual functional fixes, so that was all that was backported.&lt;/p&gt;
</comment>
                            <comment id="60903" author="sebastien.buisson" created="Thu, 20 Jun 2013 06:26:14 +0000"  >&lt;p&gt;Thanks for the clarification, Bob!&lt;/p&gt;</comment>
                            <comment id="64173" author="bogl" created="Tue, 13 Aug 2013 14:14:57 +0000"  >&lt;p&gt;Backports to b2_4:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/7315&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7315&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/7316&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7316&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="64492" author="pjones" created="Tue, 20 Aug 2013 05:59:06 +0000"  >&lt;p&gt;It seems the fixes have landed where needed.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="19291">LU-3439</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="13010" name="debug.gz" size="216661" author="adilger" created="Fri, 7 Jun 2013 04:55:09 +0000"/>
                            <attachment id="13009" name="duplicate_name.c" size="1489" author="adilger" created="Fri, 7 Jun 2013 02:08:25 +0000"/>
                            <attachment id="12276" name="fsck_issue_duplicate" size="16021" author="dmoreno" created="Mon, 4 Mar 2013 11:51:06 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvk53:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6988</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>