  Lustre / LU-13823

Two hard links to the same directory

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Affects Version/s: None
    • Fix Version/s: None
    • Environment: kernel-3.10.0-1127.0.0.1chaos.ch6.x86_64
      zfs-0.7.11-9.4llnl.ch6.x86_64
      lustre-2.10.8_9.chaos-1.ch6.x86_64
    • Severity: 3

    Description

      Two directories in the same filesystem have the same inode number:

      [root@rzslic5]==> ls -lid /p/czlustre2/reza2/5_star_pattern_J
      288233885643309090 drwx------ 3 58904 58904 33280 Jul 24 15:55 /p/czlustre2/reza2/5_star_pattern_J
      [root@rzslic5]==> ls -lid /p/czlustre2/reza2/5_star_pattern_J_2/
      288233885643309090 drwx------ 3 58904 58904 33280 Jul 24 15:55 /p/czlustre2/reza2/5_star_pattern_J_2/
      

      and the same FID:

      [root@rzslic2:reza2]# lfs path2fid 5_star_pattern_J
      [0x40003311e:0x22:0x0]
      [root@rzslic2:reza2]# lfs path2fid 5_star_pattern_J_2
      [0x40003311e:0x22:0x0] 

      The directory has one subdirectory:

      [root@oslic7:reza2]# ls -al 5_star_pattern_J
      total 130
      drwx------   3 pearce7 pearce7 33280 Jul 24 15:55 .
      drwx------ 155 reza2   reza2   57856 Jul 27 12:54 ..
      drwx------   2 pearce7 pearce7 41472 Sep 21  2019 0
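
      Not from the original report: a minimal C sketch of the same check the "ls -lid" output above demonstrates, i.e. stat() both paths and compare st_dev/st_ino to confirm that the two names really resolve to a single object (the file name "samedir.c" is hypothetical).

      /* samedir.c - hypothetical helper, not part of the ticket.
       * Prints whether two paths resolve to the same inode on the
       * same filesystem, which is the symptom reported above. */
      #include <stdio.h>
      #include <sys/stat.h>

      int main(int argc, char *argv[])
      {
              struct stat a, b;

              if (argc != 3) {
                      fprintf(stderr, "usage: %s path1 path2\n", argv[0]);
                      return 2;
              }
              if (stat(argv[1], &a) != 0 || stat(argv[2], &b) != 0) {
                      perror("stat");
                      return 2;
              }
              if (a.st_dev == b.st_dev && a.st_ino == b.st_ino)
                      printf("same object: dev=%lu ino=%lu\n",
                             (unsigned long)a.st_dev, (unsigned long)a.st_ino);
              else
                      printf("different objects\n");
              return 0;
      }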
      

      Attachments

        Issue Links

          Activity

            [LU-13823] Two hard links to the same directory
            adilger Andreas Dilger made changes -
            Resolution New: Cannot Reproduce [ 5 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            Link Original: This issue is related to JFC-21 [ JFC-21 ]
            ofaaland Olaf Faaland made changes -
            Labels Original: llnl topllnl New: llnl
            ofaaland Olaf Faaland added a comment -

            Removing "topllnl" tag because I do not see any way for us to get more information about what happened. I'm going to leave it open in case we see the same problem again, or someone else does.

            ofaaland Olaf Faaland added a comment -

            Andreas,

            I just noticed that the bash_history contents I pasted in above are for the wrong file system (/p/czlustre3 AKA /p/lustre3, != /p/lustre2). It's typical that our users have quota and a directory on multiple Lustre file systems.

            So we have no context at all. I don't see what else we can do. If you can think of anything else we should look at, or a debug patch that would be helpful, let me know.

            thanks


            ofaaland Olaf Faaland added a comment -

            Andreas,

            We weren't able to get any good information about what led up to this. We believe it occurred during the "mv" operation below (this is bash history from a node the sysadmin was using):

            40 cd /p/czlustre3/pearce7
            41 ls
            42 cd reza2
            43 ls
            44 ls /p/czlustre3/reza2/
            45 pwd
            46 mv * /p/czlustre3/reza2/

            and that before the "mv" command:

            • /p/czlustre3/reza2/ was empty, and
            • /p/czlustre3/pearce7/reza2/5_star_pattern_J was an apparently normal directory.

            During the "mv" command the sysadmin got an error message that 5_star_pattern_J already existed in the target. At that point he looked and saw that both the source and target directory had a subdirectory by that name, and then found they were two references to the same directory.

            It's hard to see how this sequence of events could create the problem, but that's unfortunately all we were able to find out.

            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Andreas Dilger [ adilger ]

            ofaaland Olaf Faaland added a comment -

            Is it possible to ask the user how these two directories were created? Was there a rename, or were they possibly created in parallel? Was "lfs migrate -m" used on the directory to migrate between MDTs?

            I'm working on getting that information. There's been a complex chain of ownership so we're working through it.

            The "trusted.link" xattr shows only "5_star_pattern_J_2" for the name of the directory.

            The dnode shows "links 3", but that could be because of a subdirectory, and not necessarily because of multiple hard links to the file, but it would be useful to check. If the client had (somehow) allowed multiple hard links to the directory, it should also have added the filename to the "trusted.link" xattr at that time.

            There is one subdirectory, named "0". Sorry I left that out of the description.
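
            For reference: a directory's link count is normally 2 plus its number of subdirectories, so "links 3" is consistent with the single subdirectory "0" and does not by itself prove a second hard link. Below is a minimal sketch of dumping the raw "trusted.link" xattr with lgetxattr(2); it assumes the xattr is readable where it is run (for example as root), and the binary value is simply printed as hex. The file name "dumplink.c" is hypothetical.

            /* dumplink.c - hypothetical sketch, not attached to the ticket.
             * Dumps the raw trusted.link xattr of a path as hex, assuming
             * the caller has permission to read trusted.* xattrs (root). */
            #include <stdio.h>
            #include <sys/xattr.h>

            int main(int argc, char *argv[])
            {
                    unsigned char buf[4096];
                    ssize_t len, i;

                    if (argc != 2) {
                            fprintf(stderr, "usage: %s <directory>\n", argv[0]);
                            return 2;
                    }
                    len = lgetxattr(argv[1], "trusted.link", buf, sizeof(buf));
                    if (len < 0) {
                            perror("lgetxattr(trusted.link)");
                            return 1;
                    }
                    for (i = 0; i < len; i++)
                            printf("%02x%c", buf[i], (i % 16 == 15) ? '\n' : ' ');
                    printf("\n");
                    return 0;
            }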

            Have you tried creating hard links to a directory with ZFS? This should be caught by the client VFS, and also by ldiskfs, but I'm wondering whether ldiskfs implements such a check while ZFS only does it in the ZPL, which osd-zfs does not go through?

            I haven't yet figured out where it's checked, but neither ZPL nor our lustre 2.10.8 backed by ZFS 0.7 allowed hard linking to a directory via link(3) when I tried it. In both cases link() failed and errno was set to EPERM, as you saw with your test. But there was nothing exciting going on while I tried that, like many processes in parallel, or a failover, etc.

            bash-4.2$ ll -d existing newlink
            ls: cannot access newlink: No such file or directory
            drwx------ 2 faaland1 faaland1 33280 Jul 27 16:50 existing
            
            bash-4.2$ strace -e link ./dolink existing newlink;
            link("existing", "newlink")             = -1 EPERM (Operation not permitted)
            +++ exited with 255 +++
            

            and this generates an RPC to the MDS, so it seems possible that if some user binary was calling link(3) itself instead of ln(1) it might trigger this?

            Yes, maybe. I hope we're able to find out how these directories were created.
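
            The "dolink" binary used in the test above is not attached to the ticket; below is a minimal sketch (a hypothetical reconstruction) of such a wrapper around link(2) that would produce the strace output shown.

            /* dolink.c - hypothetical reconstruction of the test helper above;
             * the real source is not attached to the ticket.  It simply calls
             * link(2) on its two arguments. */
            #include <stdio.h>
            #include <unistd.h>

            int main(int argc, char *argv[])
            {
                    if (argc != 3) {
                            fprintf(stderr, "usage: %s existing newlink\n", argv[0]);
                            return 2;
                    }
                    if (link(argv[1], argv[2]) != 0) {
                            perror("link");
                            return -1;      /* exits with status 255, as in the strace output */
                    }
                    return 0;
            }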

            pjones Peter Jones made changes -
            Link New: This issue is related to JFC-21 [ JFC-21 ]
            ofaaland Olaf Faaland made changes -
            Description Original: (the Description above, without the final "one subdirectory" listing) New: (the Description above, with the "one subdirectory" listing appended)

            People

              Assignee: adilger Andreas Dilger
              Reporter: ofaaland Olaf Faaland
              Votes: 0
              Watchers: 4
