mdt_reint_open() @@@ OPEN & CREAT not in open replay

    Description

      We occasionally see the message in the summary show up in the MDS console log during server recovery. What might cause this?

          Activity

            [LU-1353] mdt_reint_open() @@@ OPEN & CREAT not in open replay
            adilger Andreas Dilger made changes -
            Resolution New: Cannot Reproduce [ 5 ]
            Status Original: In Progress [ 3 ] New: Resolved [ 5 ]

            adilger Andreas Dilger added a comment - Close old ticket.
            pjones Peter Jones made changes -
            End date New: 27/Jun/14
            Start date New: 27/Apr/12
            morrone Christopher Morrone (Inactive) made changes -
            Labels New: llnl
            morrone Christopher Morrone (Inactive) made changes -
            Labels Original: sequoia
            prakash Prakash Surya (Inactive) made changes -
            Link New: This issue is related to LU-2574 [ LU-2574 ]
            laisiyao Lai Siyao added a comment -

            The code looks correct in all of the open and unlink handling. We need to find a simple way to reproduce this so that we can collect more debug logs with 'vfstrace,inode' enabled on both the MDS and the clients.
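
A minimal sketch of how the extra debug flags mentioned above might be enabled and the log collected, assuming the standard `lctl set_param debug=+<flag>` and `lctl dk <file>` commands are available on the node. The `run` helper, the `DRY_RUN` switch, and the output path are hypothetical and exist only for illustration; by default the script prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: add the 'vfstrace' and 'inode' flags to the Lustre debug mask,
# then dump the kernel debug log after reproducing the problem.
# DRY_RUN=1 (the default here, a hypothetical switch) prints commands only.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_debug() {
    # '+flag' adds a flag to the current mask instead of replacing the mask.
    run lctl set_param debug=+vfstrace
    run lctl set_param debug=+inode
}

dump_debug() {
    # $1: output file for the debug log (path chosen by the caller)
    run lctl dk "$1"
}

enable_debug
dump_debug /tmp/lu-1353-debug.log
```

The same steps would be run on both the MDS and the clients, with the dump taken as soon as possible after the message appears.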

            pjones Peter Jones made changes -
            Labels Original: topsequoia New: sequoia

            prakash Prakash Surya (Inactive) added a comment -

            Here you go:

            # grove513 /root > stat /p/lstest/.lustre/fid/[0x2000182dc:0x1f2d1:0x0]
            stat: cannot stat `/p/lstest/.lustre/fid/[0x2000182dc:0x1f2d1:0x0]': No such file or directory
            
            # grove-mds2 /mnt/snap > ls -Rlan PENDING/
            PENDING/:
            total 29
            drwxr-xr-x 2     0     0 2 Dec 31  1969 .
            drwxr-xr-x 2     0     0 2 Dec 31  1969 .
            drwxr-xr-x 4     0     0 2 Sep 28 15:39 ..
            drwx------ 2 37693 37693 2 Nov 28 18:21 0000000200018ee8:0000000f:00000000: 0
            
            PENDING/0000000200018ee8:0000000f:00000000: 0:
            total 16
            drwx------ 2 37693 37693 2 Nov 28 18:21 .
            drwx------ 2 37693 37693 2 Nov 28 18:21 .
            drwxr-xr-x 2     0     0 2 Dec 31  1969 ..

            Would it still be around a few days later?

            "BTW, do you know how simul was running during this MDS recovery? Is there any hint as to whether these not-found files had been unlinked?"

            I'm not sure off the top of my head.
            laisiyao Lai Siyao added a comment -

            Could you help verify that the FID reported as not found really doesn't exist on the MDS?

            First, you can use `stat <mountpoint>/.lustre/fid/[0x2000182dc:0x1f2d1:0x0]` to verify whether the FID exists.

            If the FID exists, you can then use `lfs fid2path <fsname> [0x2000182dc:0x1f2d1:0x0]` to check its path.

            Otherwise, you can mount the MDS filesystem as ldiskfs and check <mountpoint>/PENDING. Is it empty?

            BTW, do you know how simul was running during this MDS recovery? Is there any hint as to whether these not-found files had been unlinked?
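
The checks described above can be sketched as a small script. This is only an illustration: `fid_path` and `check_fid` are hypothetical helper names, and the mount point and filesystem name in the usage comment are assumptions based on the paths seen elsewhere in this ticket. Only `stat` and `lfs fid2path` are taken from the comment itself.

```shell
#!/bin/sh
# Sketch only: check whether a FID still exists, and resolve its path if so.

# fid_path MOUNTPOINT FID -> path of the FID under the .lustre/fid namespace
fid_path() {
    printf '%s/.lustre/fid/%s\n' "$1" "$2"
}

# check_fid MOUNTPOINT FSNAME FID
# Returns 0 and prints the path if the FID exists, 1 otherwise.
check_fid() {
    mnt=$1
    fsname=$2
    fid=$3
    if stat "$(fid_path "$mnt" "$fid")" >/dev/null 2>&1; then
        # The FID exists, so ask Lustre for its path.
        lfs fid2path "$fsname" "$fid"
    else
        # The FID is gone; the next step is to look at PENDING on the MDS.
        echo "FID $fid not found; check <mountpoint>/PENDING on the MDS"
        return 1
    fi
}

# Example (assumed mount point and fsname):
#   check_fid /p/lstest lstest '[0x2000182dc:0x1f2d1:0x0]'
```

The FID is quoted in the example so that the shell does not treat the square brackets as a glob pattern.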


            People

              laisiyao Lai Siyao
              nedbass Ned Bass (Inactive)