Lustre / LU-3696

sanity test_17m, test_17n: e2fsck unattached inodes failure

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.4
    • Affects Version/s: Lustre 2.5.0, Lustre 2.6.0

    Description

      I'm trying to find which test/environment/circumstances fill an OST during autotest. I ran sanity three times in a row on Toro: https://maloo.whamcloud.com/test_sessions/90d23e6c-fbe4-11e2-aaad-52540035b04c . I didn't hit the full-OST problem, but I did run into sanity test 17m failures.

      On the second and, not surprisingly, third run of sanity, test 17m failed with:
      sanity test_17m: @@@@@@ FAIL: e2fsck should not report error upon short/long symlink MDT: rc=4
      (An e2fsck exit status of 4 means file system errors were left uncorrected; see e2fsck(8).)

      The first, successful run of test 17m has the following output:

      01:55:18:stop and checking mds1: e2fsck -fnvd /dev/lvm-MDS/P1
      01:55:18:CMD: client-24vm3 grep -c /mnt/mds1' ' /proc/mounts
      01:55:18:Stopping /mnt/mds1 (opts:-f) on client-24vm3
      01:55:18:CMD: client-24vm3 umount -d -f /mnt/mds1
      01:55:18:CMD: client-24vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      01:55:18:CMD: client-24vm3 e2fsck -fnvd /dev/lvm-MDS/P1
      01:55:18:client-24vm3: e2fsck 1.42.7.wc1 (12-Apr-2013)
      01:55:18:Pass 1: Checking inodes, blocks, and sizes
      01:55:18:Pass 2: Checking directory structure
      01:55:18:Pass 3: Checking directory connectivity
      01:55:18:Pass 4: Checking reference counts
      01:55:18:Pass 5: Checking group summary information
      01:55:18:
      01:55:18:        1324 inodes used (0.13%, out of 1048576)
      01:55:18:           7 non-contiguous files (0.5%)
      01:55:18:           1 non-contiguous directory (0.1%)
      01:55:18:             # of inodes with ind/dind/tind blocks: 2/0/0
      01:55:18:      154573 blocks used (29.48%, out of 524288)
      01:55:18:           0 bad blocks
      01:55:18:           1 large file
      01:55:18:
      01:55:18:         127 regular files
      01:55:18:         137 directories
      01:55:18:           0 character device files
      01:55:18:           0 block device files
      01:55:18:           0 fifos
      01:55:18:           0 links
      01:55:18:        1051 symbolic links (526 fast symbolic links)
      01:55:18:           0 sockets
      01:55:18:------------
      01:55:18:        1315 files
      

      The second run of test 17m has the following output:

      == sanity test 17m: run e2fsck against MDT which contains short/long symlink == 04:23:23 (1375442603)
      CMD: client-24vm3 /usr/sbin/lctl get_param -n version
      CMD: client-24vm3 /usr/sbin/lctl get_param -n version
      create 512 short and long symlink files under /mnt/lustre/d0.sanity/d17m
      erase them
      Waiting for local destroys to complete
      recreate the 512 symlink files with a shorter string
      stop and checking mds1: e2fsck -fnvd /dev/lvm-MDS/P1
      CMD: client-24vm3 grep -c /mnt/mds1' ' /proc/mounts
      Stopping /mnt/mds1 (opts:-f) on client-24vm3
      CMD: client-24vm3 umount -d -f /mnt/mds1
      CMD: client-24vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: client-24vm3 e2fsck -fnvd /dev/lvm-MDS/P1
      client-24vm3: e2fsck 1.42.7.wc1 (12-Apr-2013)
      client-24vm3: e2fsck_pass1:1500: increase inode 32773 badness 0 to 2
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Unattached inode 635
      Connect to /lost+found? no
      
      Unattached inode 636
      Connect to /lost+found? no
      
      Unattached inode 638
      Connect to /lost+found? no
      
      Unattached inode 639
      Connect to /lost+found? no
      
      Unattached inode 641
      Connect to /lost+found? no
      
      Unattached inode 645
      Connect to /lost+found? no
      
      Unattached inode 1841
      Connect to /lost+found? no
      
      Unattached inode 1842
      Connect to /lost+found? no
      
      Unattached inode 1843
      Connect to /lost+found? no
      
      Unattached inode 1844
      Connect to /lost+found? no
      
      Unattached inode 1845
      Connect to /lost+found? no
      
      Unattached inode 1846
      Connect to /lost+found? no
      
      Unattached inode 1847
      Connect to /lost+found? no
      
      Unattached inode 1848
      Connect to /lost+found? no
      
      Unattached inode 1849
      Connect to /lost+found? no
      
      Unattached inode 1850
      Connect to /lost+found? no
      
      Unattached inode 1851
      Connect to /lost+found? no
      
      Unattached inode 1852
      Connect to /lost+found? no
      
      Unattached inode 1855
      Connect to /lost+found? no
      
      Unattached inode 1894
      Connect to /lost+found? no
      
      Unattached inode 1895
      Connect to /lost+found? no
      
      Unattached inode 1896
      Connect to /lost+found? no
      
      Unattached inode 1897
      Connect to /lost+found? no
      
      Unattached inode 1898
      Connect to /lost+found? no
      
      Unattached inode 1899
      Connect to /lost+found? no
      
      Unattached inode 1900
      Connect to /lost+found? no
      
      Unattached inode 1901
      Connect to /lost+found? no
      
      Unattached inode 1902
      Connect to /lost+found? no
      
      Unattached inode 1903
      Connect to /lost+found? no
      
      Unattached inode 1904
      Connect to /lost+found? no
      
      Unattached inode 1905
      Connect to /lost+found? no
      
      Unattached inode 1908
      Connect to /lost+found? no
      
      Pass 5: Checking group summary information
      
      lustre-MDT0000: ********** WARNING: Filesystem still has errors **********
      
      
              1396 inodes used (0.13%, out of 1048576)
                40 non-contiguous files (2.9%)
                 2 non-contiguous directories (0.1%)
                   # of inodes with ind/dind/tind blocks: 18/2/0
            158490 blocks used (30.23%, out of 524288)
                 0 bad blocks
                 1 large file
      
               197 regular files
               139 directories
                 0 character device files
                 0 block device files
                 0 fifos
                 0 links
              1051 symbolic links (526 fast symbolic links)
                 0 sockets
      ------------
              1355 files
      

      Because the same test names were run in one test session, it looks like Maloo is conflating the output of one run with another, which makes the logs a little confusing to read. The time stamps also seem to be later than when the results were reported. Hopefully, I'm just misreading the logs and time stamps.


          Activity


            I posted a prototype patch at http://review.whamcloud.com/10179 that is at least an attempt at fixing this. I didn't test it, so it likely needs some help before it can land.

            adilger Andreas Dilger added a comment
            jamesanunez James Nunez (Inactive) added a comment - - edited

            I just checked the current master and, yes, I can still trigger this problem.

            Just running multiop to create a volatile file will create an unattached inode. For example, running the following from sanity test 185 will create an unattached inode:

            # ./multiop /lustre/scratch/vfile_test VFw4096c
            

            OK Andreas, I agree this seems appropriate.
            But are you or James able to reproduce this with a recent master? I am asking because I just tried/checked as part of the LU-4140 work, but I am unable to get any orphan inodes as I was able to before with migrate/release+restore operations ...

            bfaccini Bruno Faccini (Inactive) added a comment

            James, I don't think that the orphan inode problem is caused by the presence or absence of a changelog record for the volatile file. It seems to me that this problem could be fixed entirely independently of the changelog issue. I think this is particularly important, since any file migration using "lfs migrate" has been leaking space in the filesystem since 2.4.0 that can only be reclaimed by an offline e2fsck of the MDT to link the files into /lost+found, then manually mounting the filesystem to move the files from /lost+found to e.g. /ROOT/lost+found or similar and deleting them.

            I suspect that there is some simple fix here like decrementing the nlink count on the volatile file after open so that it is properly cleaned up at close time. Then, the changelog issue in LU-4140 can be fixed independently.
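            The cleanup Andreas describes is essentially the classic POSIX open-then-unlink lifecycle. A minimal sketch of that lifecycle follows (plain POSIX via Python, not multiop or MDT code; the scratch path is made up):

```python
import os
import tempfile

# Not Lustre code: a plain POSIX illustration of the volatile-file
# lifecycle. Unlinking an open file drops its nlink to 0; the inode stays
# allocated only while a descriptor references it, and the kernel reclaims
# it at close. An inode stuck in this state across a crash is exactly what
# e2fsck reports as an unattached inode.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "vfile_test")   # hypothetical scratch path

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
os.unlink(path)               # nlink -> 0, but the inode is still live
os.write(fd, b"x" * 4096)     # still fully usable while open
print(os.fstat(fd).st_nlink)  # 0: an orphan-in-waiting
os.close(fd)                  # inode freed here on a healthy filesystem
```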

            adilger Andreas Dilger added a comment
            jamesanunez James Nunez (Inactive) added a comment - - edited

            Closing as a duplicate of LU-4140. We can reopen this ticket if LU-4140 does not solve this problem.


            What's common to all of these tests is that they create volatile files. This is related to or is a case of LU-4140.

            jamesanunez James Nunez (Inactive) added a comment

            For sanity tests 185 and 187b, the calls to multiop with the V option, which opens a volatile file, are causing the remaining unattached inodes.

            jamesanunez James Nunez (Inactive) added a comment

            For sanity test 56x, I've confirmed that the migration, a call to "lfs migrate", from a two-stripe file to a single-stripe file creates one unattached inode. I suspect the same is true for test 56w, since it calls lfs migrate on a file and a directory.

            Also, migrating from a single-stripe file to a file with two stripes creates an unattached inode.

            jamesanunez James Nunez (Inactive) added a comment

            I started with a fresh file system and ran sanity. After sanity completed, there were 16 unattached inodes found by e2fsck. The offending tests are 56w, 56x, 185, and 187b.

            jamesanunez James Nunez (Inactive) added a comment

            If you add something like:

            lctl get_param seq.*.fid
            

            after each test in run_one() you will know which test generated the FIDs for the orphan files. Just comparing the FID sequence numbers, the test_17m sequence is (0x2000061c0 - 0x2000000400) = 24000 higher than the starting sequence, while the orphan sequence is (0x200001b71 - 0x2000000400) = 6001 higher, and by linear extrapolation this puts it 25% of the way to test 17m, which is around test_5 (or test_4 which is the likely candidate).

            It is a bit confusing that the client would be getting a new sequence so often, since that should only happen every 128000 files or if the client is remounted, which I don't think should be the case? In any case, you could also try running:

            ONLY=4 sh sanity.sh
            

            on an already-mounted filesystem and then run e2fsck -f on the MDS to check if this test is causing the leak.

            adilger Andreas Dilger added a comment

            I looked at all the files in the Lustre file system and, on the client, got the FID for each file:

            # lfs path2fid /lustre/lscratch
            [0x200000007:0x1:0x0]
            
            # lfs path2fid /lustre/lscratch/d17m.sanitym
            [0x2000061c0:0xc09:0x0]
            
            # lfs path2fid /lustre/lscratch/d17m.sanitym/short-*
            /lustre/lscratch/d17m.sanitym/short-0: [0x2000061c0:0xe0b:0x0]
            /lustre/lscratch/d17m.sanitym/short-1: [0x2000061c0:0xe0d:0x0]
            /lustre/lscratch/d17m.sanitym/short-2: [0x2000061c0:0xe0f:0x0]
            …
            /lustre/lscratch/d17m.sanitym/short-510: [0x2000061c0:0x1207:0x0]
            /lustre/lscratch/d17m.sanitym/short-511: [0x2000061c0:0x1209:0x0]
            
            # lfs path2fid /lustre/lscratch/d17m.sanitym/long-* 
            /lustre/lscratch/d17m.sanitym/long-0: [0x2000061c0:0xe0a:0x0]
            /lustre/lscratch/d17m.sanitym/long-1: [0x2000061c0:0xe0c:0x0]
            /lustre/lscratch/d17m.sanitym/long-2: [0x2000061c0:0xe0e:0x0]
            …
            /lustre/lscratch/d17m.sanitym/long-510: [0x2000061c0:0x1206:0x0]
            /lustre/lscratch/d17m.sanitym/long-511: [0x2000061c0:0x1208:0x0]
            

            Those are all the files and directories in the file system.

            So the FIDs for the Unattached inodes found by running e2fsck do not belong to files in the file system. Is this the correct interpretation? If so, maybe those inodes are left over from the first run of sanity that didn't get cleaned up?
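            The sequence comparison can be made mechanical with a small helper (not part of lfs; written for this ticket) that parses the bracketed [sequence:object-id:version] strings that lfs path2fid prints:

```python
import re

def parse_fid(text):
    """Parse a FID string like '[0x2000061c0:0xc09:0x0]' into a
    (sequence, object_id, version) tuple of ints."""
    m = re.fullmatch(r"\[(0x[0-9a-f]+):(0x[0-9a-f]+):(0x[0-9a-f]+)\]",
                     text.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError("not a FID: %r" % text)
    return tuple(int(g, 16) for g in m.groups())

seq, oid, ver = parse_fid("[0x2000061c0:0xc09:0x0]")  # d17m.sanitym's FID
print(hex(seq))  # 0x2000061c0

# Every file listed above uses sequence 0x2000061c0 (or 0x200000007 for
# the root), so an orphan FID in sequence 0x200001b71 cannot belong to
# any of them.
print(seq == 0x200001b71)  # False
```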

            jamesanunez James Nunez (Inactive) added a comment

            People

              adilger Andreas Dilger
              jamesanunez James Nunez (Inactive)