Lustre / LU-1540

e2fsck removes too many symlinks


Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.2.0, Lustre 2.3.0, Lustre 2.1.1, Lustre 2.1.2, Lustre 2.1.3
    • None
    • 3
    • 4418

    Description

      Each time we run fsck on an MDS, we lose thousands of symlinks (which is a kind of corruption).
      To find out what was wrong with those inodes, I added some instrumentation to e2fsck_pass1_check_symlink to print various information about the inode (the modifications are attached).

      Here is what I got:

      [root@gaia14 ~]# ./e2fsck -n -f -v -d /dev/mapper/vg_mdt_work1-mdt_work1 | grep -B2 -A1 "at pass1.c:246"
      e2fsck 1.41.90.wc4 (01-Sep-2011)
      blocks != 0 && fs->blocksize = 4096, buf = %/home/cont001/segura/BIN/ELSA/CHAINE_V2/MODULES_PYTHON/GENERIQUES/Block.py_oldaux.py%
      len = 84, inode->i_size = 78
      at pass1.c:246 : offending inode 16819164 found !
      e2fsck_pass1:1416: increase inode 16819164 badness 0 to 1
      

      Note the content of buf (the result of io_channel_read_blk64), and especially the trailing characters 'aux.py' (the % signs come from the printf format). The reported len of 84 is inode->i_size (78) plus those 6 extra bytes.
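The mismatch between len and inode->i_size can be reproduced outside e2fsck. The following is a minimal sketch, not code from e2fsprogs: build_block and scanned_len are hypothetical helpers, and the block layout (target bytes followed directly by leftover bytes, with no '\0' at offset i_size) is an assumption drawn from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCKSIZE 4096  /* fs->blocksize in the report */

/* Target path from the report; its length is 78, matching inode->i_size. */
static const char target[] =
    "/home/cont001/segura/BIN/ELSA/CHAINE_V2/MODULES_PYTHON/GENERIQUES/Block.py_old";

/*
 * Hypothetical helper: fill a block the way the report suggests it looks on
 * disk. The first i_size bytes hold the target, and leftover bytes follow
 * immediately, with no '\0' written in between. Returns i_size.
 */
static size_t build_block(char *buf, const char *stale)
{
    size_t i_size = strlen(target);

    assert(i_size + strlen(stale) < BLOCKSIZE);
    memset(buf, 0, BLOCKSIZE);
    memcpy(buf, target, i_size);                 /* no terminator written */
    memcpy(buf + i_size, stale, strlen(stale));  /* leftover bytes */
    return i_size;
}

/*
 * Length as a NUL-based scan sees it, bounded by the block size.
 * Equivalent to strnlen(buf, BLOCKSIZE), written with memchr so the
 * sketch only needs standard C.
 */
static size_t scanned_len(const char *buf)
{
    const char *nul = memchr(buf, '\0', BLOCKSIZE);

    return nul ? (size_t)(nul - buf) : BLOCKSIZE;
}
```

Calling build_block(buf, "aux.py") on a BLOCKSIZE-byte buffer and then scanned_len(buf) gives 84 for an i_size of 78, the same pair of values printed by the instrumented e2fsck.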

      Looking at this inode with debugfs returns:

      debugfs: stat <16819164>
      Inode: 16819164 Type: symlink Mode: 0777 Flag
      Generation: 1029089099 Version: 0x0000004e:1a78c70
      User: 11876 Group: 1850 Size: 78
      File ACL: 0 Directory ACL: 0
      Links: 1 Blockcount: 8
      Fragment: Address: 0 Number: 0 Size: 0
       ctime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
       atime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
       mtime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
      crtime: 0x4fd1a998:e6373924 -- Fri Jun 8 09:28:24 20
      Size of extra inode fields: 28
      Extended attributes stored in inode body:
        lma = "00 00 00 00 00 00 00 00 56 fa 1b 0d 02 00 00
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        link = "df f1 ea 11 01 00 00 00 36 00 00 00 00 00 0
       5c 00 00 00 00 42 6c 6f 63 6b 2e 70 79 5f 6f 6c 64 "
      BLOCKS:
      (0):17832933
      TOTAL: 1
      
      debugfs: cat <16819164>
      /home/cont001/segura/BIN/ELSA/CHAINE_V2/MODULES_PYTHON/GENERIQUES/Block.py_old
      

      The string is the same except that the trailing 'aux.py' is not displayed, and the target file is the expected one.

      Accessing the filesystem through a mount does not show any issue. Accessing the symlink works and its content is valid (the link has the expected length of inode->i_size).
      The buffer retrieved in e2fsck looks wrong: the end of the string (starting at offset inode->i_size) always contains garbage.

      So who is wrong? Is e2fsprogs right to assume that a symlink is always terminated by a '\0', which is what lets strnlen return the right length? Or is ldiskfs wrong in not enforcing a '\0' at offset inode->i_size? Or is it something else?
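To make the two readings in that question concrete, here is a hedged sketch of both as standalone checks. Neither function is taken from e2fsprogs or ldiskfs; the names symlink_len_strict and symlink_len_tolerant are hypothetical, and the sketch does not say which side should change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Strict reading: the first '\0' in the block must sit exactly at i_size.
 * A block with leftover bytes after the target fails this check.
 */
static bool symlink_len_strict(const char *buf, size_t blocksize, size_t i_size)
{
    const char *nul = memchr(buf, '\0', blocksize);
    size_t len = nul ? (size_t)(nul - buf) : blocksize;

    return len == i_size;
}

/*
 * Tolerant reading: trust i_size and only require that no '\0' appears
 * inside the first i_size bytes. Bytes past i_size are ignored.
 */
static bool symlink_len_tolerant(const char *buf, size_t blocksize, size_t i_size)
{
    if (i_size == 0 || i_size >= blocksize)
        return false;
    return memchr(buf, '\0', i_size) == NULL;
}
```

For the inode above (i_size = 78, leftover 'aux.py' at offset 78), the strict check rejects the symlink while the tolerant one accepts it. If instead the writer always stored a '\0' at offset i_size, both checks would agree.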

      Attachments

        1. gen.sh
          1 kB
        2. log.txt.gz
          6 kB
        3. src.txt
          3 kB


            People

              bobijam Zhenyu Xu
              louveta Alexandre Louvet (Inactive)