Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.2.0, Lustre 2.3.0, Lustre 2.1.1, Lustre 2.1.2, Lustre 2.1.3
    • None
    • 3
    • 4418

    Description

      Each time we run fsck on an MDS, we lose thousands of symlinks (which is a kind of corruption).
      To get some idea of what was wrong with those inodes, I added a little instrumentation to e2fsck_pass1_check_symlink to print various information about the inode (the modifications are attached).

      Here is what I got:

      [root@gaia14 ~]# ./e2fsck -n -f -v -d /dev/mapper/vg_mdt_work1-mdt_work1 | grep -B2 -A1 "at pass1.c:246"
      e2fsck 1.41.90.wc4 (01-Sep-2011)
      blocks != 0 && fs->blocksize = 4096, buf = %/home/cont001/segura/BIN/ELSA/CHAINE_V2/MODULES_PYTHON/GENERIQUES/Block.py_oldaux.py%
      len = 84, inode->i_size = 78
      at pass1.c:246 : offending inode 16819164 found !
      e2fsck_pass1:1416: increase inode 16819164 badness 0 to 1
      

      Note the content of buf (the result of io_channel_read_blk64), especially the trailing characters 'aux.py' (the % characters come from the printf format).

      Looking at this inode with debugfs returns:

      debugfs: stat <16819164>
      Inode: 16819164 Type: symlink Mode: 0777 Flag
      Generation: 1029089099 Version: 0x0000004e:1a78c70
      User: 11876 Group: 1850 Size: 78
      File ACL: 0 Directory ACL: 0
      Links: 1 Blockcount: 8
      Fragment: Address: 0 Number: 0 Size: 0
       ctime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
       atime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
       mtime: 0x4fd1a998:00000000 -- Fri Jun 8 09:28:24 20
      crtime: 0x4fd1a998:e6373924 -- Fri Jun 8 09:28:24 20
      Size of extra inode fields: 28
      Extended attributes stored in inode body:
        lma = "00 00 00 00 00 00 00 00 56 fa 1b 0d 02 00 00
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        link = "df f1 ea 11 01 00 00 00 36 00 00 00 00 00 0
       5c 00 00 00 00 42 6c 6f 63 6b 2e 70 79 5f 6f 6c 64 "
      BLOCKS:
      (0):17832933
      TOTAL: 1
      
      debugfs: cat <16819164>
      /home/cont001/segura/BIN/ELSA/CHAINE_V2/MODULES_PYTHON/GENERIQUES/Block.py_old
      

      The string is the same except that it does not include the trailing 'aux.py', and the target file is the expected one.

      Accessing the filesystem through a mount shows no problem. Following the symlink works and its content is valid (the expected link target has a length of inode->i_size).
      The buffer retrieved in e2fsck looks wrong: the end of the string (starting at offset inode->i_size) always contains garbage.
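
      The mismatch can be reproduced outside e2fsck with a minimal sketch. It assumes the check that fired at pass1.c:246 compares a NUL-bounded scan of the block against inode->i_size; the helper names bounded_len, strict_symlink_ok and fill_block are hypothetical and are not taken from e2fsprogs.

```c
#include <stddef.h>
#include <string.h>

/* Length of the string in buf, scanning at most blocksize bytes.
 * Behaves like strnlen(buf, blocksize); written with memchr so it
 * only depends on standard C. */
static size_t bounded_len(const char *buf, size_t blocksize)
{
    const char *nul = memchr(buf, '\0', blocksize);
    return nul ? (size_t)(nul - buf) : blocksize;
}

/* Hypothetical stand-in for the instrumented check: the symlink is
 * accepted only if the scanned length equals the size recorded in
 * the inode. */
static int strict_symlink_ok(const char *buf, size_t blocksize, size_t i_size)
{
    return bounded_len(buf, blocksize) == i_size;
}

/* Build a block that holds the link target immediately followed by
 * stale bytes and then zeros, which is the layout seen in buf.
 * Returns the target length, i.e. what i_size would be. */
static size_t fill_block(char *block, size_t blocksize,
                         const char *target, const char *stale)
{
    size_t i_size = strlen(target);

    memset(block, 0, blocksize);
    memcpy(block, target, i_size);
    memcpy(block + i_size, stale, strlen(stale));
    return i_size;
}
```

      With a 4096-byte block, the 78-byte target shown by debugfs and the six stale bytes aux.py, bounded_len returns 84 while i_size is 78, so strict_symlink_ok returns 0 and the inode is reported as bad, matching the "len = 84, inode->i_size = 78" line above.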

      So who is wrong? Is it correct for e2fsprogs to assume that a symlink is always terminated by a '\0', which is what lets strnlen return the right length? Or is ldiskfs wrong in not enforcing a '\0' at position inode->i_size? Or is it something else?
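
      To make the two options concrete, here is a rough sketch of each side. Both functions are hypothetical illustrations of the question, not the code of e2fsprogs or ldiskfs and not necessarily the fix that was applied: tolerant_symlink_ok trusts inode->i_size on the checker side, and store_symlink zero-fills the rest of the block on the writer side.

```c
#include <stddef.h>
#include <string.h>

/* Checker side (hypothetical): trust i_size and only require that the
 * first i_size bytes contain no NUL. Bytes past i_size are ignored, so
 * stale data after the target does not matter. */
static int tolerant_symlink_ok(const char *buf, size_t blocksize, size_t i_size)
{
    if (i_size == 0 || i_size >= blocksize)
        return 0;
    return memchr(buf, '\0', i_size) == NULL;
}

/* Writer side (hypothetical): copy the target into the block and zero
 * everything after it, so a reader that scans for NUL stops exactly at
 * i_size. Returns the stored length, or 0 if the target does not fit. */
static size_t store_symlink(char *block, size_t blocksize, const char *target)
{
    size_t i_size = strlen(target);

    if (i_size == 0 || i_size >= blocksize)
        return 0;
    memcpy(block, target, i_size);
    memset(block + i_size, 0, blocksize - i_size);
    return i_size;
}
```

      Either change alone would make the length seen by the checker agree with inode->i_size for a block like the one above; which side should change is exactly the open question.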

      Attachments

        1. gen.sh
          1 kB
        2. log.txt.gz
          6 kB
        3. src.txt
          3 kB

        Issue Links

          Activity

            [LU-1540] e2fsck remove too many symlinks
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-16060 [ LU-16060 ]
            jlevi Jodi Levi (Inactive) made changes -
            Fix Version/s New: Lustre 2.4.0 [ 10154 ]
            adilger Andreas Dilger made changes -
            Affects Version/s New: Lustre 2.1.2 [ 10111 ]
            bobijam Zhenyu Xu made changes -
            Fix Version/s New: Lustre 2.1.3 [ 10141 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-1774 [ LU-1774 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.3.0 [ 10117 ]
            Priority Original: Blocker [ 1 ] New: Major [ 3 ]
            adilger Andreas Dilger made changes -
            Affects Version/s New: Lustre 2.2.0 [ 10082 ]
            Affects Version/s New: Lustre 2.3.0 [ 10117 ]
            Affects Version/s New: Lustre 2.1.3 [ 10141 ]
            Priority Original: Major [ 3 ] New: Blocker [ 1 ]
            louveta Alexandre Louvet (Inactive) made changes -
            Attachment New: gen.sh [ 11643 ]
            Attachment New: log.txt.gz [ 11644 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-1366 [ LU-1366 ]
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Zhenyu Xu [ bobijam ]
            louveta Alexandre Louvet (Inactive) created issue -

            People

              bobijam Zhenyu Xu
              louveta Alexandre Louvet (Inactive)
              Votes:
              0
              Watchers:
              7
