[LU-1366] getting "dirdata length set incorrectly" running e2fsck

Details

    • Bug
    • Resolution: Won't Fix
    • Minor
    • Lustre 2.1.2
    • Lustre 2.1.1
    • DDN SFA10k - Dell R710 - TOSS2.0 OS release
    • 3
    • 4619

    Description

      After adding a network to the file system and adding the IP for the failover node to the MDS it wouldn't mount. (I later found that --param failnode= is no longer valid - much to my chagrin) I attempted to run fsck against the file system but it responded that the e2fsprogs was out of date for the file system so I ran fsck.ldiskfs. The fsck.ldiskfs found some bad inodes and corrected them but on a subsequent run with the -n option (done to make sure it was clean) I started seeing a flood of "dirdata length set incorrectly" messages. I stopped it and was able to mount the FS but later the FS spontaneously unmounted.

      What does this mean? Fortunately this file system is in pre-production and can be recreated (which is intended), but I'd like to know whether this was caused by running fsck.ldiskfs, since I did not see these messages on the first pass. The version of e2fsprogs (non-Redhat) is ldiskfsprogs-1.41.90.3chaos.wc3-0.ch5.x86_64. I have downloaded the wc4 version from the WC repo, installed it into a test image, and rebooted the node into that image. I was able to use e2fsck to check the FS and I am using the -fDy options, but the "dirdata length set incorrectly" message continues to stream and has been going for more than an hour.

      Any help would be appreciated.

          Activity

            morrone Christopher Morrone (Inactive) added a comment -

            Whoops, I needed an explicit "fetch --tags". Must have that remote configured wrong.

            adilger Andreas Dilger added a comment -

            The v1.42.3.wc1 tag is on the master-lustre branch.

            morrone Christopher Morrone (Inactive) added a comment -

            I see the v1.42.3-lustre branch, but not the 1.42.3.wc1 tag.

            adilger Andreas Dilger added a comment -

            The e2fsck fix for this is included in the rebased e2fsprogs-1.42.3.wc1 build, currently undergoing testing.

            adilger Andreas Dilger added a comment -

            The "Flags: 0x80000" line maps to EXT4_EXTENTS_FL, so in fact it seems this is being set/inherited incorrectly on the MDT fast symlinks. Note "Fast_link_dest: ../bin/passwd" indicates that the symlink is indeed stored inside the inode.

            My first guess is a defect in the osd-ldiskfs code that is unconditionally setting LDISKFS_EXTENTS_FL on all inodes, when this should only be set on regular files.
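            As an illustration of how the "Flags: 0x80000" value maps to EXT4_EXTENTS_FL, here is a minimal sketch that decodes an inode flags word. The table is only a small subset of the ext4 inode flag bits, and the helper name decode_inode_flags is hypothetical (it is not part of e2fsprogs or Lustre).

```python
# Small subset of ext4 inode flag bits; not an exhaustive table.
EXT4_INODE_FLAGS = {
    0x00000010: "IMMUTABLE_FL",
    0x00000020: "APPEND_FL",
    0x00000080: "NOATIME_FL",
    0x00001000: "INDEX_FL",
    0x00040000: "HUGE_FILE_FL",
    0x00080000: "EXTENTS_FL",
}


def decode_inode_flags(flags):
    """Return the names of the known flag bits set in `flags`.

    Bits that are not in the table above are reported as one hex string
    so that nothing is silently dropped.
    """
    names = [name for bit, name in sorted(EXT4_INODE_FLAGS.items()) if flags & bit]
    known_mask = 0
    for bit in EXT4_INODE_FLAGS:
        known_mask |= bit
    unknown = flags & ~known_mask
    if unknown:
        names.append(hex(unknown))
    return names


# The value shown by debugfs for both inodes in this ticket.
print(decode_inode_flags(0x80000))
```

            With the value from the debugfs output, only the extents bit is reported, which is consistent with the reading that the flag is set on the symlink inode.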
            jamervi Joe Mervini added a comment -

            I ran the test (mostly out of curiosity and for my own understanding). When I ran lsattr against a linked file I got operation not supported:

            [root@cmds1 bin2]# lsattr /mnt/ROOT/jamervi/bin/passwd
            ------------e /mnt/ROOT/jamervi/bin/passwd
            [root@cmds1 bin2]# lsattr /mnt/ROOT/jamervi/bin2/passwd
            lsattr: Operation not supported While reading flags on /mnt/ROOT/jamervi/bin2/passwd

            But when I ran debugfs (and I really don't know how to interpret the output) it appears to me that there are no extents associated with the symlink. At least none are explicitly called out. Am I interpreting this correctly?

            [root@cmds1 bin2]# debugfs /dev/mapper/3600c0ff00011bdb4b12c0b4f01000000
            debugfs 1.41.12 (17-May-2010)
            debugfs: cd /mnt/ROOT
            /mnt/ROOT: File not found by ext2_lookup
            debugfs: cd /mnt
            /mnt: File not found by ext2_lookup
            debugfs: cd ROOT
            debugfs: cd jamervi
            debugfs: cd bin
            debugfs: stat passwd
            Inode: 405079025 Type: regular Mode: 0755 Flags: 0x80000
            Generation: 3906041213 Version: 0x00000001:000011dd
            User: 0 Group: 0 Size: 0
            File ACL: 0 Directory ACL: 0
            Links: 1 Blockcount: 0
            Fragment: Address: 0 Number: 0 Size: 0
            ctime: 0x4fb68353:00000000 -- Fri May 18 11:13:55 2012
            atime: 0x4fb68353:00000000 -- Fri May 18 11:13:55 2012
            mtime: 0x4fb68353:00000000 -- Fri May 18 11:13:55 2012
            crtime: 0x4fb68353:ad6a06dc -- Fri May 18 11:13:55 2012
            Size of extra inode fields: 28
            Extended attributes stored in inode body:
            lma = "00 00 00 00 00 00 00 00 00 04 00 00 02 00 00 00 f0 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 " (64)
            link = "df f1 ea 11 01 00 00 00 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 00 00 00 02 00 00 04 00 00 00 00 02 00 00 00 00 70 61 73 73 77 64 " (48)
            lov = "d0 0b d1 0b 01 00 00 00 f0 08 00 00 00 00 00 00 00 04 00 00 02 00 00 00 00 00 10 00 01 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2e 00 00 00 " (56)
            EXTENTS:
            debugfs: cd ../bin2
            debugfs: stat passwd
            Inode: 405082862 Type: symlink Mode: 0777 Flags: 0x80000
            Generation: 3906045050 Version: 0x00000001:000026ee
            User: 0 Group: 0 Size: 13
            File ACL: 0 Directory ACL: 0
            Links: 1 Blockcount: 0
            Fragment: Address: 0 Number: 0 Size: 0
            ctime: 0x4fb68377:00000000 -- Fri May 18 11:14:31 2012
            atime: 0x4fb683cb:1cb0d110 -- Fri May 18 11:15:55 2012
            mtime: 0x4fb68377:00000000 -- Fri May 18 11:14:31 2012
            crtime: 0x4fb68377:6c55b22c -- Fri May 18 11:14:31 2012
            Size of extra inode fields: 28
            Extended attributes stored in inode body:
            lma = "00 00 00 00 00 00 00 00 00 04 00 00 02 00 00 00 ed 17 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 " (64)
            link = "df f1 ea 11 01 00 00 00 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 00 00 00 02 00 00 04 00 00 00 00 04 00 00 00 00 70 61 73 73 77 64 " (48)
            Fast_link_dest: ../bin/passwd
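
            For checking many inodes, the relevant fields of the debugfs "stat" output above can be picked out mechanically. The following is a rough sketch only: it assumes the output layout shown in this comment ("Type:", "Flags:" and an optional "Fast_link_dest:" line), and the function name fast_symlink_has_extents_flag is hypothetical.

```python
import re

EXT4_EXTENTS_FL = 0x80000  # the flag value discussed in this ticket


def fast_symlink_has_extents_flag(stat_text):
    """Return True if debugfs "stat" text describes a fast symlink
    (one with a Fast_link_dest line) whose flags include EXTENTS_FL."""
    type_match = re.search(r"Type:\s+(\S+)", stat_text)
    flags_match = re.search(r"Flags:\s+(0x[0-9a-fA-F]+)", stat_text)
    if not type_match or not flags_match:
        raise ValueError("no Type:/Flags: fields found in stat output")
    is_symlink = type_match.group(1) == "symlink"
    is_fast = "Fast_link_dest:" in stat_text
    flags = int(flags_match.group(1), 16)
    return is_symlink and is_fast and bool(flags & EXT4_EXTENTS_FL)


# Lines copied from the two "stat passwd" outputs above (abbreviated).
symlink_stat = (
    "Inode: 405082862 Type: symlink Mode: 0777 Flags: 0x80000\n"
    "User: 0 Group: 0 Size: 13\n"
    "Fast_link_dest: ../bin/passwd\n"
)
regular_stat = (
    "Inode: 405079025 Type: regular Mode: 0755 Flags: 0x80000\n"
    "User: 0 Group: 0 Size: 0\n"
    "EXTENTS:\n"
)

print(fast_symlink_has_extents_flag(symlink_stat))
print(fast_symlink_has_extents_flag(regular_stat))
```

            Only the symlink inode is flagged by this check; the regular file also carries the flag, but that is expected when extents are in use.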


            adilger Andreas Dilger added a comment -

            Sorry, I wasn't really using my terms consistently. The fast symlinks are those stored directly in the inode, while slow symlinks are stored in an external block. These correspond to short and long symlinks (the boundary being at 60 bytes).

            I think the issue may be that if the symlink is stored in the inode (fast symlink) but the EXTENTS flag is set, e2fsck may incorrectly interpret the symlink text as extent data and consider this a corrupt inode.

            To test this theory, an MDT filesystem with extents enabled should get some symlinks created, then be mounted as ldiskfs and have lsattr run on the symlinks to see if the extent flag is set. Alternately, debugfs "stat" can be used on the inodes to print the flags.
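            The fast/slow distinction can be sketched as a simple length check. This is an illustration only: the 60-byte boundary is the one quoted above, treating a target of exactly 60 bytes as slow is an assumption about which side of the boundary it falls on, and symlink_kind is a hypothetical helper name.

```python
FAST_SYMLINK_LIMIT = 60  # bytes; boundary quoted in the comment above


def symlink_kind(target):
    """Classify a symlink target as 'fast' (stored in the inode) or
    'slow' (stored in an external block) by its length in bytes.

    Assumption: targets shorter than the limit are fast; a target of
    exactly 60 bytes is treated as slow.
    """
    length = len(target.encode("utf-8"))
    return "fast" if length < FAST_SYMLINK_LIMIT else "slow"


# The symlink from the debugfs output has a 13-byte target ("Size: 13").
print(symlink_kind("../bin/passwd"))
print(symlink_kind("a" * 80))
```

            The 13-byte target from the debugfs output falls well under the boundary, which agrees with debugfs printing a Fast_link_dest line for it.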
            jamervi Joe Mervini added a comment -

            Not to detour from the subject of this ticket, but could you explain the difference between fast, short and long symlinks? I wanted to keep my ignorance on the down-low by checking the web and with several people here, but no one seems to know.


            adilger Andreas Dilger added a comment -

            So that explains why the "extent" option was set for the MDT filesystem. That said, with the patch in http://review.whamcloud.com/2798 it will explicitly unset the extents feature for the MDT filesystem to avoid this problem for new filesystems.

            We still need to understand/address the extents symlink problem. I see commits related to symlinks with extents (below), but it isn't clear whether the problem only applies to short symlinks, or long symlinks as well? Given that there are reports of many symlinks being deleted, I would suspect that the problem is with fast symlinks, and somehow the MDT is setting the "EXTENTS_FL" for symlinks, when it shouldn't be doing that.

            Author: Theodore Ts'o <tytso@mit.edu>
            Date:   Thu Mar 13 23:13:18 2008 -0400

                e2fsck: Check for fast symlinks that have EXTENTS_FL set

                These shouldn't show up in the wild, but if they do, e2fsck will offer
                to clear them.

                Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

            commit 7cadc57780f3e3e8e644e8976e11a336902d4a25
            Author: Theodore Ts'o <tytso@mit.edu>
            Date:   Thu Mar 13 23:05:00 2008 -0400

                e2fsck: Support long symlinks which use extents

                Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

            morrone Christopher Morrone (Inactive) added a comment -

            Ned pointed out to me that we are adding an "/etc/mkfs.ldiskfs.conf" file. Here is an excerpt:

            [fs_types]
                   ext3 = {
                           features = has_journal
                   }
                   ldiskfs = {
                           features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
                           auto_64-bit_support = 1
                           inode_size = 256
                   }
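
            To make the relevant detail in that excerpt explicit, here is a small sketch that pulls the feature list out of a stanza and checks for "extent". It is a naive line-based parse written for this excerpt only, not a general parser for the mke2fs.conf format, and parse_features is a hypothetical helper name.

```python
def parse_features(stanza):
    """Return the comma-separated values of the first 'features = ...'
    line in a config stanza, or an empty list if there is none."""
    for line in stanza.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "features":
            return [item.strip() for item in value.split(",") if item.strip()]
    return []


# The ldiskfs stanza quoted in the excerpt above.
LDISKFS_STANZA = """
ldiskfs = {
        features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
        auto_64-bit_support = 1
        inode_size = 256
}
"""

features = parse_features(LDISKFS_STANZA)
print("extent" in features)
```

            The check shows that "extent" is part of the default ldiskfs feature set in this file, so any filesystem created with these defaults, MDT included, would start with extents enabled unless the feature is explicitly turned off.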

            morrone Christopher Morrone (Inactive) added a comment - edited

            Thinking about this further, I think I understand the root cause. The standard mkfs_lustre.c will call "mke2fs {lots of options}", which starts with an ext2 filesystem and enables the individual features needed to make the filesystem ext4. For the MDT filesystem, it does not turn on the "extents" feature, but it does for the OST.

            In the TOSS ldiskfsprogs, I suspect that "mkfs.ldiskfs" starts with an "ext4" filesystem, and (re)sets the same options, but for the MDT it already has "extents" enabled.

            I don't think that we are modifying mkfs.lustre. We just configure lustre "--with-ldiskfsprogs", but that code is entirely in the upstream lustre.

            The ldiskfsprogs's mkfs.ldiskfs does not intentionally change the default filesystem type from ext2 to ext4. The patch that introduces the ldiskfsprogs changes is here:

            http://review.whamcloud.com/2582

            People

              bobijam Zhenyu Xu
              jamervi Joe Mervini
              Votes: 0
              Watchers: 9
