Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-7325

wrong integer type used for inode number in LargeEA patch

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.8.0
    • Lustre 2.7.0, Lustre 2.8.0, Lustre 2.5.4
    • None
    • MDT with inode count > 2G
    • 3
    • 9223372036854775807

    Description

      Hitting "LDISKFS-fs error (device md66): ldiskfs_xattr_inode_iget: error while reading EA inode -2147483347" on large MDT volumes with large_xattr feature enabled.

      Recreation:

      1. MDS should have more than 2G inodes
      2. mdt fs should be created with large_xattr flag.
      3. set inode_goal to get higher inode number allocated.
      # echo 2147483947 > /sys/fs/ldiskfs/md66/inode_goal
      3. create a file
      4. start adding hard links to that file

      eventually LdiskFS fails with:

      [16767.564754] LDISKFS-fs (md66): mounted filesystem with ordered data mode. quota=on. Opts: 
      [16773.660132] Lustre: 25226:0:(client.c:1934:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1445001285/real 1445001285]  req@ffff8810232b3080 x1515193580388392/
      t0(0) o8->pinkfs-OST0001-osc-MDT0000@172.18.56.133@o2ib:28/4 lens 400/544 e 0 to 1 dl 1445001290 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      [17278.157672] LDISKFS-fs error (device md66): ldiskfs_xattr_inode_iget: error while reading EA inode -2147483446
      [17278.169053] Aborting journal on device md66-8.
      [17278.187860] LDISKFS-fs (md66): Remounting filesystem read-only
      [17278.203840] LustreError: 28195:0:(osd_io.c:1690:osd_ldiskfs_write_record()) journal_get_write_access() returned error -30
      [17278.216264] LustreError: 28195:0:(osd_handler.c:1105:osd_trans_stop()) Failure in transaction hook: -30
      [17278.226941] LustreError: 28195:0:(osd_handler.c:1114:osd_trans_stop()) Failure to stop transaction: -30
      [17278.227001] LustreError: 28024:0:(osd_handler.c:914:osd_trans_commit_cb()) transaction @0xffff8807d0b97d80 commit error: 2
      [17280.394367] LDISKFS-fs warning (device md66): kmmpd: kmmpd being stopped since filesystem has been remounted as readonly.
      

      Attachments

        Issue Links

          Activity

            [LU-7325] wrong integer type used for inode number in LargeEA patch

            Landed for 2.8

            jgmitter Joseph Gmitter (Inactive) added a comment - Landed for 2.8

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16913/
            Subject: LU-7325 ldiskfs: use correct types for inode num
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d630e9f288e96d50791c4d0fd4414337c673d9ea

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16913/ Subject: LU-7325 ldiskfs: use correct types for inode num Project: fs/lustre-release Branch: master Current Patch Set: Commit: d630e9f288e96d50791c4d0fd4414337c673d9ea
            zam Alexander Zarochentsev added a comment - patch http://review.whamcloud.com/16913

            Alexander Zarochentsev (alexander_zarochentsev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/16913
            Subject: LU-7325 ldiskfs: use correct types for inode num
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3b0057136e7883d507845cef2a5c12732175fb78

            gerrit Gerrit Updater added a comment - Alexander Zarochentsev (alexander_zarochentsev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/16913 Subject: LU-7325 ldiskfs: use correct types for inode num Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 3b0057136e7883d507845cef2a5c12732175fb78

            Looks like this bug exists in several places: ext4_xattr_inode_get(), ext4_xattr_inode_iget(), ext4_xattr_inode_unlink(), ext4_xattr_inode_set(), and ext4_xattr_set_entry() (where it declares ea_ino before calling ext4_xattr_inode_set()).

            It isn't clear whether the right type is __u32 or unsigned long. All the on-disk ext4 structures use __u32 for inode numbers, but the VFS API uses unsigned long and ext4 might conceivably use a 64-bit value in the future. I would accept either one. There is also the xia_inodes[] array using "unsigned int" instead of __u32 or "unsigned long". That itself isn't a problem today, but it may as well be made consistent with the rest of the code, as I don't think the memory usage is a concern (at most the list of large xattr inodes for a single inode, typically only the difference between a 32-byte and a 64-byte allocation).

            We've made this a 2.8.0 blocker, so if it speeds things up you could just edit the ext4-large-eas.patch files in place to use __u32 or unsigned long for ea_ino (which is thankfully consistently named throughout the patch), and push the patch to Gerrit for build testing, rather than trying to round up all the different kernels to regenerate the patches. It appears that these functions are not referenced in other patches (even as context) so there is very little chance of this causing a compile failure, and you would find this out in an hour rather than spending a lot of time trying to track down and build each kernel.

            adilger Andreas Dilger added a comment - Looks like this bug exists in several places: ext4_xattr_inode_get() , ext4_xattr_inode_iget() , ext4_xattr_inode_unlink() , ext4_xattr_inode_set() , and ext4_xattr_set_entry() (where it declares ea_ino before calling ext4_xattr_inode_set()). It isn't clear whether the right type is __u32 or unsigned long. All the on-disk ext4 structures use __u32 for inode numbers, but the VFS API uses unsigned long and ext4 might conceivably use a 64-bit value in the future. I would accept either one. There is also the xia_inodes[] array using "unsigned int" instead of __u32 or "unsigned long". That itself isn't a problem today, but it may as well be made consistent with the rest of the code, as I don't think the memory usage is a concern (at most the list of large xattr inodes for a single inode, typically only the difference between a 32-byte and a 64-byte allocation). We've made this a 2.8.0 blocker, so if it speeds things up you could just edit the ext4-large-eas.patch files in place to use __u32 or unsigned long for ea_ino (which is thankfully consistently named throughout the patch), and push the patch to Gerrit for build testing, rather than trying to round up all the different kernels to regenerate the patches. It appears that these functions are not referenced in other patches (even as context) so there is very little chance of this causing a compile failure, and you would find this out in an hour rather than spending a lot of time trying to track down and build each kernel.

            The culprit was identified as a wrong integer type used in ext4-large-eas.patch :

            +struct inode *ext4_xattr_inode_iget(struct inode *parent, int ea_ino, int *err)
            +{
            +       struct inode *ea_inode = NULL;
            +
            +       ea_inode = ext4_iget(parent->i_sb, ea_ino);
            

            ext4_iget() is called with signed integer while ext4_iget() accepts unsigned long inode number.
            Type conversion takes place and the sign bit of singed int gets extended to high 32 bits of unsigned long integer.

            we have a patch but it needs some time to port to all Lustre 2.8 platforms.

            zam Alexander Zarochentsev added a comment - The culprit was identified as a wrong integer type used in ext4-large-eas.patch : +struct inode *ext4_xattr_inode_iget(struct inode *parent, int ea_ino, int *err) +{ + struct inode *ea_inode = NULL; + + ea_inode = ext4_iget(parent->i_sb, ea_ino); ext4_iget() is called with signed integer while ext4_iget() accepts unsigned long inode number. Type conversion takes place and the sign bit of singed int gets extended to high 32 bits of unsigned long integer. we have a patch but it needs some time to port to all Lustre 2.8 platforms.

            People

              adilger Andreas Dilger
              zam Alexander Zarochentsev
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: