Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-216

Fix fiemap locking issue in 2.6.32 kernels

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.2.0, Lustre 1.8.6
    • Lustre 2.1.0, Lustre 1.8.6
    • None
    • 3
    • 23,780
    • 5009

    Description

      The patch from Lustre Bugzilla 23780 attachment 31890 was added for RHEL6 support with git commit e550abd05cf7ffddaedbef996be1baaae0912b4a, but a better long term fix is needed.

      Johann L. writes in Lustre Bugzilla 23780 comment #28:
      > The drawback of this patch is a potential race with fiemap (i did not revert
      > the git kernel commit fab3a549e204172236779f502eccb4f9bf0dc87d totally).

      and follows up in Lustre Bugzilla 23780 comment #34:
      > Please note that this patch is a short term solution.
      > Girish, it would be great if you could come up with another patch which takes
      > i_data_sem write reference where it is needed in ext4_walk_space().

      b1_8 also still requires the fix for SLES 11 SP1.


      Info required for matching: sanity-benchmark test_fsx fsx

      Attachments

        Activity

          [LU-216] Fix fiemap locking issue in 2.6.32 kernels
          bobijam Zhenyu Xu added a comment - - edited

          Panda, I think you are right, I also couldn't find where the extent tree got changed w/o i_mutex hold, but the in the 1st place, in walk_space, ext4_ext_find_extent is called with i_data_sem protection, then there should exist case where extent got change w/o i_mutex protection, so the issue should be what Bzzz mentioned, we'd protect extent access, i.e. put ldiskfs_ext_next_allocated_block() under i_data_sem lock hold.

          bobijam Zhenyu Xu added a comment - - edited Panda, I think you are right, I also couldn't find where the extent tree got changed w/o i_mutex hold, but the in the 1st place, in walk_space, ext4_ext_find_extent is called with i_data_sem protection, then there should exist case where extent got change w/o i_mutex protection, so the issue should be what Bzzz mentioned, we'd protect extent access, i.e. put ldiskfs_ext_next_allocated_block() under i_data_sem lock hold.
          panda Andrew Perepechko added a comment - - edited

          Zhenyu Xu, I was referring to your patch.

          > static int ext3_ext_new_extent_cb()
          >...
          > if (bp->create == 0)

          { >... > return EXT_CONTINUE; > }

          >
          > down_write((&EXT4_I(inode)->i_data_sem));
          >
          > /* validate extent, make sure the extent tree does not changed */
          > tmppath = ext4_ext_find_extent(inode, cex->ec_block, NULL);
          > if (IS_ERR(tmppath))

          { >... > }
          > tmpex = tmppath[depth].p_ext;
          > if (tmpex != ex) {>...> }

          This patch seems to be expecting that the extent map may have changed during write (which I called allocation race earlier) . However, obdfilter takes i_mutex on writes and truncates and serializes extent map updates with it.

          Which case/race is expected to be handled with your code that I quoted above?

          Thanks.

          panda Andrew Perepechko added a comment - - edited Zhenyu Xu, I was referring to your patch. > static int ext3_ext_new_extent_cb() >... > if (bp->create == 0) { >... > return EXT_CONTINUE; > } > > down_write((&EXT4_I(inode)->i_data_sem)); > > /* validate extent, make sure the extent tree does not changed */ > tmppath = ext4_ext_find_extent(inode, cex->ec_block, NULL); > if (IS_ERR(tmppath)) { >... > } > tmpex = tmppath [depth] .p_ext; > if (tmpex != ex) {>...> } This patch seems to be expecting that the extent map may have changed during write (which I called allocation race earlier) . However, obdfilter takes i_mutex on writes and truncates and serializes extent map updates with it. Which case/race is expected to be handled with your code that I quoted above? Thanks.
          bobijam Zhenyu Xu added a comment -

          Panda,

          I don't quite get you, i_mutex is always hold during the map block process, it is i_data_sem that be used to protect extent block usage. I don't know what "allocation race" you are referring to, would you mind elaborating it?

          bobijam Zhenyu Xu added a comment - Panda, I don't quite get you, i_mutex is always hold during the map block process, it is i_data_sem that be used to protect extent block usage. I don't know what "allocation race" you are referring to, would you mind elaborating it?
          panda Andrew Perepechko added a comment - - edited

          Zhenyu Xu, why isn't a hypothetical allocation race in ext3_ext_new_extent_cb() addressed by your patch avoided by means of i_mutex which is taken whenever Lustre calls fsfilt_map_inode_pages(create=1)?

          Thanks.

          panda Andrew Perepechko added a comment - - edited Zhenyu Xu, why isn't a hypothetical allocation race in ext3_ext_new_extent_cb() addressed by your patch avoided by means of i_mutex which is taken whenever Lustre calls fsfilt_map_inode_pages(create=1)? Thanks.

          I tend to think ldiskfs_ext_next_allocated_block() should be called under i_data_sem together with ldiskfs_ext_find_extent(), otherwise ldiskfs_ext_next_allocated_block() is working on data being modified

          bzzz Alex Zhuravlev added a comment - I tend to think ldiskfs_ext_next_allocated_block() should be called under i_data_sem together with ldiskfs_ext_find_extent(), otherwise ldiskfs_ext_next_allocated_block() is working on data being modified

          Integrated in lustre-master » x86_64,server,el5,ofa #146
          LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space()

          Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862
          Files :

          • ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch
          • lustre/lvfs/fsfilt_ext3.c
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch
          hudson Build Master (Inactive) added a comment - Integrated in lustre-master » x86_64,server,el5,ofa #146 LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space() Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862 Files : ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch lustre/lvfs/fsfilt_ext3.c ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch

          Integrated in lustre-master » i686,server,el6,inkernel #146
          LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space()

          Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862
          Files :

          • ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch
          • lustre/lvfs/fsfilt_ext3.c
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch
          hudson Build Master (Inactive) added a comment - Integrated in lustre-master » i686,server,el6,inkernel #146 LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space() Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862 Files : ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch lustre/lvfs/fsfilt_ext3.c ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch

          Integrated in lustre-master » x86_64,client,el5,ofa #146
          LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space()

          Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862
          Files :

          • lustre/lvfs/fsfilt_ext3.c
          • ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch
          hudson Build Master (Inactive) added a comment - Integrated in lustre-master » x86_64,client,el5,ofa #146 LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space() Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862 Files : lustre/lvfs/fsfilt_ext3.c ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch

          Integrated in lustre-master » x86_64,server,el6,inkernel #146
          LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space()

          Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862
          Files :

          • ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch
          • lustre/lvfs/fsfilt_ext3.c
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch
          hudson Build Master (Inactive) added a comment - Integrated in lustre-master » x86_64,server,el6,inkernel #146 LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space() Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862 Files : ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch lustre/lvfs/fsfilt_ext3.c ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch

          Integrated in lustre-master » x86_64,server,el5,inkernel #146
          LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space()

          Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862
          Files :

          • ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch
          • ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch
          • lustre/lvfs/fsfilt_ext3.c
          hudson Build Master (Inactive) added a comment - Integrated in lustre-master » x86_64,server,el5,inkernel #146 LU-216 Protect extent tree during fsfilt_ldiskfs_ext_walk_space() Oleg Drokin : dea0c1f27c0eb53d4738a70241ff2bcc47201862 Files : ldiskfs/kernel_patches/patches/ext4-misc-sles11.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel6.patch ldiskfs/kernel_patches/patches/ext4-misc-rhel5.patch lustre/lvfs/fsfilt_ext3.c

          People

            bobijam Zhenyu Xu
            spitzcor Cory Spitz
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: