Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5262

setfacl on striped directory will cause the striped directory crash

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.6.0
    • None
    • 3
    • 14688

    Description

      The mdd_acl_set() sets size unexpectedly on the master object and cause the master object's directory content is truncated.

      This corruption can be easily reproduced by the sanity test_125, unfortunately, there is no way to read the master object directory content directly until we work on the patch for LU-5223.

      Attachments

        Activity

          [LU-5262] setfacl on striped directory will cause the striped directory crash

          The patch has been landed to master

          yong.fan nasf (Inactive) added a comment - The patch has been landed to master
          di.wang Di Wang added a comment -

          Ah it seems the problem is that in lod_attr_get, it tries to merge the size from all sub-stripes, then in lod_attr_set it set the merged size to master object.
          probably we can just simply drop the merging in lod_attr_get, which is actually unnecessary, see http://review.whamcloud.com/#/c/10841/ And this will also fix LU-5130

          di.wang Di Wang added a comment - Ah it seems the problem is that in lod_attr_get, it tries to merge the size from all sub-stripes, then in lod_attr_set it set the merged size to master object. probably we can just simply drop the merging in lod_attr_get, which is actually unnecessary, see http://review.whamcloud.com/#/c/10841/ And this will also fix LU-5130

          MDD/LOD/OFD should not try to do this of course. I'm saying about something more like LASSERT() in OSD.

          bzzz Alex Zhuravlev added a comment - MDD/LOD/OFD should not try to do this of course. I'm saying about something more like LASSERT() in OSD.

          Then all OSDs need to check LA_SIZE. Why not control that in MDD/LOD, then no need to worry about new OSD in future.

          yong.fan nasf (Inactive) added a comment - Then all OSDs need to check LA_SIZE. Why not control that in MDD/LOD, then no need to worry about new OSD in future.

          probably it makes sense to check in OSD for LA_SIZE in case of non-regular file.

          bzzz Alex Zhuravlev added a comment - probably it makes sense to check in OSD for LA_SIZE in case of non-regular file.

          The basic idea of the patch is that: "NOT set size on directory".

          The LU-5223 depends on this patch, because if without the patch, the master object may be truncated unexpectedly when setfacl, then the iteration the master object to build LMV EA will get failure because the master directory (for shards FID and name) will be regarded empty although it is not empty.

          Sanity test_125 test such case, that is why LU-5223 failed at sanity.

          yong.fan nasf (Inactive) added a comment - The basic idea of the patch is that: "NOT set size on directory". The LU-5223 depends on this patch, because if without the patch, the master object may be truncated unexpectedly when setfacl, then the iteration the master object to build LMV EA will get failure because the master directory (for shards FID and name) will be regarded empty although it is not empty. Sanity test_125 test such case, that is why LU-5223 failed at sanity.

          Here is the patch:
          http://review.whamcloud.com/10870

          It is an important patch, should be landed to Lustre-2.6

          yong.fan nasf (Inactive) added a comment - Here is the patch: http://review.whamcloud.com/10870 It is an important patch, should be landed to Lustre-2.6

          Isn't the fix here to clear LA_SIZE (and most others except LA_MODE) from attr->la_valid bits in mdd_xattr_set() before mdo_attr_set() is called so that these values are not all changed? Otherwise, there is a chance that this will set stale values into the inode if it is changed concurrently (e.g. pdirops). It isn't clear why LU-5223 is needed to fix this?

          adilger Andreas Dilger added a comment - Isn't the fix here to clear LA_SIZE (and most others except LA_MODE) from attr->la_valid bits in mdd_xattr_set() before mdo_attr_set() is called so that these values are not all changed? Otherwise, there is a chance that this will set stale values into the inode if it is changed concurrently (e.g. pdirops). It isn't clear why LU-5223 is needed to fix this?

          People

            yong.fan nasf (Inactive)
            yong.fan nasf (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: