[LU-9584] stripe count defaults to 165 Created: 02/Jun/17  Updated: 29/Jun/17  Resolved: 29/Jun/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Mahmoud Hanafi Assignee: Jian Yu
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The filesystem has 312 OSTs. Creating a file striped over all OSTs, like this:

 lfs setstripe -c -1 testfile2

gives a stripe count of only 165. This is odd, since it is neither 160 nor 312.

lfs getstripe testfile2
testfile2
lmm_stripe_count:   165
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  68
	obdidx		 objid		 objid		 group
	    68	      67532793	    0x40677f9	             0
....

 


 Comments   
Comment by Andreas Dilger [ 02/Jun/17 ]

If you have more than 160 OSTs and want to stripe over all of them, then you need to enable the "ea_inode" feature on the MDS, like:
 tune2fs -O ea_inode /path/to/MDT


Comment by Peter Jones [ 02/Jun/17 ]

Assigning to Jian for any follow-on questions.

Comment by Mahmoud Hanafi [ 05/Jun/17 ]

If we don't have the ea_inode option, why is it 165? Shouldn't it be 160?

Comment by Jian Yu [ 06/Jun/17 ]

Hi Mahmoud,

The maximum possible stripe count is calculated from the maximum EA size in lod_get_stripecnt():

__u16 lod_get_stripecnt(struct lod_device *lod, struct lod_object *lo,
                        __u16 stripe_count)
{
        __u32 max_stripes = LOV_MAX_STRIPE_COUNT_OLD;
        ......
        /* stripe count is based on whether OSD can handle larger EA sizes */
        if (lod->lod_osd_max_easize > 0) {
                unsigned int easize = lod->lod_osd_max_easize;
                ......
                max_stripes = lov_mds_md_max_stripe_count(easize, LOV_MAGIC_V3);
        }

        return (stripe_count < max_stripes) ? stripe_count : max_stripes;
}

In lov_mds_md_max_stripe_count():

static inline __u32
lov_mds_md_max_stripe_count(size_t buf_size, __u32 lmm_magic)
{
        ......
        case LOV_MAGIC_V3: {
                struct lov_mds_md_v3 lmm;

                if (buf_size < sizeof(lmm))
                        return 0;

                return (buf_size - sizeof(lmm)) / sizeof(lmm.lmm_objects[0]);
        }
        ......
}
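
The division above can be tried outside the kernel tree with stand-in types. The following is only a sketch, not Lustre source: the struct and function names prefixed with demo_ are hypothetical, and the field layouts are assumptions chosen so that the header is 48 bytes and each per-stripe entry is 24 bytes, matching the sizes used later in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-stripe object entry (assumed 24 bytes). */
struct demo_ost_data {
        uint64_t oi[2];         /* object identifier, 16 bytes */
        uint32_t ost_gen;       /* OST generation */
        uint32_t ost_idx;       /* OST index */
};

/* Hypothetical stand-in for the V3 layout header (assumed 48 bytes). */
struct demo_lmm_v3 {
        uint32_t magic;
        uint32_t pattern;
        uint64_t oi[2];
        uint32_t stripe_size;
        uint16_t stripe_count;
        uint16_t layout_gen;
        char     pool_name[16];
        struct demo_ost_data objects[]; /* one entry per stripe */
};

/* Compile-time checks that the stand-ins have the sizes assumed above. */
static_assert(sizeof(struct demo_lmm_v3) == 48, "header must be 48 bytes");
static_assert(sizeof(struct demo_ost_data) == 24, "entry must be 24 bytes");

/* Number of whole per-stripe entries that fit in buf_size after the header. */
static inline uint32_t demo_max_stripe_count(size_t buf_size)
{
        if (buf_size < sizeof(struct demo_lmm_v3))
                return 0;

        return (uint32_t)((buf_size - sizeof(struct demo_lmm_v3)) /
                          sizeof(struct demo_ost_data));
}
```

A buffer smaller than the header yields 0, and any remainder smaller than one entry is discarded by the integer division.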

And lod->lod_osd_max_easize is assigned from ddp.ddp_max_ea_size, which is calculated in osd_conf_get() as follows:

static void osd_conf_get(const struct lu_env *env,
                         const struct dt_device *dev,
                         struct dt_device_param *param)
{
        struct super_block *sb = osd_sb(osd_dt_dev(dev));
        int                ea_overhead;
        ........
        /* LOD might calculate the max stripe count based on max_ea_size,
         * so we need take account in the overhead as well,
         * xattr_header + magic + xattr_entry_head */
        ea_overhead = sizeof(struct ldiskfs_xattr_header) + sizeof(__u32) +
                      LDISKFS_XATTR_LEN(XATTR_NAME_MAX_LEN);

#if defined(LDISKFS_FEATURE_INCOMPAT_EA_INODE)
        if (LDISKFS_HAS_INCOMPAT_FEATURE(sb, LDISKFS_FEATURE_INCOMPAT_EA_INODE))
                param->ddp_max_ea_size = LDISKFS_XATTR_MAX_LARGE_EA_SIZE -
                                                                ea_overhead;
        else
#endif
                param->ddp_max_ea_size = sb->s_blocksize - ea_overhead;
}

So, without enabling the "ea_inode" feature on MDT, the maximum stripe count is calculated as follows:

max_stripes = (sb->s_blocksize - (sizeof(struct ldiskfs_xattr_header) + sizeof(__u32) + LDISKFS_XATTR_LEN(XATTR_NAME_MAX_LEN)) - sizeof(lmm)) / sizeof(lmm.lmm_objects[0])
            = (4096 - (32 + 4 + 48) - 48) / 24
            = 3964 / 24
            = 165 (integer division, remainder discarded)
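
The same arithmetic can be checked with a small helper. This is a sketch under stated assumptions rather than Lustre code: the DEMO_ constants simply hard-code the byte sizes quoted in the calculation above (32 + 4 + 48 for the EA overhead, 48 for the layout header, 24 per stripe entry), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte sizes taken from the calculation in this ticket (assumed, not probed). */
enum {
        DEMO_EA_OVERHEAD   = 32 + 4 + 48, /* xattr header + magic + entry head */
        DEMO_LMM_V3_SIZE   = 48,          /* layout header */
        DEMO_OST_DATA_SIZE = 24           /* one per-stripe entry */
};

/* Maximum stripe count when the EA must fit in one block (no ea_inode). */
static uint32_t demo_max_stripes_for_blocksize(uint32_t blocksize)
{
        uint32_t fixed = DEMO_EA_OVERHEAD + DEMO_LMM_V3_SIZE;

        if (blocksize < fixed)
                return 0;

        return (blocksize - fixed) / DEMO_OST_DATA_SIZE;
}

/* Clamp a requested stripe count to the maximum, as lod_get_stripecnt() does. */
static uint16_t demo_effective_stripe_count(uint16_t requested,
                                            uint32_t max_stripes)
{
        return (requested < max_stripes) ? requested : (uint16_t)max_stripes;
}
```

With a 4096-byte block, demo_max_stripes_for_blocksize(4096) gives 165, and clamping a request for all 312 OSTs with demo_effective_stripe_count(312, 165) also gives 165, which matches the lmm_stripe_count reported by lfs getstripe.
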
Comment by Mahmoud Hanafi [ 29/Jun/17 ]

We can close this case.

Comment by Peter Jones [ 29/Jun/17 ]

Thanks Mahmoud

Generated at Sat Feb 10 02:27:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.