Data can't be striped across all the OSTs correctly by running "lfs setstripe -c -1 -i n" (n>0)

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.4.0
    • Lustre 2.4.0
    • 3
    • 6940

    Description

      I found this problem while investigating LU-2809. When running "lfs setstripe -c -1 -i n testfile" with a starting OST index n other than 0, the file is not striped across all the OSTs: OST0 is always skipped.

          Activity

            [LU-2871] Data can't be striped across all the OSTs correctly by running "lfs setstripe -c -1 -i n" (n>0)
            emoly.liu Emoly Liu added a comment -

            Landed for 2.4

            emoly.liu Emoly Liu added a comment -

            Sure, I made both changes in the patch and will update it per Ned Bass's advice later. Thanks!

            bzzz Alex Zhuravlev added a comment -

            liuying, it's better to mark the index as used only after lod_qos_declare_object_on() succeeds. Also, I don't think this is an alternative to Zhenyu Xu's change; I think both changes should be applied.
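The ordering suggested above can be sketched in a standalone model. This is not the Lustre code: `declare_object_on()`, `mark_ost_used()`, `is_ost_used()` and `alloc_mark_after_declare()` are hypothetical stand-ins, and the "declare" step simply fails for one chosen OST so that the effect of marking only after success is visible.

```c
#include <string.h>

#define MAX_OSTS 16

/* Hypothetical stand-in for lod_qos_is_ost_used(): scan the first
 * `stripes` slots of the in-use array for `ost`. */
static int is_ost_used(const int *in_use, int ost, int stripes)
{
    int j;

    for (j = 0; j < stripes; j++)
        if (in_use[j] == ost)
            return 1;
    return 0;
}

/* Hypothetical stand-in for lod_qos_ost_in_use(): record `ost` as the
 * OST backing stripe number `stripe`. */
static void mark_ost_used(int *in_use, int stripe, int ost)
{
    in_use[stripe] = ost;
}

/* Hypothetical stand-in for lod_qos_declare_object_on(): pretend the
 * declaration fails on `bad_ost` and succeeds everywhere else. */
static int declare_object_on(int ost, int bad_ost)
{
    return ost == bad_ost ? -1 : 0;
}

/* Walk the OSTs round-robin from `start`. An OST is recorded in
 * in_use[] only after its declaration succeeded, so a failed OST never
 * occupies a slot. Returns the number of stripes placed, or -1 if
 * ost_count is out of range for this model. */
static int alloc_mark_after_declare(int ost_count, int start, int bad_ost,
                                    int *in_use)
{
    int stripe_num = 0;
    int i, ost_idx;

    if (ost_count <= 0 || ost_count > MAX_OSTS)
        return -1;

    /* -1 is never a valid OST index, so empty slots match nothing. */
    memset(in_use, -1, sizeof(int) * ost_count);

    ost_idx = start % ost_count;
    for (i = 0; i < ost_count; i++, ost_idx = (ost_idx + 1) % ost_count) {
        if (is_ost_used(in_use, ost_idx, stripe_num))
            continue;
        if (declare_object_on(ost_idx, bad_ost) != 0)
            continue;
        mark_ost_used(in_use, stripe_num, ost_idx);
        stripe_num++;
    }
    return stripe_num;
}
```

In this model, with 4 OSTs, a start index of 1 and a failing OST 2, the recorded stripes are OSTs 1, 3 and 0; the failed OST leaves no entry behind.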
            emoly.liu Emoly Liu added a comment - Patch is at http://review.whamcloud.com/5554
            emoly.liu Emoly Liu added a comment -

            I will add a sanity test for this case.

            emoly.liu Emoly Liu added a comment - - edited

            Another approach: call lod_qos_ost_in_use() right after the lod_qos_is_ost_used() check, right?

            diff --git a/lustre/lod/lod_qos.c b/lustre/lod/lod_qos.c
            index 2b81ad8..92b3b36 100644
            --- a/lustre/lod/lod_qos.c
            +++ b/lustre/lod/lod_qos.c
            @@ -887,6 +887,7 @@ repeat_find:
                             */
                            if (lod_qos_is_ost_used(env, ost_idx, stripe_num))
                                    continue;
            +               lod_qos_ost_in_use(env, stripe_num, ost_idx);
             
                            /* Drop slow OSCs if we can, but not for requested start idx.
                             *
            

            bzzz Alex Zhuravlev added a comment -

            Pretty much correct. Please put a patch into Gerrit, thanks.
            bobijam Zhenyu Xu added a comment - - edited

            I found the root cause.

            In lod_qos_ost_in_use_clear(), the ost_in_use array is initialised to 0, and in lod_qos_prep_create()->lod_alloc_specific(), the ost_idx is read as follows:

                    for (i = 0; i < ost_count;
                                    i++, array_idx = (array_idx + 1) % ost_count) {
                            ost_idx = osts->op_array[array_idx];
            

            and ost_idx is then checked against the ost_in_use array:

                            if (lod_qos_is_ost_used(env, ost_idx, stripe_num))
                                    continue;
            

            If stripe_offset is 0, then stripe_num is also 0 in the first iteration, so lod_qos_is_ost_used() returns false and the object is allocated on the first OST device.

            If the file's stripes start from an index other than 0, stripe_num is already greater than 0 by the time the loop reaches ost_idx 0. lod_qos_is_ost_used(env, 0, stripe_num) then matches the zero-filled entries of the ost_in_use array and returns true, so the first OST device is skipped.
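The behaviour described above can be reproduced with a small standalone model. It is a sketch under assumptions, not the Lustre source: `is_ost_used()` and `alloc_specific_model()` are hypothetical names, the OST list is simply 0..ost_count-1, and the model assumes the loop never records the OSTs it picks, so the zero-filled slots are all the check ever sees.

```c
#include <string.h>

#define MAX_OSTS 16

/* Hypothetical stand-in for lod_qos_is_ost_used(): scan the first
 * `stripes` slots of the in-use array for `ost`. */
static int is_ost_used(const int *in_use, int ost, int stripes)
{
    int j;

    for (j = 0; j < stripes; j++)
        if (in_use[j] == ost)
            return 1;
    return 0;
}

/* Model of the buggy loop: the in-use array is zero-filled and never
 * updated (an assumption of this sketch). OSTs are visited round-robin
 * from `start`; chosen[] receives the OST picked for each stripe.
 * Returns the number of stripes placed, or -1 if ost_count is out of
 * range for this model. */
static int alloc_specific_model(int ost_count, int start, int *chosen)
{
    int in_use[MAX_OSTS];
    int stripe_num = 0;
    int i, ost_idx;

    if (ost_count <= 0 || ost_count > MAX_OSTS)
        return -1;

    memset(in_use, 0, sizeof(in_use));

    ost_idx = start % ost_count;
    for (i = 0; i < ost_count; i++, ost_idx = (ost_idx + 1) % ost_count) {
        /* Once stripe_num > 0, the zero-filled slots make OST 0
         * look as if it were already in use. */
        if (is_ost_used(in_use, ost_idx, stripe_num))
            continue;
        chosen[stripe_num++] = ost_idx;
    }
    return stripe_num;
}
```

With 4 OSTs and a start index of 1, the model places stripes on OSTs 1, 2 and 3 and skips OST 0; with a start index of 0 it uses all four.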

            The fix should be in lod_qos_ost_in_use_clear(). With the following patch, the object stripe allocation is correct.

            diff --git a/lustre/lod/lod_qos.c b/lustre/lod/lod_qos.c
            index 2b81ad8..2f46e7c 100644
            --- a/lustre/lod/lod_qos.c
            +++ b/lustre/lod/lod_qos.c
            @@ -629,7 +629,7 @@ static inline int lod_qos_ost_in_use_clear(const struct lu_env *env, int stripes
                            CERROR("can't allocate memory for ost-in-use array\n");
                            return -ENOMEM;
                    }
            -       memset(info->lti_ea_store, 0, sizeof(int) * stripes);
            +       memset(info->lti_ea_store, -1, sizeof(int) * stripes);
                    return 0;
             }
            
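A note on why the one-character change in the patch above works: memset() fills bytes, not ints. Filling with -1 writes 0xFF into every byte, which reads back as -1 for each int on two's complement platforms, and -1 never equals a valid OST index. The sketch below illustrates this in isolation; `clear_slots()` and `count_matches()` are hypothetical helpers, not Lustre functions.

```c
#include <string.h>

/* Fill `n` int slots byte-wise with `fill`, as memset() does. Only
 * fill values whose int representation is one repeated byte (such as
 * 0 and -1) produce that same value in every slot. */
static void clear_slots(int *slots, int n, int fill)
{
    memset(slots, fill, sizeof(int) * n);
}

/* Count how many of the first `n` slots compare equal to `value`. */
static int count_matches(const int *slots, int n, int value)
{
    int i, hits = 0;

    for (i = 0; i < n; i++)
        if (slots[i] == value)
            hits++;
    return hits;
}
```

After a zero fill, every slot compares equal to OST index 0; after a -1 fill, no slot matches any index >= 0. The same trick would not work for an arbitrary sentinel: filling with 1, for example, yields 0x01010101 in each 4-byte int rather than 1.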

            People

              emoly.liu Emoly Liu
              emoly.liu Emoly Liu