Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-12580

usercopy exposure attempt detected in LL_IOC_LOV_GETSTRIPE ioctl

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.14.0, Lustre 2.12.5
    • Upstream
    • None
    • CentOS 7.6.1810, starting with kernel 3.10.0-957
    • 3
    • 9223372036854775807

    Description

      Software like darshan are still using the old `ioctl` way to gather striping info for a file.

      The kernel BUG is easily triggered on a not-PFL striped file, using:

      lum->lmm_magic = LOV_USER_MAGIC;
      lum->lmm_stripe_count = LOV_MAX_STRIPE_COUNT;
      

      Any `lmm_stripe_count` greater than the actual file's stripe count will trigger the bug.

      Kernel side the issue appears to be in `lov_getstripe`: with a positive `lum_size`(line 409), `lmm_size` is set as `lum_size`(line 442) even if `lmm_magic != LOV_MAGIC_COMP_V1`(line 414), while instead the structure is just as big as `lmmk_size`:

      404         if (lum.lmm_magic == LOV_USER_MAGIC_V1 ||
      405             lum.lmm_magic == LOV_USER_MAGIC_V3)
      406                 lum_size = lov_user_md_size(lum.lmm_stripe_count,
      407                                             lum.lmm_magic);
      408 
      409         if (lum_size != 0) {
      410                 struct lov_mds_md *comp_md = lmmk;
      411 
      412                 /* Legacy app (ADIO for instance) treats the layout as V1/V3
      413                  * blindly, we'd return a reasonable V1/V3 for them. */
      414                 if (lmmk->lmm_magic == LOV_MAGIC_COMP_V1) {
      [...]
      439                 }
      440 
      441                 lmm = comp_md;
      442                 lmm_size = lum_size;
      443         } else {
      444                 lmm = lmmk;
      445                 lmm_size = lmmk_size;
      446         }
      447         /**
      448          * User specified limited buffer size, usually the buffer is
      449          * from ll_lov_setstripe(), and the buffer can only hold basic
      450          * layout template info.
      451          */
      452         if (size == 0 || size > lmm_size)
      453                 size = lmm_size;
      454         if (copy_to_user(lump, lmm, size))
      455                 GOTO(out_free, rc = -EFAULT);
      

      Please find as attachment the kernel trace and a reproducer, to be invoked as:

      $ gcc reproducer.c -o reproducer -W -Wall --pedantic
      $ ./reproducer <file_path>
      

      Attachments

        Issue Links

          Activity

            [LU-12580] usercopy exposure attempt detected in LL_IOC_LOV_GETSTRIPE ioctl

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38050
            Subject: LU-12580 lov: fix typo in lov_comp_md_size
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 865177bca926efd4f69d4dcaa74624054a2a5aaf

            gerrit Gerrit Updater added a comment - Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38050 Subject: LU-12580 lov: fix typo in lov_comp_md_size Project: fs/lustre-release Branch: b2_12 Current Patch Set: 1 Commit: 865177bca926efd4f69d4dcaa74624054a2a5aaf
            pjones Peter Jones added a comment -

            Landed for 2.14

            pjones Peter Jones added a comment - Landed for 2.14

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37469/
            Subject: LU-12580 lov: fix out of bound usercopy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2f1beb33144523467b596f4b6fab882b0a839187

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37469/ Subject: LU-12580 lov: fix out of bound usercopy Project: fs/lustre-release Branch: master Current Patch Set: Commit: 2f1beb33144523467b596f4b6fab882b0a839187

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37493/
            Subject: LU-12580 lov: fix typo in lov_comp_md_size
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d41716533682ed88b8a77654f9b5b050ef5c672c

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37493/ Subject: LU-12580 lov: fix typo in lov_comp_md_size Project: fs/lustre-release Branch: master Current Patch Set: Commit: d41716533682ed88b8a77654f9b5b050ef5c672c

            Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37493
            Subject: LU-12580 lov: fix typo in lov_comp_md_size
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: a65fa5d21bb8776735f8d7cf3516aed0db94e64c

            gerrit Gerrit Updater added a comment - Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37493 Subject: LU-12580 lov: fix typo in lov_comp_md_size Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: a65fa5d21bb8776735f8d7cf3516aed0db94e64c

            Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37469
            Subject: LU-12580 lov: fix out of bound usercopy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 93df0eba99fbc22af41d11003ba3564394979b29

            gerrit Gerrit Updater added a comment - Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37469 Subject: LU-12580 lov: fix out of bound usercopy Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 93df0eba99fbc22af41d11003ba3564394979b29

            Sorry for the confusion; will start by clarifying the first question.

            Considering the "struct lov_user_md" passed to the ioctl invocation, it can be either "lov_user_md_v1" or "lov_user_md_v3":

            #define lov_user_md lov_user_md_v1
            
            struct lov_user_md_v1 {
                    __u32 lmm_magic;
                    __u32 lmm_pattern;
                    struct ost_id lmm_oi;
                    __u32 lmm_stripe_size;
                    __u16 lmm_stripe_count;
                    union {
                            __u16 lmm_stripe_offset;
                            __u16 lmm_layout_gen;
                    };
                    struct lov_user_ost_data_v1 lmm_objects[0];
            } __attribute__((packed,  __may_alias__));
            
            struct lov_user_md_v3 {
                    __u32 lmm_magic;
                    __u32 lmm_pattern;
                    struct ost_id lmm_oi;
                    __u32 lmm_stripe_size;
                    __u16 lmm_stripe_count;
                    union {
                            __u16 lmm_stripe_offset;
                            __u16 lmm_layout_gen;
                    };
                    char  lmm_pool_name[LOV_MAXPOOLNAME + 1];
                    struct lov_user_ost_data_v1 lmm_objects[0];
            } __attribute__((packed));
            

            Given that "lmm_stripe_offset" and "lmm_layout_gen" share a union, only one can be returned.

            Now, considering a case of "lsm_magic" either of LOV_MAGIC_V1 or LOV_MAGIC_V3, "lov_lsm_pack_v1v3" is the one invoked. In this function, "lmm_layout_gen" is the one chosen to be returned:

            lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
            

            The fact is that majority of codes are expecting to get "lmm_stripe_offset", like ADIO:

                lumlen = sizeof(struct lov_user_md) +
                        MAX_LOV_UUID_COUNT * sizeof(struct lov_user_ost_data);
                lum = (struct lov_user_md *)ADIOI_Calloc(1,lumlen);
            
            [...]
            
                memset(lum, 0, lumlen);
                lum->lmm_magic = LOV_USER_MAGIC;
                err = ioctl(fd->fd_sys, LL_IOC_LOV_GETSTRIPE, (void *)lum);
                if (!err) {
            
            [...]
            
                    fd->hints->fs_hints.lustre.start_iodevice = lum->lmm_stripe_offset;
                    sprintf(value, "%d", lum->lmm_stripe_offset);
                    ADIOI_Info_set(fd->info, "romio_lustre_start_iodevice", value);
            

            Semantically, which one between "lmm_stripe_offset" and "lmm_layout_gen" should be the correct one to return?

            mgrossi Marco Grossi (Inactive) added a comment - Sorry for the confusion; will start by clarifying the first question. Considering the "struct lov_user_md" passed to the ioctl invocation, it can be either "lov_user_md_v1" or "lov_user_md_v3": #define lov_user_md lov_user_md_v1 struct lov_user_md_v1 { __u32 lmm_magic; __u32 lmm_pattern; struct ost_id lmm_oi; __u32 lmm_stripe_size; __u16 lmm_stripe_count; union { __u16 lmm_stripe_offset; __u16 lmm_layout_gen; }; struct lov_user_ost_data_v1 lmm_objects[0]; } __attribute__((packed, __may_alias__)); struct lov_user_md_v3 { __u32 lmm_magic; __u32 lmm_pattern; struct ost_id lmm_oi; __u32 lmm_stripe_size; __u16 lmm_stripe_count; union { __u16 lmm_stripe_offset; __u16 lmm_layout_gen; }; char lmm_pool_name[LOV_MAXPOOLNAME + 1]; struct lov_user_ost_data_v1 lmm_objects[0]; } __attribute__((packed)); Given that "lmm_stripe_offset" and "lmm_layout_gen" share a union, only one can be returned. Now, considering a case of "lsm_magic" either of LOV_MAGIC_V1 or LOV_MAGIC_V3, "lov_lsm_pack_v1v3" is the one invoked. In this function, "lmm_layout_gen" is the one chosen to be returned: lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen); The fact is that majority of codes are expecting to get "lmm_stripe_offset", like ADIO: lumlen = sizeof(struct lov_user_md) + MAX_LOV_UUID_COUNT * sizeof(struct lov_user_ost_data); lum = (struct lov_user_md *)ADIOI_Calloc(1,lumlen); [...] memset(lum, 0, lumlen); lum->lmm_magic = LOV_USER_MAGIC; err = ioctl(fd->fd_sys, LL_IOC_LOV_GETSTRIPE, (void *)lum); if (!err) { [...] fd->hints->fs_hints.lustre.start_iodevice = lum->lmm_stripe_offset; sprintf(value, "%d" , lum->lmm_stripe_offset); ADIOI_Info_set(fd->info, "romio_lustre_start_iodevice" , value); Semantically, which one between "lmm_stripe_offset" and "lmm_layout_gen" should be the correct one to return?

            You asked two other things, let's see...

            Please do open an issue for the man page, or if you feel it's closely related, that's a minor enough change it could be done under this LU as a cleanup.  (Preferably in a separate patch, but still using this LU.)

            I haven't verified this, but I suspect you are getting ENODATA because there is no default layout set on the directory.

            Try setting a default layout on the directory:
            lfs setstripe -E 10M -E 100M -E -1 testdir

            Then calling the ioctl on it.

            pfarrell Patrick Farrell (Inactive) added a comment - You asked two other things, let's see... Please do open an issue for the man page, or if you feel it's closely related, that's a minor enough change it could be done under this LU as a cleanup.  (Preferably in a separate patch, but still using this LU.) I haven't verified this, but I suspect you are getting ENODATA because there is no default layout set on the directory. Try setting a default layout on the directory: lfs setstripe -E 10M -E 100M -E -1 testdir Then calling the ioctl on it.

            Those comments are... strange.  They don't reflect on what you should be returning, they're about internal use within the fs.  (I'm also not sure they're fully correct.)

            So layout_gen is generally only of interest to tools which are manipulating the layout directly - Not just doing getstripe/setstripe.  It's incremented when the layout changes, like in FLR or when a PFL component is first written to (and is instantiated).

            Stripe offset - if set - gives the offset that will be used for the first stripe of an object.  It is "-1" to indicate no preference has been indicated.  Once a component is instantiated, it is set to the first stripe of that component.  (Not 0 - I just tested to confirm, and it's zero for me only if the first stripe is on OST0.)

            Considering a PFL file, at the moment the ioctl always returns 0 for both object id and sequential; also offset is set to -1 for a not instantiated file, while it turns to 0 on first write. What should be returned here? 

            I can't quite parse this sentence.  For a PFL file, 0 is in which field, of which struct?  What do you mean by "sequential" - are you referring to a field (so, which one, in which struct)?

            The offset behavior sounds correct to me: -1 until instantiated, because no offset has been picked, then the offset of the first stripe.

            FWIW, looking at stripe offset is kind of odd.  Applications should really be concerned with the offset of all of the stripes, not just the first one for each component.  The stripes are not necessarily sequential by OST or anything, so the offset isn't enough info by itself.  It's really only interesting when you're looking at an uninstantiated layout and you want to see if a default is set.

            pfarrell Patrick Farrell (Inactive) added a comment - Those comments are... strange.  They don't reflect on what you should be returning, they're about internal use within the fs.  (I'm also not sure they're fully correct.) So layout_gen is generally only of interest to tools which are manipulating the layout directly - Not just doing getstripe/setstripe.  It's incremented when the layout changes, like in FLR or when a PFL component is first written to (and is instantiated). Stripe offset - if set - gives the offset that will be used for the first stripe of an object.  It is "-1" to indicate no preference has been indicated.  Once a component is instantiated, it is set to the first stripe of that component.  (Not 0 - I just tested to confirm, and it's zero for me only if the first stripe is on OST0.) Considering a PFL file, at the moment the ioctl always returns 0 for both object id and sequential; also offset is set to -1 for a not instantiated file, while it turns to 0 on first write. What should be returned here? I can't quite parse this sentence.  For a PFL file, 0 is in which field, of which struct?  What do you mean by "sequential" - are you referring to a field (so, which one, in which struct)? The offset behavior sounds correct to me: -1 until instantiated, because no offset has been picked, then the offset of the first stripe. FWIW, looking at stripe offset is kind of odd.  Applications should really be concerned with the offset of all of the stripes, not just the first one for each component.  The stripes are not necessarily sequential by OST or anything, so the offset isn't enough info by itself.  It's really only interesting when you're looking at an uninstantiated layout and you want to see if a default is set.

            Forgot to ask last bit: currently is returning -ENODATA on a directory. Is this expected?

            mgrossi Marco Grossi (Inactive) added a comment - Forgot to ask last bit: currently is returning -ENODATA on a directory. Is this expected?

            I've few questions about the expected behavior of this ioctl LL_IOC_LOV_GETSTRIPE

            `lov_getstripe` is currently setting `lmm_layout_gen`, via `lov_lsm_pack`, but looking at the variety of codes using this ioctl, like ADIO and darshan, they are mainly expecting to get a value for `lmm_stripe_offset` instead. 

            The comments associated to "lov_user_md_v1" and "v3" are as following:

            __u16 lmm_stripe_offset; /* starting stripe offset in
                                      * lmm_objects, use when writing */
            __u16 lmm_layout_gen;    /* layout generation number
                                      * used when reading */
            

            Can you clarify which one is to return?

            Considering a PFL file, at the moment the ioctl always returns 0 for both object id and sequential; also offset is set to -1 for a not instantiated file, while it turns to 0 on first write. What should be returned here?

            On a side note, man for "llapi_file_get_stripe.3" still reports old data structure. Should I open an issue for this?

            mgrossi Marco Grossi (Inactive) added a comment - I've few questions about the expected behavior of this ioctl LL_IOC_LOV_GETSTRIPE `lov_getstripe` is currently setting `lmm_layout_gen`, via `lov_lsm_pack`, but looking at the variety of codes using this ioctl, like ADIO and darshan, they are mainly expecting to get a value for `lmm_stripe_offset` instead.  The comments associated to "lov_user_md_v1" and "v3" are as following: __u16 lmm_stripe_offset; /* starting stripe offset in * lmm_objects, use when writing */ __u16 lmm_layout_gen; /* layout generation number * used when reading */ Can you clarify which one is to return? Considering a PFL file, at the moment the ioctl always returns 0 for both object id and sequential; also offset is set to -1 for a not instantiated file, while it turns to 0 on first write. What should be returned here? On a side note, man for "llapi_file_get_stripe.3" still reports old data structure. Should I open an issue for this?

            People

              mgrossi Marco Grossi (Inactive)
              mgrossi Marco Grossi (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: