  Lustre / LU-10884

stat() on lustre mount point / limited client trust in l_getidentity

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.3
    • Component/s: None
    • Environment: CentOS 7.4

    Description

      Hi!

      On our Oak storage system, a global storage system with limited access, we use the nodemap feature for GIDs, and the UIDs known to Oak are only a subset of those available on the different client clusters. A recent compatibility issue with Singularity (https://github.com/singularityware/singularity/issues/1313 if you're interested in the whole story) led us to discover that stat() on the client /oak mount point fails from time to time. It hadn't been a problem until we hit this issue with Singularity. Correct me if I'm wrong, but the cause is that the MDT will refuse to answer any RPC from an unknown UID, so stat() on /oak returns EPERM. For unknown UIDs, that looks like this:

      [user@sh-104-49 ~]$ ls -l /
      ls: cannot access /oak: Permission denied
      total 44
      ...
      d??????????   ? ?    ?        ?            ? oak
      ...
      

      I said "from time to time" because, if a user with Oak access had previously run Singularity on the compute node, thus (I believe) populating the client inode cache, stat() would then work even for unknown users. As a non-reproducible issue, it has been painful to troubleshoot.
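
      To illustrate the cache dependence (hypothetical transcript; the actual mode and ownership will differ):

      [user@sh-104-49 ~]$ stat -c '%A %U %G' /oak     # fresh client cache, unknown UID
      stat: cannot stat '/oak': Permission denied
      # (a user with Oak access then stats /oak on this node, populating the client cache)
      [user@sh-104-49 ~]$ stat -c '%A %U %G' /oak     # now served from the client cache
      drwxr-xr-x root root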

      Anyway, we recently fixed the issue by forking l_getidentity.c to allow unknown UIDs to query the MDT so that stat() on the mount point '/oak' doesn't fail:

      diff --git a/lustre/utils/l_getidentity.c b/lustre/utils/l_getidentity.c
      index 6aca6dc..72896a8 100644
      --- a/lustre/utils/l_getidentity.c
      +++ b/lustre/utils/l_getidentity.c
      @@ -111,9 +111,11 @@ int get_groups_local(struct identity_downcall_data *data,
       
              pw = getpwuid(data->idd_uid);
              if (!pw) {
      -               errlog("no such user %u\n", data->idd_uid);
      -               data->idd_err = errno ? errno : EIDRM;
      -               return -1;
      +               /* Stanford limited client trust: all UIDs are mapped with primary group 37 */
      +               errlog("warning: no secondary groups for unknown user %u\n", data->idd_uid);
      +               data->idd_gid = 37;
      +               data->idd_ngroups = 0;
      +               return 0;
              }
       
              data->idd_gid = pw->pw_gid;
      

      Because all access control is done using the UID and secondary GIDs, we should be good. Now stat() works on the mount point on every host, making Singularity happy to run with autofs.
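
      For completeness, the patched binary is activated on the MDS through the identity_upcall tunable, and flushing the identity cache makes the change take effect immediately (the target name oak-MDT0000 and the install path below are assumptions based on our setup):

      mds# lctl set_param mdt.oak-MDT0000.identity_upcall=/usr/sbin/l_getidentity
      mds# lctl set_param mdt.oak-MDT0000.identity_flush=-1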

      So I wanted to raise the issue here to hear what you think. Should Lustre filesystems allow the stat RPC from unknown users on the root directory? Or would it make sense to add this kind of limited UID trust to l_getidentity?

      Thanks!
      Stephane


        Activity

          John Hammond (jhammond) added a comment -

          Hi Stephane,

          Have you tried using the squash_uid and squash_gid properties of the nodemap here?

          > I said "from time to time" because, if a user with Oak access had previously run Singularity on the compute node, thus (I believe) populating the client inode cache, stat() would then work even for unknown users. As a non-reproducible issue, it has been painful to troubleshoot.

          Indeed, this is a (perhaps poorly) known limitation of nodemapping/Lustre permissions/caching.

          > So I wanted to raise the issue here to hear what you think. Should Lustre filesystems allow the stat RPC from unknown users on the root directory? Or would it make sense to add this kind of limited UID trust to l_getidentity?

          I agree with the premise that "if you can see it then you should be able to stat it." But I would be hesitant to poke holes in the security policy for special cases such as this. If we go down this road then we should make sure that it interacts properly with filesets.
          Stephane Thiell (sthiell) added a comment - edited

          Hi John - Thanks for your reply, that's useful.

          Note that we're not using nodemap for UIDs, only GIDs (we use gid_only), so I don't think that will work. What we would need is some kind of default squash_uid.

          I also agree with the statement "if you can see it then you should be able to stat it."

          Thanks,
          Stephane
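
          For reference, the squash properties mentioned above are per-nodemap settings applied on the MGS; a minimal sketch ("oak_clients" is a hypothetical nodemap name, and UID/GID 99 is "nobody" on CentOS):

          mgs# lctl nodemap_modify --name oak_clients --property squash_uid --value 99
          mgs# lctl nodemap_modify --name oak_clients --property squash_gid --value 99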

          People

            Assignee: Peter Jones (pjones)
            Reporter: Stephane Thiell (sthiell)
            Votes: 0
            Watchers: 5
