Lustre / LU-540

fix interop issues with lustre_user.h

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.1.0
    • Affects Versions: Lustre 2.0.0, Lustre 2.1.0
    • Labels: None
    • 3
    • 4924

    Description

      While looking at the Lustre public headers for LU-533 I noticed several
      disturbing/sad incompatibilities in the headers between b1_8 and master.
      Some of them are non-critical:

      • ioctl numbers for lloop devices changed (LL_IOC_LLOOP_ATTACH,
        LL_IOC_LLOOP_DETACH, LL_IOC_LLOOP_INFO, LL_IOC_LLOOP_DETACH_BYDEV)

      Since lloop isn't supported and usually ships with matching lctl anyway,
      there isn't a critical interop issue to be fixed, though the LLOOP_INFO
      ioctl should be fixed to report FIDs instead of inode numbers.

      Several potentially more difficult problems exist that I believe may
      affect some users:

      • struct if_quotactl has changed in size, with fields added in the middle
        of the structure instead of at the end. Some users (at least NERSC)
        are using the llapi_quotactl() interface to access quota data from
        userspace, but the userspace will break due to the change in the ioctl
        interface and llapi_quotactl() parameter. The ioctl number (correctly)
        depends on the size of struct if_quotactl, so it would be possible to
        define two slightly different ioctls at the same time (one for the old
        struct and one for the new struct). Since liblustreapi.a is a static
        library, it will continue to call the kernel with the old ioctl number
        until it is recompiled. It would be possible to define the old ioctl
        number and copy the struct data, if we wanted to maintain compatibility.

      It is not easy to resolve interoperability issues between 1.x userspace
      tools and a 2.x client kernel, even if the newly added "if_quotactl" fields
      were at the end of the structure. If we disregard the released lustre-2.0,
      we can adjust the current lustre-2.1 to make it compatible with 1.x
      userspace tools. But I am not sure whether lustre-2.0 clients can be ignored.
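
      The idea of keeping both ioctl numbers alive can be sketched as follows.
      This is only an illustration under stated assumptions: the two structs
      are hypothetical stand-ins and do not reproduce the real b1_8 or master
      layouts of struct if_quotactl, and the 'x'/1 ioctl type and sequence
      number are made up. It shows that the size encoded by _IOWR() yields two
      distinct command values, and that the old struct has to be copied field
      by field because the new field sits in the middle.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical stand-ins, NOT the real struct if_quotactl layouts. */
struct demo_quotactl_old {
	uint32_t cmd;
	uint32_t id;
	uint64_t limit;
};

struct demo_quotactl_new {
	uint32_t cmd;
	uint32_t valid;		/* field added in the middle */
	uint32_t id;
	uint32_t padding;
	uint64_t limit;
};

/* Same type letter and sequence number; only the encoded size differs. */
#define DEMO_IOC_QUOTACTL_OLD	_IOWR('x', 1, struct demo_quotactl_old)
#define DEMO_IOC_QUOTACTL	_IOWR('x', 1, struct demo_quotactl_new)

/* Field-by-field copy: a flat memcpy() would misplace everything after cmd. */
static void demo_quotactl_from_old(struct demo_quotactl_new *dst,
				   const struct demo_quotactl_old *src)
{
	assert(dst != NULL && src != NULL);
	memset(dst, 0, sizeof(*dst));
	dst->cmd = src->cmd;
	dst->id = src->id;
	dst->limit = src->limit;
}

/* Accept either command; returns 0 on success, -1 for an unknown command. */
static int demo_quotactl_dispatch(unsigned long cmd, const void *arg,
				  struct demo_quotactl_new *out)
{
	switch (cmd) {
	case DEMO_IOC_QUOTACTL:
		memcpy(out, arg, sizeof(*out));
		return 0;
	case DEMO_IOC_QUOTACTL_OLD:
		demo_quotactl_from_old(out, arg);
		return 0;
	default:
		return -1;
	}
}
```

      A statically linked old binary would keep issuing the old command value,
      so a handler of this shape could serve it without a rebuild. Whether that
      is worth doing for the real interface is the open question raised above.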

      • the supplementary group downcall structure has changed in 2.0, but very
        sadly the magic number in the struct wasn't also changed, so the kernel
        can't distinguish between the old and new downcall structs. I recall
        at least Sandia had written their own 1.x group upcall, and while we
        may not need to keep compatibility, at least changing the magic number
        would afford us the option to do so. An open question is whether this
        would negatively affect Kerberos users?

      Since the downcall structure sizes for 1.x and 2.x are different, the MDT
      can use the size to distinguish whether a downcall from userspace is 2.x
      based or 1.x based. We can handle that in the current lustre-2.1 candidate.

      • llite file flags are defined with conflicting values for b1_8 and master
        since 1.8.2:

      b1_8:
      #define LL_FILE_LOCKED_DIRECTIO 0x00000008 /* client-side locks with dio */
      #define LL_FILE_LOCKLESS_IO 0x00000010 /* server-side locks with cio */

      master:
      #define LL_FILE_RMTACL 0x00000008

      I have no idea what LL_FILE_RMTACL does, or if anybody uses it. The
      LL_FILE_LOCK* flags are (AFAIK) used by LLNL BG/P IO daemons to force
      lockless IO on files. I'm not sure if apps are using this also.

      "LL_FILE_RMTACL" is used for remote client ACL processing. Since no released
      version supports remote clients, we can redefine it in the lustre-2.1
      candidate.
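
      The collision and one possible way out can be sketched as below. The
      first three values are the ones quoted above; the B18_ and MASTER_
      prefixes are added here only so both sets can coexist in one file.
      DEMO_LL_FILE_RMTACL is a hypothetical replacement value chosen for
      illustration, not necessarily the value the fix uses.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the description (prefixes added for this sketch). */
#define B18_LL_FILE_LOCKED_DIRECTIO	0x00000008
#define B18_LL_FILE_LOCKLESS_IO		0x00000010
#define MASTER_LL_FILE_RMTACL		0x00000008

/* Hypothetical new value: the next bit above the two b1_8 flags. */
#define DEMO_LL_FILE_RMTACL		0x00000020

/* Compile-time guard: the relocated flag must not reuse a b1_8 bit. */
static_assert((DEMO_LL_FILE_RMTACL &
	       (B18_LL_FILE_LOCKED_DIRECTIO | B18_LL_FILE_LOCKLESS_IO)) == 0,
	      "relocated flag overlaps a b1_8 flag");

/* Non-zero when two flag values share at least one bit. */
static int demo_flags_conflict(uint32_t a, uint32_t b)
{
	return (a & b) != 0;
}

/* Non-zero when exactly one bit is set, as expected of a single flag. */
static int demo_is_single_bit(uint32_t f)
{
	return f != 0 && (f & (f - 1)) == 0;
}
```

      With the quoted values, demo_flags_conflict(MASTER_LL_FILE_RMTACL,
      B18_LL_FILE_LOCKED_DIRECTIO) is non-zero, which is the conflict reported
      here. A compile-time check of this kind in the shared header would catch
      a repeat of the same mistake.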

            People

              Assignee: nasf (yong.fan, inactive)
              Reporter: Andreas Dilger (adilger)