
ioctl passthrough mechanism for Lustre OST/MDT mountpoints

Details

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.1.0, Lustre 2.5.0
    • 14,489
    • 8383

    Description

      Implement an interface for sending I/O control (ioctl) commands from userspace through the Lustre mount point to the underlying ldiskfs filesystem, so that filesystem-wide ioctl() commands such as resize can be executed. This will allow userspace tools that operate via ioctl() commands on the filesystem mountpoint to be used on Lustre MDT and OST filesystems while they are mounted and in use, subject to any limitations of the original ioctl() commands themselves.
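
      As an illustration of the kind of userspace tool this is meant to support, the following is a minimal sketch of a filesystem-wide ioctl issued against a mountpoint. It uses FITRIM, one of the ioctls named later in this ticket's discussion. The helper names (`_iowr`, `trim_mountpoint`) are made up for this example, and the ioctl number encoding shown is the one used on common Linux architectures such as x86 and arm64; it differs on a few others. Running the trim itself normally requires root and a filesystem that supports discard.

      ```python
      import fcntl
      import os
      import struct

      # Linux ioctl number layout on common architectures (x86, arm64, ...).
      # Assumption: some architectures (e.g. powerpc, mips) use different bits.
      _IOC_NRSHIFT, _IOC_TYPESHIFT, _IOC_SIZESHIFT, _IOC_DIRSHIFT = 0, 8, 16, 30
      _IOC_WRITE, _IOC_READ = 1, 2


      def _iowr(type_char: str, nr: int, size: int) -> int:
          """Equivalent of the kernel's _IOWR(type, nr, struct) macro."""
          return (
              ((_IOC_READ | _IOC_WRITE) << _IOC_DIRSHIFT)
              | (size << _IOC_SIZESHIFT)
              | (ord(type_char) << _IOC_TYPESHIFT)
              | (nr << _IOC_NRSHIFT)
          )


      # struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }
      FSTRIM_RANGE = struct.Struct("=QQQ")
      FITRIM = _iowr("X", 121, FSTRIM_RANGE.size)


      def trim_mountpoint(mountpoint: str, minlen: int = 0) -> int:
          """Ask the filesystem mounted at `mountpoint` to discard free space.

          Returns the number of bytes the kernel reports as trimmed.
          """
          # Trim the whole filesystem: start at 0, length of "everything".
          buf = bytearray(FSTRIM_RANGE.pack(0, 2**64 - 1, minlen))
          fd = os.open(mountpoint, os.O_RDONLY)
          try:
              # The kernel writes the trimmed byte count back into `len`.
              fcntl.ioctl(fd, FITRIM, buf)
          finally:
              os.close(fd)
          return FSTRIM_RANGE.unpack(buf)[1]
      ```

      The point of the sketch is that the tool only ever sees a file descriptor for the mountpoint, so the server mount has to forward the request to the backing filesystem for the call to have any effect.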

          Activity

            [LU-137] ioctl passthrough mechanism for Lustre OST/MDT mountpoints

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58450
            Subject: LU-137 osd: better stat info for server mountpoints
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: dc0ae67cfd02662b863e86327bc8bcc98dcd7561

            pjones Peter Jones added a comment -

            Landed for 2.16. It's been a while since I closed a Jira ticket with a bugzilla id!

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/20161/
            Subject: LU-137 osd-ldiskfs: pass through resize ioctl
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ac0380dc519aa15310670d164e98453861ef332a

            adilger Andreas Dilger added a comment - - edited

            It may be that this has been fixed via patch https://review.whamcloud.com/33131 "Subject: LU-11355 lustre: enable fstrim on lustre device", which added a generic ioctl passthrough from userspace to ldiskfs.

            However, the last time I had tested this (several years ago) there were also some issues with e2fsprogs/resize2fs being unhappy that the block device (st_rdev) reported by the Lustre server stub mount did not match the underlying block device. That is what patch: http://review.whamcloud.com/20161 "LU-137 osd: better stat info for server mountpoints" was about, but I haven't updated it in several years.
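
            To make the mismatch concrete, here is a minimal sketch of the kind of consistency check a userspace tool might perform before operating on a mounted filesystem. It is a simplified illustration only, not the actual e2fsprogs logic, and the function name `mount_matches_device` is made up for this example. It assumes the tool compares the device number that stat() reports for the mount point with the `st_rdev` of the block device node.

            ```python
            import os
            import stat


            def mount_matches_device(mountpoint: str, device: str) -> bool:
                """Return True if `mountpoint` appears to live on block device `device`.

                Assumption: the check compares the mount point's st_dev with the
                device node's st_rdev. A stub mount that reports some other device
                number makes this return False even though the filesystem is
                really backed by `device`.
                """
                dev_st = os.stat(device)
                if not stat.S_ISBLK(dev_st.st_mode):
                    # Not a block device node, so there is nothing to compare.
                    return False
                return os.stat(mountpoint).st_dev == dev_st.st_rdev
            ```

            With a check like this, a server mountpoint that reports its own anonymous device number would be rejected, which is consistent with the resize2fs complaint described above.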

            simmonsja James A Simmons added a comment -

            Linking this since I plan on fixing the ioctl direction issue, which will provide a proper interface for this as well.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23092/
            Subject: LU-137 obdclass: add dt_object_put() and use it
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5963af745b3aa14410d5ceb66f8a7b7d6aaf576a

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23092
            Subject: LU-137 obdclass: add dt_object_put() and use it
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: cf4f59a5a7253e18ac98983216791cb730511d18

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/20161
            Subject: LU-137 osd: better stat info for server mountpoints
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: db5b4aaa6177d9ab179b46a8d3e5d13c5d2c2883

            adilger Andreas Dilger added a comment -

            Alex, what about adding a new dt_ioctl() method for the OSD API? It would of course be fine if the underlying OSD doesn't support the given ioctl (e.g. returns -ENOTTY), but this gives a way to add features like this. I see that this was handled for FIEMAP by adding a dbo_fiemap_get() method (though it could return -EOPNOTSUPP a LOT earlier in the request processing), but that might result in an explosion of different methods if we add one for every ioctl. Other candidates that we might need in the future include FITRIM to trim unallocated space for SSD or thin-provisioned devices, EXT4_IOC_PRECACHE_EXTENTS to prefetch file extent metadata, and EXT4_IOC_MOVE_EXT or EXT4_IOC_MIGRATE for data migration within the OST.

            It isn't yet clear to me if we want separate ioctl methods for the whole device and per file, or if it is OK to just do the "device" ioctls on the root inode.
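
            The trade-off raised in this comment (one generic ioctl entry point versus one method per ioctl) can be modelled with a small sketch. This is a Python model of the dispatch idea only, not Lustre kernel code: the classes `OsdDevice`, `LdiskfsOsd` and `ZfsOsd` and the handler `_trim` are hypothetical names for illustration, and the FITRIM value is the one used on common Linux architectures.

            ```python
            import errno

            FITRIM = 0xC0185879  # value on common Linux architectures (assumption)


            class OsdDevice:
                """Hypothetical stand-in for an OSD backend with one ioctl entry point."""

                # Maps ioctl command number -> handler; empty means nothing supported.
                ioctl_handlers = {}

                def ioctl(self, cmd, arg=None):
                    handler = self.ioctl_handlers.get(cmd)
                    if handler is None:
                        # Unsupported commands fail cleanly instead of crashing.
                        return -errno.ENOTTY
                    return handler(self, arg)


            class LdiskfsOsd(OsdDevice):
                """Backend that chooses to support FITRIM in this model."""

                def _trim(self, arg):
                    # A real backend would forward to the underlying filesystem here.
                    return 0

                ioctl_handlers = {FITRIM: _trim}


            class ZfsOsd(OsdDevice):
                """Backend with no passthrough ioctls in this model."""
            ```

            In this model, adding a new ioctl means adding one table entry in the backends that support it, while callers keep using the same `ioctl()` method and every other backend keeps returning -ENOTTY.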

            bzzz Alex Zhuravlev added a comment -

            I don't think you want to manage ZFS pools (or btrfs devices) directly. We can provide the vfsmount or superblock via osd_conf_get(), but again that will work for ldiskfs only.

            spimpale Swapnil Pimpale (Inactive) added a comment - - edited

            Hi Alex,

            This patch: http://review.whamcloud.com/#/c/8286/2 removes lsi_srv_mnt, lmi_mnt and ddp_mnt. The commit message says that the vfsmount has become redundant because of the introduction of the local storage device.
            The ioctl passthrough patch needs a valid vfsmount.

            The following is per discussion with Andreas:
            It seems ext4_ioctl() needs filp for mnt_{want,drop}_write_file(), which uses file->f_path.mnt for a number of things, so it really needs a valid vfsmnt structure. It looks like the vfsmount is still available in osd_dt_dev(lsi->lsi_dt_dev)->od_mnt, but od_mnt is specific to the underlying OSD device (ldiskfs or ZFS). It looks like there are no direct ioctls for the DMU; they are all handled via /dev/zfs. There are two ioctls for ZFS files in ZPL, but we don't use those, so this only needs to work for ldiskfs mountpoints for now. However, it isn't safe to dig into the OSD structure directly, since this could crash on a ZFS-backed filesystem. It might make sense to add the vfsmnt into dt_device_param. We might need to add a ->dt_ioctl() method to dt_device_operations, since accessing the superblock directly is also bad.

            Sadly, looking at this patch (LU-137) in light of ZFS-backed devices (i.e. anything 2.4 and later), it seems that there are quite a few things that are not "right" about it. ZFS doesn't even have a superblock, so that makes much of the patch invalid. Some of the info can be fetched via the osd_conf_get() interface; in particular, s_dev is important for the resize ioctl. Note that even accessing the osd_device outside of the OSD code isn't possible, because the osd_device structure is different for each OSD.

            What would be the best way to proceed in this case?

            People

              Assignee: adilger Andreas Dilger
              Reporter: adilger Andreas Dilger