Lustre / LU-14407

osd-zfs: Direct IO

Details

    • New Feature
    • Resolution: Unresolved
    • Minor
    • None
    • Upstream
    • None

    Description

      We're getting close to integrating proper direct IO support for ZFS and I wanted to start a conversation about how Lustre can best take advantage of it for very fast SSD/NVMe devices.

      From a functionality perspective, we've implemented Direct IO such that it entirely bypasses the ARC and avoids as many copies as possible. This includes the copy between user and kernel space (not really an issue for Lustre) as well as any copies in the IO pipeline. Obviously, if features like compression or encryption are enabled, those transforms of the data still need to happen. If they are not, we'll do the IO to disk with the provided user pages or, in Lustre's case, the pages from the loaned ARC buffer.

      The code in the OpenZFS Direct IO PR makes no functional changes to the ZFS interfaces Lustre is currently using, so when the PR is merged, Lustre's behavior when using ZFS OSSs shouldn't change at all. What we have done is provide a couple of new interfaces that Lustre can optionally use to request Direct IO on a per-dbuf basis.

      We've done some basic initial performance testing by forcing Lustre to always use the new Direct IO paths and have seen very good results. But I think what we really want is for Lustre to control more intelligently which IOs are submitted as buffered and which are direct. ZFS will guarantee coherency between buffered and direct IOs, so it's mainly a matter of how best to issue them.

      One idea would be to integrate with Lustre's existing readcache_max_filesize, read_cache_enable, and writethrough_cache_enable tunables, but I don't know how practical that would be. In the short term, I can propose a small patch which takes the simplest route and lets us enable or disable Direct IO for all IOs. That should provide a reasonable starting place to try out the new interfaces, and hopefully we can take it from there.
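
      To make the two options above concrete, here is a rough sketch of what the decision could look like. This is only an illustration, not the proposed patch: the struct, the enum, and osd_use_direct_io() are hypothetical names, and the mapping from the three tunables to a buffered/direct choice (cache the IO only when caching is enabled for that direction and the file is no larger than readcache_max_filesize) is an assumption about how the integration might work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the three existing Lustre tunables. */
struct osd_cache_tunables {
	bool     read_cache_enable;
	bool     writethrough_cache_enable;
	uint64_t readcache_max_filesize;   /* bytes */
};

/* Hypothetical switch: OFF/ON is the simple "all IOs" route,
 * AUTO is the tunable-driven policy (an assumption, see above). */
enum osd_dio_mode { OSD_DIO_OFF, OSD_DIO_ON, OSD_DIO_AUTO };

enum osd_io_dir { OSD_IO_READ, OSD_IO_WRITE };

/* Return true if this IO should be submitted as Direct IO. */
static bool osd_use_direct_io(const struct osd_cache_tunables *t,
			      enum osd_dio_mode mode,
			      enum osd_io_dir dir,
			      uint64_t file_size)
{
	bool cache_wanted;

	assert(t != NULL);

	/* Simplest route: one switch for every IO. */
	if (mode == OSD_DIO_OFF)
		return false;
	if (mode == OSD_DIO_ON)
		return true;

	/* AUTO: keep the IO buffered only if caching is enabled for this
	 * direction and the file is small enough to be worth caching. */
	cache_wanted = (dir == OSD_IO_READ) ? t->read_cache_enable
					    : t->writethrough_cache_enable;

	if (cache_wanted && file_size <= t->readcache_max_filesize)
		return false;   /* buffered, goes through the ARC */

	return true;            /* direct, bypasses the ARC */
}
```

      With something like this, the short-term patch would only need the OFF/ON cases, and the AUTO case could be added later if tying into the existing tunables turns out to be practical.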

          People

            behlendorf Brian Behlendorf
            Votes: 0
            Watchers: 8
