Lustre / LU-11621

Add copy_file_range() API and use it for lfs migrate and mirror resync

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      The copy_file_range() API was added in kernel 4.5 and allows copying data between files without copying it into userspace and then back into the kernel. This should significantly speed up file copies, both for applications that support this interface and for Lustre tools such as lfs migrate and lfs mirror resync that may copy a lot of file data. While migrate and resync use O_DIRECT to avoid the data copy to/from userspace, O_DIRECT has other unfortunate effects, such as forcing synchronous read and write operations, and has often been slower than copying the data with async writes (e.g. LU-10278).

      The copy_file_range() API in theory allows server-side offload of data copies between files in the same filesystem (and in the near future possibly between different filesystems), as is already implemented for NFS and CIFS. This could be tied into the HSM copytool via LU-6081 to avoid the need to copy the data to/from the client. Once the basic API is available, the copytool itself could also be modified to use copy_file_range() to avoid the data copy on the HSM agent node and improve efficiency.

      This ticket should focus on implementing the basic API and using it in a few of the built-in tools. Separate tickets can be used to implement this in the lhsmtool-posix copytool and to push the copy action over to the HSM agent nodes.
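
      For illustration only, a minimal userspace sketch of how a copy tool might call copy_file_range() and fall back to a plain read/write loop when the kernel or filesystem does not support it. This is not the lfs migrate or lfs mirror resync code. The helper names copy_bounce() and copy_fd_data() are hypothetical, as is the choice of which errno values trigger the fallback. The sketch goes through syscall(2) on the assumption that it may need to build against a C library that lacks the copy_file_range() wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fallback: bounce the data through a userspace buffer. */
static ssize_t copy_bounce(int fd_in, int fd_out, size_t len)
{
	char buf[65536];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t nr = read(fd_in, buf, want);

		if (nr < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (nr == 0)
			break;	/* end of source file */
		for (ssize_t off = 0; off < nr; ) {
			ssize_t nw = write(fd_out, buf + off, (size_t)(nr - off));

			if (nw < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += nw;
		}
		done += (size_t)nr;
	}
	return (ssize_t)done;
}

/*
 * Hypothetical helper: copy up to len bytes from fd_in to fd_out, starting
 * at each descriptor's current file offset. Returns the number of bytes
 * copied (short only when the source hits EOF), or -1 with errno set.
 */
static ssize_t copy_fd_data(int fd_in, int fd_out, size_t len)
{
	size_t done = 0;

	while (done < len) {
		/* NULL offset pointers: use and advance the file offsets. */
		long n = syscall(SYS_copy_file_range, fd_in, NULL, fd_out, NULL,
				 len - done, 0U);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			/* Assumed set of "not supported here" errors. */
			if (errno == ENOSYS || errno == EXDEV ||
			    errno == EINVAL || errno == EOPNOTSUPP) {
				ssize_t r = copy_bounce(fd_in, fd_out, len - done);

				return r < 0 ? -1 : (ssize_t)done + r;
			}
			return -1;
		}
		if (n == 0)
			break;	/* end of source file */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

      A caller would open the source and destination files, position both descriptors, and call copy_fd_data(src_fd, dst_fd, length). Because the call may return fewer bytes than requested, the loop keeps going until the request is satisfied or the source reaches EOF.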

      commit 29732938a6289a15e907da234d6692a2ead71855
      Author:     Zach Brown <zab@redhat.com>
      AuthorDate: Tue Nov 10 16:53:30 2015 -0500
      
          vfs: add copy_file_range syscall and vfs helper
          
          Add a copy_file_range() system call for offloading copies between
          regular files.
          
          This gives an interface to underlying layers of the storage stack which
          can copy without reading and writing all the data.  There are a few
          candidates that should support copy offloading in the nearer term:
          
          - btrfs shares extent references with its clone ioctl
          - NFS has patches to add a COPY command which copies on the server
          - SCSI has a family of XCOPY commands which copy in the device
          
          This system call avoids the complexity of also accelerating the creation
          of the destination file by operating on an existing destination file
          descriptor, not a path.
          
          Currently the high level vfs entry point limits copy offloading to files
          on the same mount and super (and not in the same file).  This can be
          relaxed if we get implementations which can copy between file systems
          safely.
      

      Later patches implement the ->copy_file_range method for various filesystems:

      3db11b2eecc0 btrfs: add .copy_file_range file operation
      2e72448b07dc NFS: Add COPY nfs operation
      9fe26045e98f xfs: add clone file and clone range vfs functions
      620d8745b35d cifs: Introduce cifs_copy_file_range()
      

              People

              • Assignee: simmonsja James A Simmons
              • Reporter: adilger Andreas Dilger