  1. Lustre
  2. LU-13397

lfs migrate/mirror extend/resync does not preserve sparse file



    • Bug
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • None
    • Lustre 2.15.0


      While testing "lfs migrate", "lfs mirror extend", and "lfs mirror resync", I was doing high-offset writes to initialize the later PFL components of the file layout (e.g. at 1GB and 32GB offsets). Mirroring or resyncing those files to another mirror copy failed with ENOSPC, because the holes were copied as data and there was not enough free space in the test filesystem to write it all out.

      These commands should be updated to share a common "data copy" routine (if they don't already) to reduce code duplication when fixing these issues. Then, the code needs to handle sparse input files properly: first by detecting sparse files (e.g. blocks << size) to decide whether the source file should be scanned for holes, and then by not copying those holes. There should probably be an option added to each command like "--sparse=<auto,no,yes>" (default = auto) to force a specific behavior.
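      As a rough illustration of the "blocks << size" check and the proposed "--sparse" modes, here is a minimal Python sketch. It is not lfs code: the function names and the 0.5 ratio are assumptions made up for this example, and the real threshold would need tuning.

```python
import os


def looks_sparse(size: int, blocks: int, ratio: float = 0.5) -> bool:
    """Return True if the allocated space is well below the logical size.

    `blocks` is st_blocks from stat(2), counted in 512-byte units.
    `ratio` is an assumed threshold, not a value taken from Lustre.
    """
    if size == 0:
        return False
    return blocks * 512 < size * ratio


def skip_holes(mode: str, path: str) -> bool:
    """Decide whether the copy should look for holes in the source file.

    `mode` mirrors the proposed --sparse=<auto,no,yes> option.
    """
    if mode == "yes":
        return True
    if mode == "no":
        return False
    if mode != "auto":
        raise ValueError(f"unknown sparse mode: {mode!r}")
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

      With "auto", a file written only at a 32GB offset would have a large st_size but few allocated blocks, so the hole-aware path would be chosen, while a dense file would take the plain copy path.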

      Unfortunately, there is no optimal way to handle reading of sparse files in Lustre today. In all cases, it makes little sense to be doing these operations on in-use files, so there are already checks if the file is modified during migrate/mirror/resync.

      • For 1-stripe files, ioctl(FIEMAP) returns a current map of the data in the file (it flushes dirty data first if FIEMAP_FLAG_SYNC is used, but does not prevent further modification). Multi-stripe and PFL files return multiple maps in per-object offset order, which is not useful if the source and target files use different layouts (likely a common case). Also, ZFS does not yet support FIEMAP, despite some efforts in that direction.
      • Using SEEK_HOLE and SEEK_DATA would be the preferred solution, but this needs a Lustre-level update to pass them through from the client to the OST (and to the MDT for DoM). This is described in LU-10801, and may be able to leverage some infrastructure from the patch https://review.whamcloud.com/9275 "LU-3606 fallocate: Implement fallocate preallocate operation". SEEK_HOLE and SEEK_DATA do "work" on Lustre today through kernel emulation, but the emulation just assumes that every block is "data" and that the first hole is at the end of the file.
      • The simplest (though least efficient) option would be to do zero-block detection during the copy phase. This has quite high CPU and IO overhead, because it requires reading the whole file and checking every byte. It would only be done if the file appears to be very sparse and the layout is complex (i.e. not a single stripe, where FIEMAP is useful).
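      To make the zero-block detection option concrete, here is a minimal Python sketch of such a copy loop. It only illustrates the idea and is not the lfs implementation: the function name and the 1MiB default block size are assumptions. Blocks that are entirely zero are skipped with a seek instead of being written, and the target is truncated to the source length at the end so that a trailing hole is preserved.

```python
def copy_skipping_zero_blocks(src, dst, block_size=1 << 20):
    """Copy src to dst, leaving holes where src has all-zero blocks.

    src and dst are binary file objects; dst must be seekable and
    should be a newly created, empty file.
    Returns (bytes_written, bytes_skipped).
    """
    zero = bytes(block_size)
    written = skipped = 0
    offset = 0
    while True:
        buf = src.read(block_size)
        if not buf:
            break
        if buf == zero[:len(buf)]:
            # All zeroes: do not write, so the target keeps a hole here.
            skipped += len(buf)
        else:
            dst.seek(offset)
            dst.write(buf)
            written += len(buf)
        offset += len(buf)
    # Set the final size explicitly in case the file ends in a hole.
    dst.truncate(offset)
    return written, skipped
```

      Note that every byte of the source is still read and compared, which is the CPU and IO overhead mentioned above; only the writes are avoided. A block that really contains zeroes is indistinguishable from a hole with this method, so the copy may end up sparser than the source.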





              tappro Mikhail Pershin
              adilger Andreas Dilger