[LU-682] optimization for Lustre-tar on completely sparse files.

Details

    • Improvement
    • Resolution: Won't Fix
    • Minor

    Description

      Kit Westneat commented:

      "Older versions of tar have to read in the entire file to figure out
      what parts are sparse. Newer versions should skip that if the number of
      blocks is 0, but I'm not sure if that made it into lustre-tar yet.

      Here's the patch:
      http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html"

      This hasn't made it into lustre-tar yet, and may be worth looking into.

          Activity

            rhenwood Richard Henwood (Inactive) added a comment:

            No longer relevant. tar with --sparse is used for file-level backup of an MDT. Restoring an MDT from a file-level backup is only supported on 2.3 and beyond, 2.3 only supports RHEL6, and the RHEL6 tar distribution includes the sparse patch.

            rhenwood Richard Henwood (Inactive) added a comment:

            I haven't seen the beta, but Red Hat has confirmed to me that tar will be 1.23 plus patches for sparse files (among others, I assume).

            Once 6.3 is available, RHEL6 users will have more choices: use vanilla Red Hat tar from 6.3, use WC tar, or roll your own GNU tar >1.24.

            WC tar was created specifically to target RHEL5, with the requirement that we build on the same platform we run on.

            WC tar achieves this. If I remember correctly, the problem with bumping the GNU tar version is that more recent versions (>1.23), which include the sparse and other patches, require a version of autoconf (>2.60) that is not readily available on RHEL5.

            nrutman Nathan Rutman added a comment:

            These patches are in RHEL 6.3 beta tar 1.23-7:

            1. Optimization for -c --sparse for completely sparse files (#760665)
              Patch12: tar-1.23-optimize-packing-entirely-sparse-files.patch
            2. Fix for filename corruption when the --sparse and --posix options are used (#656834)
              Patch9: tar-1.23-long-name-corruption.patch

            nrutman Nathan Rutman added a comment:

            Does this mean WC's tar should be replaced by mainstream tar 1.24?

            adilger Andreas Dilger added a comment:

            This is available in the RHEL5 patched lustre-tar, and will be available in RHEL6.3 as well.

            rhenwood Richard Henwood (Inactive) added a comment:

            I've been told that RHEL 6.3 will include the completely sparse file optimization patch.

            Users on 6.0, 6.1 and 6.2 will be able to use tar from 6.3 when it is available.

            rhenwood Richard Henwood (Inactive) added a comment:

            It seems that this patch is upstream in GNU tar, starting at version 1.24.

            rhenwood Richard Henwood (Inactive) added a comment:

            Andreas adds: "This is useful for 1.8.x MDTs right now, and once Fan Yong has implemented OI Scrub it will also be useful for 2.x MDTs."

            rhenwood Richard Henwood (Inactive) added a comment:

            Use Case

            An administrator wishes to perform a file-level backup of an MDT.

            An MDT on a production file system may hold many millions of files. Each of these files is completely sparse (ST_NBLOCKS is zero). Without this patch, tar scans large, completely sparse files even though they have no allocated blocks, which is time-consuming.
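
            The condition the use case depends on can be sketched in C. This is only an illustration, not code from tar or lustre-tar: is_completely_sparse is a hypothetical helper name, and it reads st_blocks directly where tar goes through its ST_NBLOCKS macro. It assumes a POSIX system on which st_blocks reports the blocks actually allocated to the file.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not part of tar): returns true when the regular
   file at `path` has a non-zero size but no allocated blocks, i.e. the
   whole file is a hole and there is no data worth scanning. */
bool
is_completely_sparse (const char *path)
{
  struct stat st;

  if (stat (path, &st) != 0)
    return false;               /* treat stat errors as "not sparse" */

  return S_ISREG (st.st_mode) && st.st_size > 0 && st.st_blocks == 0;
}
```

            A backup tool could call such a check once per file and skip the read loop whenever it returns true, which is the saving the patch is after.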

            rhenwood Richard Henwood (Inactive) added a comment:

            The patch looks something like this:

            --- tar-1.19/orig/src/sparse.c
            +++ tar-1.19/src/sparse.c
            @@ -216,15 +216,17 @@
               struct tar_stat_info *st = file->stat_info;
               int fd = file->fd;
               char buffer[BLOCKSIZE];
            -  size_t count;
            +  size_t count = 0;
               off_t offset = 0;
               struct sp_array sp = {0, 0};
             
            -  if (!lseek_or_error (file, 0))
            -    return false;
            -
               st->archive_file_size = 0;
               
            +  if (ST_NBLOCKS (st->stat) == 0)
            +    offset = st->stat.st_size;
            +  else
            +    {
            +
               if (!tar_sparse_scan (file, scan_begin, NULL))
                 return false;
             
            @@ -254,6 +256,7 @@
             
                   offset += count;
                 }
            +  }
             
               if (sp.numbytes == 0)
                 sp.offset = offset;
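
            Read on its own, the control flow the patch introduces can be sketched as a small standalone function. This is my reading of the diff above, not tar's actual code: struct sp_entry, scan_fn and build_final_entry are hypothetical names, and the scan callback merely stands in for the existing read-and-scan loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Simplified stand-in for tar's sparse-map entry; field names follow the
   `struct sp_array sp = {0, 0}` initializer seen in the patch. */
struct sp_entry
{
  off_t offset;
  size_t numbytes;
};

/* Hypothetical callback standing in for the existing scan loop: it fills
   in the last data region and the offset at which scanning stopped. */
typedef bool (*scan_fn) (off_t size, struct sp_entry *last, off_t *end);

/* Sketch of the patched flow: when no blocks are allocated, jump straight
   to the end of the file instead of reading it.  `scan` must be non-NULL
   whenever nblocks is greater than zero. */
bool
build_final_entry (off_t size, blkcnt_t nblocks, scan_fn scan,
                   struct sp_entry *out)
{
  struct sp_entry sp = {0, 0};
  off_t offset = 0;

  if (nblocks == 0)
    offset = size;              /* completely sparse: skip the scan */
  else if (!scan (size, &sp, &offset))
    return false;

  if (sp.numbytes == 0)
    sp.offset = offset;         /* zero-length entry placed at the end */

  *out = sp;
  return true;
}
```

            For a completely sparse file the result is a single zero-length entry whose offset equals the file size, obtained without reading any of the file.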

            People

              rhenwood Richard Henwood (Inactive)
              rhenwood Richard Henwood (Inactive)