Details

    • New Feature
    • Resolution: Fixed
    • Minor
    • Lustre 2.4.0
    • Lustre 2.4.0
    • HSM
    • 4118

    Description

      Add the ability to swap the layouts between two files.
      This feature will be used to migrate files inside Lustre (e.g. for rebalancing) and by HSM to restore files from archive.
      This patch is the MDD part of the feature. The high-level design is:
      – Start transaction
      – Read LOV EA from obj1 & obj2
      – Run xattr_set(obj1, layout2) & xattr_set(obj2, layout1)
      (LOD should not take any action)
      – Stop transaction
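
      The ordering in the steps above (read both layouts first, then write each one onto the other object) can be illustrated with a small in-memory model. This is only a sketch: `struct toy_obj`, `toy_xattr_get()`, `toy_xattr_set()`, `toy_swap_layouts()` and `LAYOUT_MAX` are hypothetical names invented for illustration and are not the MDD or OSD API, and the transaction is represented only by comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define LAYOUT_MAX 64   /* hypothetical cap on the size of a layout EA */

/* Hypothetical stand-in for an object: only its layout EA is modelled. */
struct toy_obj {
        char   layout[LAYOUT_MAX];
        size_t layout_len;
};

/* Stand-in for reading the LOV EA: returns its length, or -ERANGE. */
static int toy_xattr_get(const struct toy_obj *o, char *buf, size_t size)
{
        if (o->layout_len > size)
                return -ERANGE;
        memcpy(buf, o->layout, o->layout_len);
        return (int)o->layout_len;
}

/* Stand-in for xattr_set(): overwrites the object's layout EA. */
static int toy_xattr_set(struct toy_obj *o, const char *buf, size_t len)
{
        if (len > LAYOUT_MAX)
                return -ERANGE;
        memcpy(o->layout, buf, len);
        o->layout_len = len;
        return 0;
}

/* Read both layouts before writing either, then cross-assign them. */
static int toy_swap_layouts(struct toy_obj *o1, struct toy_obj *o2)
{
        char l1[LAYOUT_MAX], l2[LAYOUT_MAX];
        int len1, len2, rc;

        assert(o1 != NULL && o2 != NULL);

        /* a real implementation would start the transaction here */
        len1 = toy_xattr_get(o1, l1, sizeof(l1));
        if (len1 < 0)
                return len1;
        len2 = toy_xattr_get(o2, l2, sizeof(l2));
        if (len2 < 0)
                return len2;

        rc = toy_xattr_set(o1, l2, (size_t)len2);
        if (rc == 0)
                rc = toy_xattr_set(o2, l1, (size_t)len1);
        /* ... and stop the transaction here, rolling back if rc != 0 */
        return rc;
}
```

      In this model both layouts are copied out before either object is modified, so a failed read leaves both objects untouched. Making the two writes atomic is the job of the surrounding transaction, which the sketch does not model.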

          Activity

            [LU-2016] Layout swapping, MDD part

            adilger Andreas Dilger added a comment - Change 4189 is landed, can this be closed?

            I am implementing lfs swap_layouts in LU-2017, and I should push a first version next week.
            I have written a simple test without concurrent access, and I will add a concurrent access test.

            About the open: today I open with O_WRONLY instead of O_RDONLY because swapping changes the file content.
            With O_RDONLY, someone could swap a file they should not be able to modify.

            jcl jacques-charles lafoucriere added the comment above.
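
            The O_WRONLY argument can be sketched with a small helper. This is a minimal illustration, not the code in the patch: `open_for_swap()` is a hypothetical name, and the only point it makes is that opening with O_WRONLY lets the kernel check write permission before any swap is attempted.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file that is about to take part in a
 * layout swap. O_WRONLY makes open() fail for callers without write
 * permission, so a read-only user cannot replace the file content.
 * Returns a file descriptor, or a negative errno value on failure.
 */
static int open_for_swap(const char *path)
{
        int fd;

        assert(path != NULL);
        fd = open(path, O_WRONLY);
        return fd < 0 ? -errno : fd;
}
```

            A caller would pass the two descriptors returned by this helper to the swap call and close them afterwards.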

            I was thinking that a good sanity test for the layout swap and layout lock code would be to implement an "lfs swap" command and hook this into the "lfs_migrate" script instead of using "mv". The core "lfs swap"/"lfs_swap" command functionality should be implemented via new llapi helpers:

            int llapi_fd_layout_swap(int src_fd, int tgt_fd, __u64 flags)
            {
                    struct ll_layout_swap lls = { .lls_tgt = tgt_fd, .lls_flags = flags };
            
                    /* the ioctl is issued on the source file descriptor */
                    return ioctl(src_fd, LL_IOC_LOV_LAYOUT_SWAP, (void *)&lls);
            }
            
            int llapi_file_layout_swap(const char *src_path, const char *tgt_path, __u64 flags)
            {
                    int src_fd, tgt_fd;
            
                    src_fd = open(src_path, O_RDONLY);
                    tgt_fd = open(tgt_path, O_RDONLY);
                    /* error handling and such */
            
                    return llapi_fd_layout_swap(src_fd, tgt_fd, flags);
            }
            

            Where ll_file_ioctl(LL_IOC_LOV_LAYOUT_SWAP) triggers an MDS RPC to call mdd_swap_layouts().

            "lfs swap" wouldn't be 100% atomic yet, since the copy + swap would not yet be safe from other threads modifying the original object during the copy. It would at least fix the problem of preserving open file handles on the original file, and it would be a good step toward implementing a fully atomic "lfs migrate".

            A test script for this code could look like:

            # generate 10s of verifiable IO data
            dd if=/dev/urandom of=$DIR/$tfile.orig bs=1M &
            DD_PID=$!       # PID of the background dd
            sleep 10
            kill $DD_PID

            # copy the data, while swapping the layout repeatedly and
            # changing the stripe count to exercise the IO/LOV code
            dd if=$DIR/$tfile.orig of=$DIR/$tfile.copy bs=1M &
            DD_PID=$!
            sleep 0.5
            while kill -STOP $DD_PID; do
                    lfs_migrate -y -c $((RANDOM % OSTCOUNT)) $DIR/$tfile.copy
                    kill -CONT $DD_PID
                    sleep 1
            done

            # verify that the migrated file matches the original
            # lfs_migrate itself verifies each copy after it is made
            cmp $DIR/$tfile.orig $DIR/$tfile.copy || error "compare failed"

            At some later point (or immediately, if someone is ambitious) lfs_migrate would be implemented with "lfs migrate", which calls lfs_setstripe() internally to process the arguments to create the target object(s), then gets the group lock and does the data copy internally before calling llapi_fd_layout_swap() to do the swap (like an HSM copytool does). At this point, the above "kill -STOP" can be replaced with "kill -0" (just to check that "dd" is still running) and should be able to migrate the file while it is being written.

            adilger Andreas Dilger added the comment above.
            jcl jacques-charles lafoucriere added a comment - Patch is at http://review.whamcloud.com/4189
            pjones Peter Jones added a comment - Jodi will see that this ticket is appropriately assigned

            People

              jlevi Jodi Levi (Inactive)
              cealustre CEA