[LU-16587] Make lfs migrate faster Created: 22/Feb/23 Updated: 02/Aug/23 Resolved: 03/Apr/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Nathan Rutman | Assignee: | Nathan Rutman |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Most of the time lfs migrate uses a 1MB buffer (stripe size) to copy data. This is terribly slow. In my testing I see improved performance up to 64M buffers.

[root@kjlmo4n00 16G]# lfs getstripe 16G.1
16G.1
lmm_stripe_count:  1
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    11
lmm_stripe_offset: 1
lmm_pool:          flash
    obdidx    objid     objid      group
    1         394422    0x604b6    0

[root@kjlmo4n00 16G]# time lfs migrate -S 1M -p flash 16G.1
real 0m25.341s
user 0m0.001s
sys  0m2.606s

[root@kjlmo4n00 16G]# time /root/tools/lfs_nzr migrate -S 1M -p flash 16G.1
real 0m6.526s
user 0m0.000s
sys  0m2.177s

You can force lfs to use a bigger buffer by increasing stripe size and you can see similar improvements.
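To illustrate why the buffer size matters, here is a minimal sketch of a chunked copy loop. It is not the code in lfs.c: the helper name `copy_with_buf` is hypothetical, and it uses plain `read`/`write` rather than whatever I/O mode lfs migrate selects. Each pass through the loop costs one read call and at least one write call, so a 16 GiB file needs 16384 passes with a 1 MiB buffer but only 256 with a 64 MiB buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from lfs.c): copy everything readable from
 * fd_src to fd_dst using a single buffer of buf_size bytes.
 * Returns the number of bytes copied, or -1 on error.
 */
static ssize_t copy_with_buf(int fd_src, int fd_dst, size_t buf_size)
{
	char *buf;
	ssize_t total = 0;

	assert(buf_size > 0);
	buf = malloc(buf_size);
	if (buf == NULL)
		return -1;

	for (;;) {
		/* One read per iteration: bigger buffer, fewer iterations. */
		ssize_t nread = read(fd_src, buf, buf_size);
		ssize_t off = 0;

		if (nread < 0 && errno == EINTR)
			continue;
		if (nread < 0) {
			total = -1;
			break;
		}
		if (nread == 0)		/* end of input */
			break;

		/* write() may be partial, so loop until the chunk is out. */
		while (off < nread) {
			ssize_t nwritten = write(fd_dst, buf + off,
						 (size_t)(nread - off));

			if (nwritten < 0) {
				if (errno == EINTR)
					continue;
				free(buf);
				return -1;
			}
			off += nwritten;
		}
		total += nread;
	}

	free(buf);
	return total;
}
```

Calling `copy_with_buf(fd_src, fd_dst, 64 * 1024 * 1024)` instead of passing 1 MiB changes only the chunk size, which is the same knob the timings above exercise.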
| Comments |
| Comment by Nathan Rutman [ 22/Feb/23 ] |
|
Sigh. Permission denied trying to upload to Gerrit; stashing patch here for now.

diff --git a/lustre/utils/lfs.c b/lustre/utils/lfs.c
index 999e6357a1..c2aa48d071 100644
--- a/lustre/utils/lfs.c
+++ b/lustre/utils/lfs.c
@@ -832,7 +832,7 @@ static int migrate_copy_data(int fd_src, int fd_dst, int (*check_file)(int),
                              long stats_interval_sec, off_t file_size_bytes)
 {
        struct llapi_layout *layout;
-       size_t buf_size = 4 * 1024 * 1024;
+       size_t buf_size = 64 * 1024 * 1024;
        void *buf = NULL;
        off_t pos = 0;
        off_t data_end = 0;
@@ -850,8 +850,14 @@ static int migrate_copy_data(int fd_src, int fd_dst, int (*check_file)(int),
                uint64_t stripe_size;
                rc = llapi_layout_stripe_size_get(layout, &stripe_size);
-               if (rc == 0)
-                       buf_size = stripe_size;
+               if (rc == 0) {
+                       /* We like big bufs */
+                       if (stripe_size > buf_size)
+                               buf_size = stripe_size;
+                       else
+                               /* Trim to stripe_size multiple */
+                               buf_size -= buf_size % stripe_size;
+               }
                llapi_layout_free(layout);
        }
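The sizing rule in the patch can be read in isolation as the sketch below. The function name `pick_copy_buf_size` and the `DEFAULT_BUF_SIZE` macro are hypothetical, introduced only for illustration. The `assert` on a non-zero stripe size is also an assumption of this sketch, added because the modulo would otherwise divide by zero; the patch itself has no such check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64 MiB default, matching the new initial value of buf_size in the patch. */
#define DEFAULT_BUF_SIZE ((size_t)64 * 1024 * 1024)

/*
 * Hypothetical helper mirroring the patch logic: start from a large
 * default buffer, grow to the stripe size if the stripe is larger,
 * otherwise round the default down to a whole number of stripes.
 */
static size_t pick_copy_buf_size(uint64_t stripe_size)
{
	size_t buf_size = DEFAULT_BUF_SIZE;

	/* Assumption of this sketch: callers never pass a zero stripe size. */
	assert(stripe_size > 0);

	if (stripe_size > buf_size)
		buf_size = (size_t)stripe_size;
	else
		/* Trim to a stripe_size multiple. */
		buf_size -= buf_size % stripe_size;

	return buf_size;
}
```

With a 1 MiB stripe the buffer stays at 64 MiB. With a 3 MiB stripe it is trimmed to 63 MiB, the largest multiple of the stripe size that fits. With a 128 MiB stripe it grows to 128 MiB.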
| Comment by Patrick Farrell [ 22/Feb/23 ] |
|
Nathan, share your Gerrit error message here and we might be able to help. I sort of thought we'd already done this when we switched it to using direct I/O... Oops. DIO won't help at all if the buffer size is still small... |
| Comment by Gerrit Updater [ 22/Feb/23 ] |
|
"Nathan Rutman <nrutman@gmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50118 |
| Comment by Andreas Dilger [ 23/Feb/23 ] |
|
Patrick, do we default to O_DIRECT now that your optimizations have landed? Presumably that would only get faster over time, and would avoid thrashing the client page cache when migrating a lot of files. |
| Comment by Patrick Farrell [ 23/Feb/23 ] |
|
Well, now you've made me check... Yes - We default to direct I/O and let you turn it off with 'D' for non-direct mode. And now that Nathan is increasing the buffer size, it will actually be faster. Oops. |
| Comment by Gerrit Updater [ 21/Mar/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50118/ |
| Comment by Peter Jones [ 22/Mar/23 ] |
|
Landed for 2.16 |
| Comment by Gerrit Updater [ 24/May/23 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51115 |
| Comment by Gerrit Updater [ 24/May/23 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51116 |