[LU-13482] add verbose stats to lfs_migrate and "lfs migrate" Created: 23/Apr/20  Updated: 30/Jun/23  Resolved: 03/Feb/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Improvement Priority: Major
Reporter: Andreas Dilger Assignee: Tim Day
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-14212 DNE3: directory migration progress mo... Open
is related to LU-16632 sanity test_56xh: 'lfs migrate -W' to... Resolved
is related to LU-16499 merge "lfs_migrate" functionality int... Open
Rank (Obsolete): 9223372036854775807
Epic Link: OST rebalance v1

Description

It would be useful to add progress stats to "lfs_migrate" and "lfs migrate", enabled by an option such as "--verbose=stats" or "--stats". The reporting interval could be configurable (e.g. "--stats-interval=N") and should default to roughly once every 5 seconds. This option should be passed through from "lfs_migrate" to "lfs migrate".

A "percent completion" bar is not practical for "lfs_migrate", since the number of files and their total size are not known in advance. It should definitely be possible, however, for "lfs migrate" to show a percent completion metric for individual files that take a long time to migrate (not for every small file, unless --verbose is used and the filename would be shown anyway).
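The rule for when to print per-file progress can be sketched as below. This is only an illustration in Python, not the lfs implementation: the helper names and the 5-second threshold (min_elapsed_sec) are assumptions chosen to match the suggested default interval.

```python
def should_show_progress(elapsed_sec, verbose=False, min_elapsed_sec=5.0):
    """Decide whether to print a progress line for the current file.

    Small files finish before the (assumed) threshold and stay silent,
    unless verbose output was requested.
    """
    return verbose or elapsed_sec >= min_elapsed_sec


def percent_done(copied, size):
    """Percent of the file copied so far; an empty file counts as done."""
    if size <= 0:
        return 100.0
    return min(100.0, 100.0 * copied / size)
```

With these helpers, a file that completes in a fraction of a second prints nothing by default, while a long-running migration starts reporting once the threshold has passed.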

For "lfs migrate" it can show something like the following in YAML format, so that it is also possible to parse this output for consumption by other tools:

filename:
- { seconds: <seconds>, rmbps: <MiB/s>, wmbps: <MiB/s>, copied: <MiB>, size: <MiB>, pct: <done>% }

The status update line can end in a newline "\n" when the output is logged or piped to another program, or in only a carriage return "\r" (so that each update overprints the previous line, for more user-friendly output) when writing to a terminal; isatty() can be used to tell the two cases apart. The seconds: field is the total elapsed time. The rmbps: (bytes read over read time) and wmbps: (bytes written over write time) values would be a decaying average data transfer rate across files if a single invocation is migrating multiple files, since small files would barely have time to print anything before they complete and instantaneous transfer rates may not be very useful. copied: is the amount of data written so far for the current file, size: is the total file size, and pct: = copied/size.
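A minimal sketch of the per-file status line follows, written in Python for brevity rather than in the C used by lfs. The class and function names are hypothetical, and the smoothing weight alpha is an assumption; the ticket only asks for "a decaying average" without naming a formula, so an exponentially weighted average is used here as one possible choice.

```python
import sys

MIB = 1024 * 1024


class DecayingRate:
    """Exponentially decaying average of a transfer rate in MiB/s."""

    def __init__(self, alpha=0.25):
        # alpha is an assumed smoothing weight, not taken from the ticket.
        self.alpha = alpha
        self.value = None

    def update(self, nbytes, seconds):
        """Fold one sample (bytes moved over elapsed seconds) into the average."""
        if seconds <= 0:
            return self.value
        sample = nbytes / MIB / seconds
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


def format_stats(seconds, rmbps, wmbps, copied, size):
    """Build one flow-style YAML entry; copied and size are given in bytes."""
    pct = 100.0 * copied / size if size else 100.0
    return ("- {{ seconds: {:.1f}, rmbps: {:.1f}, wmbps: {:.1f}, "
            "copied: {:.0f}, size: {:.0f}, pct: {:.0f}% }}").format(
                seconds, rmbps, wmbps, copied / MIB, size / MIB, pct)


def line_end(stream=sys.stdout):
    """Carriage return on a terminal (overprint), newline for logs and pipes."""
    return "\r" if stream.isatty() else "\n"
```

A caller would write `format_stats(...) + line_end()` to stdout at each interval, and emit a final newline once the file completes so that the last overprinted line is not lost.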

For "lfs_migrate", what can be shown is the elapsed time and the current decaying average values for the total volume/rate of data migrated and the number/rate of files migrated. Since this output will be intermingled with the "lfs migrate" output for each file, it makes sense to format it in a similar YAML style so that the full log is still readable, while global progress can still be easily distinguished from per-file progress. The lfs_migrate script can use test -t 1 (stdout, rather than file descriptor 0, which is stdin) to determine whether it is writing to a terminal or a log file, and therefore whether a carriage return \r or a linefeed \n is used at the end of the line:

directory:
- { tot_sec: <seconds>, tot_files: <count>, avg_fps: <files_per_sec>, avg_rmbps: <MiB/s>, avg_wmbps: <MiB/s>, tot_size: <MiB> }

tot_sec: is the total elapsed time for the current migration, tot_files: is the total number of files (all entries) migrated so far, and avg_fps: = tot_files/tot_sec. The totals line would only be shown once every stats-interval seconds.
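The bookkeeping behind the totals line could look roughly like the sketch below. It is written in Python for illustration, whereas lfs_migrate itself is a shell script; the class name, method names, and the injectable clock are all hypothetical. The avg_rmbps: and avg_wmbps: fields are left out to keep the example short.

```python
import time

MIB = 1024 * 1024


class MigrateTotals:
    """Running totals across all files migrated in one invocation."""

    def __init__(self, stats_interval=5.0, clock=time.monotonic):
        self.stats_interval = stats_interval  # seconds between totals lines
        self.clock = clock                    # injectable so it can be tested
        self.start = clock()
        self.last_report = self.start
        self.tot_files = 0
        self.tot_bytes = 0

    def add_file(self, nbytes):
        """Record one migrated entry and its size in bytes."""
        self.tot_files += 1
        self.tot_bytes += nbytes

    def report_due(self):
        """True once stats_interval seconds have passed since the last report."""
        return self.clock() - self.last_report >= self.stats_interval

    def format_line(self):
        """Build the totals entry and restart the reporting interval."""
        now = self.clock()
        self.last_report = now
        tot_sec = now - self.start
        avg_fps = self.tot_files / tot_sec if tot_sec > 0 else 0.0
        return ("- {{ tot_sec: {:.0f}, tot_files: {}, avg_fps: {:.1f}, "
                "tot_size: {:.0f} }}").format(
                    tot_sec, self.tot_files, avg_fps, self.tot_bytes / MIB)
```

The main loop would call add_file() after each migrated entry and print format_line() only when report_due() returns true, which keeps the totals from flooding the log when many small files are migrated.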



Comments
Comment by Andreas Dilger [ 12/Jul/20 ]

It would also make sense for lfs_migrate to include the migration rate of files per second if the files are small and the bandwidth computed for the transfer is not useful.

Comment by Gerrit Updater [ 12/Jan/23 ]

"Timothy Day <timday@amazon.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49620
Subject: LU-13482 utils: Bandwidth limit for lfs migrate
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a80f033f39f62dfd943912c665ef928604019f13

Comment by Gerrit Updater [ 03/Feb/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49620/
Subject: LU-13482 utils: bandwidth limit for lfs migrate
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 55968bfabe351ad37ee810bf69748ffa56d28037

Comment by Peter Jones [ 03/Feb/23 ]

Landed for 2.16

Generated at Sat Feb 10 03:01:38 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.