Lustre / LU-15211

lfs migrate metadata performance test plan

Details

    • Type: Question/Request
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0
    • Environment:
      client:
      toss 3.7-14.1
      3.10.0-1160.45.1.1chaos.ch6.x86_64
      lustre 2.12.7_2.llnl

      server:
      toss 4.1-5
      4.18.0-240.22.1.1toss.t4.x86_64
      zfs 2.0.52_2llnl-1
      lustre 2.14.0_5.llnl

    Description

      lfs-migrate Metadata Performance Testing

      While trying to use lfs-migrate for metadata migration, we found that lfs-migrate performance does not scale well with additional processes. Even when using many processes and nodes, sustained performance was around 400 items/second, which is too slow to be practical for migrations of large numbers of files and directories.

      This test plan describes additional tests to determine whether the above results are in fact at, or near, the limit of lfs-migrate's performance.

      Overview

      The performance to be measured is the rate at which items (files and directories) can be migrated. These items will be in a tree (or trees) and migrated by many processes running lfs-migrate in parallel.

      The 3 basic parts of the test are:

      • create the trees
      • migrate the trees
      • analyze the data generated during the migration

      Create the Trees

      A single tree can be created using mdtest. mdtest has the ability to make trees of files and directories, and can parameterize those trees in most of the ways necessary for this test.

      The major shortcoming of mdtest is that it doesn't set the striping and directory striping of the trees it creates. This can be overcome by pre-creating directories, setting their striping and directory striping, and then having mdtest create trees within these directories so that each tree inherits these settings from its respective parent directory.

      The commands used to create the trees need to be saved. This includes the per-directory mdtest command as well as the commands that create the directories and set their striping and directory striping. Also, mdtest will be run with srun, so the whole srun command needs to be saved, because the srun parameters affect the size and shape of the tree.
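
      As a rough illustration of this per-directory setup (a sketch only; the directory names, stripe counts, MDT index, task counts, and mdtest flags below are placeholders to be chosen per test case), the tree creation might look like:

      # Pre-create one parent directory per tree and set its striping and
      # directory striping; trees created inside will inherit these settings.
      NTREES=8
      for i in $(seq 0 $((NTREES - 1))); do
          dir=/mnt/lustre/migtest/tree.$i
          lfs mkdir -c 1 -i 0 "$dir"    # directory striping: 1 stripe on MDT0000 (placeholder)
          lfs setstripe -c 1 "$dir"     # file striping inherited by files created below
      done

      # Populate each directory with an mdtest tree (zero-length files only,
      # create phase only). Save this exact srun command with the run data.
      for i in $(seq 0 $((NTREES - 1))); do
          srun -N 1 --ntasks-per-node=8 \
              mdtest -C -F -n 1024 -u -d /mnt/lustre/migtest/tree.$i
      done

      The -F flag restricts mdtest to files; the directory-only cases would use mdtest's directories-only mode instead.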

      Migrate the Trees

      The migration is done in parallel by many processes, each running lfs-migrate on one of the directories that contains a tree created by mdtest. The many processes are created and spread across multiple client nodes using srun.

      Data needs to be collected during the run. Process 0 will record run-wide data such as total items migrated, and each process will write its own performance data. This will generate one file per process, plus one more file for the run-wide data. Some of the collected data could be inferred from other data (or from the slurm database), but recording it simplifies post-processing. A sketch of a per-process wrapper that records this data follows the lists below.

      Data to Collect Per Run

      • total items migrated
      • total data migrated
      • the mdtest command and the striping/dirstriping commands
      • slurm jobid
      • the srun command that does the migration

      Data to Collect Per Process

      • start time (of lfs-migrate)
      • end time (of lfs-migrate)
      • source MDTs
      • destination MDTs
      • the lfs-migrate command
      • lfs getdirstripe output for the root of the tree the process will migrate
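
      A minimal sketch of a per-process wrapper that records the items above (the paths, data directory, and destination MDT index are placeholders; the source and destination MDTs can be recovered from the saved getdirstripe output and the -m argument):

      #!/bin/bash
      # migrate_one.sh - hypothetical per-process wrapper, launched once per
      # task by srun; SLURM_PROCID selects which tree this process migrates.
      tree=/mnt/lustre/migtest/tree.$SLURM_PROCID
      out=$DATADIR/proc.$SLURM_PROCID
      dest_mdt=1                                              # placeholder destination MDT index

      lfs getdirstripe "$tree" > "$out.getdirstripe"          # dir striping of the tree root
      echo "lfs migrate -m $dest_mdt $tree" > "$out.cmd"      # the exact command being run
      date +%s.%N > "$out.start"                              # start time
      lfs migrate -m "$dest_mdt" "$tree"
      date +%s.%N > "$out.end"                                # end time

      The migration itself would then be launched with something like "srun -N <nodes> --ntasks-per-node=<ppn> ./migrate_one.sh", and that srun command saved with the run-wide data.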

      Potential Parameters to Vary between Runs

      • total number of processes, nodes*ppn
        • the number of processes per node (2,8,16)
        • the number of nodes (1,8,32)
      • the kind of items that are migrated (files,directories)
      • how many items per process are migrated (1K, 8K, 64K configured with mdtest command)
      • file size = 0, fixed
      • DoM or not DoM
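
      For the DoM cases, one possible arrangement (a sketch; the 64 KiB component size is a placeholder) is to put a composite layout on the pre-created parent directories so that files created by mdtest inherit it:

      # Placeholder DoM layout: first 64 KiB of each file on the MDT,
      # remainder (empty here, since the test files are zero-length) striped normally.
      lfs setstripe -E 64K -L mdt -E -1 -c 1 /mnt/lustre/migtest/tree.$i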

      Initial Runs Planned

      Note that the above is still probably a larger parameter space than is necessary to find first-order bottlenecks (3*3*2*3*1*2 == 108 tests). To reduce the number of tests and the expected total run time, only the following tests will be run initially. More complete testing of the parameter space will be performed as needed once developers are engaged.

      1. Find the values of nodes and ppn that maximize overall lfs-migrate rate for files only, 8K per process, without DoM (9 tests)
      2. Using those values for nodes and ppn, test each of the items-per-process values listed above, and record the value of items/process (ipp) that maximizes the overall lfs-migrate rate for files only, without DoM (3 tests)
      3. Using those values for nodes, ppn, and ipp, test with files with DoM and files without DoM (2 tests)

      Data Analysis

      The data recorded for each run will all go into a single directory, along with the tree(s) creation data. A script will read the run metadata and per-process performance data and calculate the rate at which items were migrated. The important input parameters and corresponding results for all runs will be output as a CSV.
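
      A rough sketch of the rate calculation, assuming the hypothetical per-process start/end files written by the wrapper above and a placeholder run-wide file holding the total item count:

      # Overall rate = total items migrated / (latest end time - earliest start time).
      start=$(cat "$DATADIR"/proc.*.start | sort -n | head -1)
      end=$(cat "$DATADIR"/proc.*.end | sort -n | tail -1)
      items=$(cat "$DATADIR"/run.total_items)      # placeholder run-wide metadata file
      rate=$(echo "$items / ($end - $start)" | bc -l)
      # JOBID, NODES, and PPN would come from the saved run metadata.
      echo "jobid,nodes,ppn,items,rate" > results.csv
      echo "$JOBID,$NODES,$PPN,$items,$rate" >> results.csv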

      Performance Comparison

      For comparison, other performance metrics with the same file system and clients will be gathered:

      • mdtest will be run with the same node and ppn combinations and enough objects per process to make each mdtest stage (e.g. create, unlink, etc.) take at least 10 minutes.
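
      For example (a sketch; the -n value is a placeholder to be tuned so that each phase takes at least 10 minutes for the given nodes/ppn combination):

      srun -N 32 --ntasks-per-node=16 \
          mdtest -F -n 65536 -u -d /mnt/lustre/mdtest-compare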

    Attachments

    Issue Links

    Activity

            ofaaland Olaf Faaland added a comment -

            Improved lfs migrate performance would be very useful for us, but we have worked out migration methods that are performant enough based on dsync(1) from mpifileutils. Removing topllnl.


            adilger Andreas Dilger added a comment -

            For directory/inode migration, this is mostly done on the MDS, and is only triggered by the client, because the whole operation has to be handled within a filesystem transaction on the MDT, so having the client involved would not improve things. I think that using 1-level migrations may improve parallelism, but the MDS may also throttle the amount of work that is being done to avoid consuming a large number of MDS service threads, since this can take a long time.

            There are almost certainly improvements to be had in this operation, since it has not been a focus for improvement in the past.


            defazio Gian-Carlo Defazio added a comment -

            Hi Andreas,

            Yes, this ticket is specific for inode/directory migration and uses the "lfs migrate -m" command. I was referring to "lfs migrate" as "lfs-migrate". The shell script for object/data migration has an underscore ("lfs_migrate"), but I see why that could be confusing. This issue came up when we were exploring ways to do a full file system migration to new hardware. It was to be part of a process that involves moving the data from the old to new hardware with "zfs send/receive", which was to be used because our tests showed that it's very fast. However, once the data is on the new hardware there's more to do, and one of those steps involves moving meta/object data around within the new hardware. We initially considered "lfs migrate" for this, but it seemed slow. The other utility we considered is "dsync", but that had an "xattr" issue, and I see that you've reviewed Olaf's patch for that.

            The goal is ultimately for both MDT and OST migrations. The purpose of these migrations is potentially as part of the plan I mentioned above, although I don't think we'll be using "lfs migrate" for the migrations we're doing in the near term, so really it's to see if "lfs migrate" is a viable option in the more distant future. It's also for the hypothetical cases of balancing and evacuating hardware, but I see you've said there are likely better ways to deal with (or prevent) those problems.

            As for how this is being called: trees are being made specifically for the test, and we are intentionally migrating the whole tree, and not expecting to just migrate the files at depth=1 as proposed in "DNE3: directory migration in non-recursive mode". The individual "lfs migrate" calls are on non-overlapping trees. As for your comment "inadvertently doing multiple migrations and hurting performance", we are intentionally doing multiple migrations in the hopes that it will help performance, so it seems we might be confused about what helps vs hurts performance.

            One of the major questions I have about the whole process is how the data moves. Does it use the client nodes as intermediaries, or is the migration mostly happening just between the MDSs? My attempts to increase parallelism have been to use more clients with more processes per client.


            adilger Andreas Dilger added a comment -

            Hi Olaf, Gian-Carlo,
            just to clarify the topic of this ticket, this issue is strictly related to inode/directory migration between MDTs, and not OST object/data migration? The main source of confusion is that "lfs-migrate" is a shell script that is used only for OST object/data migration (using "lfs migrate" internally, or "rsync" when wanting both inode and data migration), while "lfs migrate -m" is the command that drives MDT inode migration.

            Secondly, what is the goal of the MDT migration? Is that for manual MDT space balancing, or is it for replacement of the underlying MDT storage hardware, or some other reason? Definitely, the series of MDT space balancing changes in LU-11213, LU-13440, LU-14792, LU-15216, etc. have significantly reduced the need for manual MDT space management. For MDT storage replacement, IMHO it is likely more efficient to do this at the storage level (e.g. LVM migrate or ZFS resilvering) than at the MDT level, and AFAIK LLNL has done that in the past to migrate MDTs from HDDs to SSDs.

            That isn't to say we shouldn't be looking at improving the migration performance itself, but understanding what the goals are would help shape where optimizations should be done, and also what parameters should be measured during the testing. I also have the feeling that a significant part of the performance limitation that you are seeing may relate to ZFS transaction commit performance, because the migrate process is very transaction intensive in order to ensure it is atomic and recoverable in the face of an MDS crash.

            Assuming we are discussing "lfs migrate -m" performance here, then it is also important to determine how this is being called. Currently, it is only possible to do recursive (whole-tree) directory migration, and this is handled internally on the MDS, so it may be that trying to migrate a directory tree is inadvertently doing multiple migrations and hurting performance? Before we go extensively into testing directory migration performance, we should also look at LU-14975 "DNE3: directory migration in non-recursive mode" to see whether this allows more parallelism during migration.

            pjones Peter Jones added a comment -

            Andreas

            Could you please advise?

            Thanks

            Peter


            ofaaland Olaf Faaland added a comment -

            Peter & Co., we would like your feedback on this test plan. Once we arrive at a test plan you agree with, Gian will perform actual tests, compile the rates, and create a bug type issue to find and fix the bottlenecks. He can help work on the investigation and fixes, but he doesn't have the knowledge to be the main person working the issue.


            People

              Assignee: adilger Andreas Dilger
              Reporter: defazio Gian-Carlo Defazio
              Votes: 0
              Watchers: 10

              Dates

                Created:
                Updated: