Details
Type: Task
Resolution: Fixed
Priority: Blocker
Description
Many stripe count test
The many stripe count functional test is intended to show that a DNE2 configuration can handle many MDTs in a single filesystem, and that a single directory can be striped over many MDTs. Because the test runs in a virtualized AWS environment, performance will be measured, but neither performance scaling nor load testing is a primary goal. Rather, this is a functional scaling test of the ability of the filesystem configuration and directory striping code to handle a large number of MDTs.
- Create a filesystem with 128 MDTs, 128 OSTs and at least 128 client mount points (multiple mounts per client)
- Create striped directories with stripe count N in 16, 32, 64, 96, 128 (see the sketch after this list):
lfs setdirstripe -c N /mnt/lustre/testN
Note: This command creates a striped directory across N MDTs.
lfs setdirstripe -D -c N /mnt/lustre/testN
Note: This command sets the default stripe count to N. All directories created within this directory will have this default stripe count applied.
- Run mdtest on all client mount points; each thread will create/stat/unlink at least 128k files in the striped test directory. Run this test under a striped directory with a default stripe count set, so that all of its subdirectories will themselves be striped directories (see the sketch after this list):
lfs setdirstripe -c N /mnt/lustre/testN
lfs setdirstripe -D -c N /mnt/lustre/testN
- No errors should be observed, and files should be striped evenly across the MDTs.
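A minimal sketch of the directory creation and mdtest steps above, assuming /mnt/lustre is the client mount point; the hostfile name, task count, and per-task file count passed to mdtest are placeholder values, not part of the plan:

for N in 16 32 64 96 128; do
    # create a new test directory striped across N MDTs
    lfs setdirstripe -c $N /mnt/lustre/test$N
    # make N the default stripe count, so subdirectories created
    # under it (including mdtest's working dirs) are striped too
    lfs setdirstripe -D -c $N /mnt/lustre/test$N
    # files-only create/stat/unlink of 128k files per task, one task
    # per client mount point, all in the shared striped directory
    mpirun -np 128 --hostfile clients mdtest -F -n 131072 -d /mnt/lustre/test$N
done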
As I already have some automation built around mdsrate, I'll use that and ensure all threads use a single, shared directory. I also have a patch to mdsrate that adds support for directory striping, though I'll use the lfs commands to make it explicit.
I'll add an `lfs df -i` before and after the create step of each test so we can confirm, at least manually, that the files are balanced.
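For example, one hedged way to capture and compare per-MDT inode usage around the create step (the snapshot paths are arbitrary; IUsed is the third column of `lfs df -i` output):

lfs df -i /mnt/lustre | awk '/MDT/ { print $1, $3 }' > /tmp/iused.before
# ... run the create step here ...
lfs df -i /mnt/lustre | awk '/MDT/ { print $1, $3 }' > /tmp/iused.after
# per-MDT IUsed growth should be roughly equal if creates are balanced
diff /tmp/iused.before /tmp/iused.after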
Do you want the files created with 0-stripes or normally with a single stripe?
BTW, I've been putting our provisioning tools through their paces today, and thought I'd try one client with 128k files. I noticed that large DNE striping has a pretty big impact on directory scanning performance.
I noticed this because "lfs getdirstripe" was taking ~40s to return; for some reason, lfs reads all of the directory entries after printing the striping data. I'll make sure to run it only on empty directories for now.
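For reference, the slowdown can be reproduced with something like the following, where test128 stands in for any populated, widely striped directory:

time lfs getdirstripe /mnt/lustre/test128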