Details
-
Task
-
Resolution: Fixed
-
Blocker
Description
Many stripe count test
The many stripe count functional test is intended to show that a DNE2 configuration can handle many MDTs in a single filesystem, and that a single directory can be striped over many MDTs. Because this is being tested in a virtual AWS environment, performance will be measured, but neither performance scaling nor load testing is a primary goal of this test. It is rather a functional scaling test of the ability of the filesystem configuration and directory striping code to handle a large number of MDTs.
- Create a filesystem with 128 MDTs, 128 OSTs and at least 128 client mount points (multiple mounts per client)
- Create striped directories with stripe count N in 16, 32, 64, 96, 128:
lfs setdirstripe -c N /mnt/lustre/testN
Note: This command creates a striped directory across N MDTs.
lfs setdirstripe -D -c N /mnt/lustre/testN
Note: This command sets the default stripe count to N. All directories created within this directory will have this default stripe count applied.
- Run mdtest on all client mount points, with each thread creating, stat-ing and unlinking at least 128k files in the striped test directory. Run this test under a striped directory that has default stripes set, so that all of its subdirectories are striped directories as well.
lfs setdirstripe -c N /mnt/lustre/testN
lfs setdirstripe -D -c N /mnt/lustre/testN
- No errors will be observed, and balanced striping of files across MDTs will be observed.
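The directory setup in the steps above could be scripted roughly as follows. This is only a sketch: the `DRY_RUN` switch and the `run` and `make_striped_dirs` helpers are hypothetical names introduced here, and the mount point is assumed to be `/mnt/lustre` as in the commands above. By default it only prints the `lfs setdirstripe` commands rather than executing them.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run and make_striped_dirs are hypothetical helpers,
# not part of Lustre. MOUNT is assumed to match the ticket's /mnt/lustre.
MOUNT=${MOUNT:-/mnt/lustre}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_striped_dirs() {
    for n in 16 32 64 96 128; do
        dir="$MOUNT/test$n"
        # Stripe the directory itself across n MDTs.
        run lfs setdirstripe -c "$n" "$dir"
        # Set the default so that new subdirectories get the same stripe count.
        run lfs setdirstripe -D -c "$n" "$dir"
    done
}

make_striped_dirs
```

Setting `DRY_RUN=0` on a client with the filesystem mounted would run the commands for real; mdtest would then be pointed at each `testN` directory in turn.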
Although this was not intended to be a performance test, I did notice that the stripe allocation policy for striped directories appears to be simplistic. As you can see, it appears to always allocate N sequential targets starting from MDT0. This means usage of MDTs will be very uneven unless all directories are widely striped.
CE is designed to provision targets sequentially on each node, and with the striped directory allocation scheme this results in the initial 16-MDT striped directory using a single MDS rather than all of them. In the interest of saving time, I changed the target allocation scheme specifically for this test so that targets were staggered across the servers, which balanced I/O across all MDS instances for all test runs.
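The effect of the two target layouts can be illustrated with a small model. It assumes 8 MDS nodes hosting 16 MDTs each, which is a hypothetical split (the ticket does not state the number of servers), and it assumes a directory with stripe count N always lands on MDT0 through MDT(N-1), as observed above. `mds_used` is a name made up for this sketch.

```shell
#!/bin/sh
# Hypothetical layout: the number of MDS nodes is an assumption,
# chosen so that 128 MDTs divide into 16 MDTs per server.
MDT_TOTAL=128
MDS_COUNT=8
MDT_PER_MDS=$((MDT_TOTAL / MDS_COUNT))

# Count the distinct MDS nodes serving MDT0..MDT(N-1).
# $1 = stripe count N, $2 = layout: "sequential" or "staggered"
mds_used() {
    n=$1 layout=$2 i=0 seen=""
    while [ "$i" -lt "$n" ]; do
        if [ "$layout" = sequential ]; then
            # Targets provisioned node by node: MDT0-15 on MDS0, and so on.
            mds=$((i / MDT_PER_MDS))
        else
            # Targets staggered round-robin across the servers.
            mds=$((i % MDS_COUNT))
        fi
        # Record each MDS index only once.
        case " $seen " in
            *" $mds "*) ;;
            *) seen="$seen $mds" ;;
        esac
        i=$((i + 1))
    done
    set -- $seen
    echo "$#"
}

for n in 16 32 64 96 128; do
    echo "N=$n sequential=$(mds_used "$n" sequential) staggered=$(mds_used "$n" staggered)"
done
```

Under these assumptions the 16-stripe directory touches one MDS with the sequential layout and all eight with the staggered layout, which matches the imbalance described above.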