During the dedicated system test, we can run several full-system and subsystem
tests to help identify metadata performance issues in the system as a whole or
in any particular subsystem.

For full-system performance, two tests are likely to be useful:
1) A scale-up mdtest where we run X threads on Y clients, incrementing X and Y
to generate a performance curve
2) Many separate (not MPI-coordinated) mdtests run concurrently on the system,
to see whether combinations of different metadata operations cause more
contention on one FS

#1 is the more important test. #2, due to its random nature, could be
difficult to analyze.
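
The scale-up sweep in #1 could be generated with something like the sketch
below. It is only an illustration under several assumptions: X is read as
threads per client, both X and Y are doubled at each step (any other increment
would work), and the helper names, the test directory, and the item and
iteration counts are all hypothetical. Only the common "mpirun -np" form and
the mdtest -n, -i, and -d options are used; placing tasks across clients
(hostfile, tasks per node) depends on the MPI launcher and is left out.

```python
def doublings(limit):
    """Yield 1, 2, 4, ... up to and including limit."""
    n = 1
    while n <= limit:
        yield n
        n *= 2


def sweep_points(max_clients, max_threads_per_client):
    """Yield (clients, threads_per_client) pairs for the scale-up curve."""
    threads = list(doublings(max_threads_per_client))
    for clients in doublings(max_clients):
        for t in threads:
            yield clients, t


def mdtest_command(clients, threads, test_dir, items_per_task=1000, iterations=3):
    """Build one mdtest command line as an argument list (not executed here).

    items_per_task and iterations are placeholder values, not recommendations.
    """
    total_tasks = clients * threads
    return [
        "mpirun", "-np", str(total_tasks),
        "mdtest",
        "-n", str(items_per_task),  # items created per task
        "-i", str(iterations),      # repeat each run to smooth out noise
        "-d", test_dir,             # hypothetical directory on the Lustre FS
    ]


if __name__ == "__main__":
    for clients, threads in sweep_points(max_clients=8, max_threads_per_client=4):
        print(clients, threads, " ".join(mdtest_command(clients, threads, "/mnt/lustre/mdtest")))
```

Printing the commands rather than running them keeps the sweep easy to review
before the dedicated test window; each line corresponds to one point on the
performance curve.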

For the subsystem tests, three tests should identify where along the
network -> lock manager -> ldiskfs -> disk chain the delays are coming from.
1) network -> lnet_selftest between client and MDT, both a bandwidth and
message rate test
2) locking/ldiskfs -> run mdtest directly on MDT ldiskfs filesystem to remove
the Lustre layer
3) ldiskfs/disk -> create a new logical volume on the volume group and run xdd
iop tests.

I think #2 is probably where to start and then either move up or down the
chain, depending on results.

For stats collection, we should capture debug logs from the client and MDT
during a single-client mdtest. These can be analyzed later if necessary.
During the mdtest runs, we can take timestamped snapshots of the MDT stats
proc files, as well as the disk and LVM stats. If a run turns out to be
particularly interesting, we can then look through its snapshots for
patterns. Capturing CPU load and memory usage would also be good.
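
The timestamped snapshots described above might be collected with a small
script along these lines. This is a sketch, not a finished tool: the function
name and output layout are hypothetical, and the MDT stats path is an
assumption that varies between Lustre versions, so it should be checked on the
MDS before use. /proc/diskstats, /proc/loadavg, and /proc/meminfo are standard
Linux files; LVM logical volumes appear in /proc/diskstats as dm-N devices.

```python
import glob
import os
import time

# Glob patterns for the files to copy at each snapshot.
DEFAULT_PATTERNS = [
    "/proc/fs/lustre/mdt/*/md_stats",  # ASSUMED MDT stats location; verify on the MDS
    "/proc/diskstats",                 # disk and LVM (dm-N) counters
    "/proc/loadavg",                   # CPU load
    "/proc/meminfo",                   # memory usage
]


def take_snapshot(out_root, patterns=DEFAULT_PATTERNS, now=None):
    """Copy every file matching patterns into out_root/<timestamp>/.

    Returns (snapshot_dir, list_of_saved_paths). Files that cannot be read
    are skipped so one missing stats file does not abort the snapshot.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S", time.localtime(now))
    snap_dir = os.path.join(out_root, stamp)
    os.makedirs(snap_dir, exist_ok=True)
    saved = []
    for pattern in patterns:
        for src in sorted(glob.glob(pattern)):
            try:
                with open(src, "r") as f:
                    data = f.read()
            except OSError:
                continue
            # Flatten the source path into a single file name.
            name = src.strip("/").replace("/", "_")
            dst = os.path.join(snap_dir, name)
            with open(dst, "w") as f:
                f.write(data)
            saved.append(dst)
    return snap_dir, saved


if __name__ == "__main__":
    # Hypothetical usage: one snapshot every 10 seconds until interrupted.
    while True:
        take_snapshot("/tmp/mdtest_snapshots")
        time.sleep(10)
```

Since most of these files hold cumulative counters, the differences between
consecutive snapshots are what matter for a given run, which is why each
snapshot keeps its own timestamped directory.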