  1. Lustre
  2. LU-17503

IO500: improve NRS TBF to sort requests by object offset for ior-hard-write

Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.14.0, Lustre 2.17.0
    • 3

    Description

      In the IO500 benchmark, the "ior-hard-write" phase simulates many threads writing to a single large file (e.g. writing out regions of a very large array from memory). The phase runs under a stonewall timer. Once the timer expires, all threads must continue to write until each has written as much data as the thread that reached the farthest write offset.
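
      The catch-up rule can be illustrated with a little arithmetic. This is a minimal sketch, assuming progress is tracked as a simple per-thread count; the function name and the numbers are hypothetical and are not taken from IOR or IO500 output.

```python
def stonewall_catch_up(written):
    """Return how much each thread must still write after the stonewall.

    `written` holds the amount each thread had written when the timer
    expired (hypothetical units). Every thread has to reach the amount
    written by the farthest thread.
    """
    target = max(written)
    return [target - w for w in written]


# Hypothetical "early mover" case: one thread is far ahead of the rest.
skewed = stonewall_catch_up([400, 120, 100, 90])

# Hypothetical lockstep case: all threads are close together.
lockstep = stonewall_catch_up([180, 178, 177, 175])

print("skewed tail work:  ", sum(skewed))
print("lockstep tail work:", sum(lockstep))
```

      The more uneven the progress at the stonewall, the more catch-up writing remains, which is the "long tail" this ticket aims to shorten.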

      In the current implementation, some "early mover" jobs have a large advantage in writing to the file: they are granted DLM locks for non-conflicting regions of the file and get far ahead of other writers that must contend for the DLM locks. This makes the "ior-hard-write" phase take a long time, because of a "long tail" in which threads must "fill in" the large gaps in the file. If the NRS TBF request handler sorted RPCs by file offset (in addition to arrival time) and prioritized writes with smaller offsets over writes with larger offsets, it would slow down the faster writers and speed up the slower ones until they are in lockstep. Processing the writes sequentially also helps with server cache management and with merging IO requests for submission to the underlying filesystem, so it should improve aggregate performance even though some threads are deliberately slowed down.
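
      The proposed ordering can be sketched as a priority queue keyed on file offset first and arrival order second. This is only an illustration under stated assumptions: it is not the Lustre NRS TBF code, it ignores token-bucket rate limiting, and it covers a single object. `WriteRpc` and `OffsetOrderedQueue` are hypothetical names.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class WriteRpc:
    # Comparison uses (offset, seq): smaller offsets are served first,
    # and arrival order breaks ties between equal offsets.
    offset: int
    seq: int
    client: str = field(compare=False)


class OffsetOrderedQueue:
    """Hypothetical per-object queue that serves the lowest offset first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def enqueue(self, client, offset):
        heapq.heappush(self._heap, WriteRpc(offset, next(self._arrival), client))

    def dequeue(self):
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)


queue = OffsetOrderedQueue()
# A fast writer that is far ahead in the file arrives first...
for off in (100, 101, 102):
    queue.enqueue("fast", off)
# ...followed by a slow writer still filling in low offsets.
for off in (3, 4):
    queue.enqueue("slow", off)

served = []
while len(queue):
    rpc = queue.dequeue()
    served.append((rpc.client, rpc.offset))
print(served)
```

      In this sketch the slow writer's low-offset requests are served before the fast writer's, even though they arrived later, which is the behaviour the description asks for. A real policy would also need to bound how long high-offset requests can be deferred.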

      The NRS ORR policy already exists to order requests within an object, but a single NRS TBF policy is preferred: ORR is missing much of the functionality of TBF, and layering two tiers of request sorting is unlikely to produce an optimal result.

            People

              wc-triage WC Triage
              adilger Andreas Dilger