Lustre / LU-9846

Overstriping - more than one stripe per OST per component



    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: Lustre 2.13.0


      'Overstriping' is my term for allowing more than one stripe of a particular file component to be placed on a particular OST.

      Lock ahead is designed to address the case where we are limited to a single shared file, but each OST is significantly faster than one client. In a shared-file situation, LDLM locking behavior limits us to writing with one client to each OST, so we are unable to fully drive each OST for the shared file. This is an interesting enough case for Cray to drive all the work in lock ahead and use a library to achieve it.

      If we can put multiple stripes of the file on a single OST, we can achieve essentially the same thing with far less effort. For a variety of reasons, this doesn't remove the need for lock ahead (primarily because we cannot necessarily redefine a file's striping when we want to write to it), but it is much simpler, and highly desirable for that reason. Beyond the MPIIO aggregation case, where we have well-controlled I/O and are trying to maximize OST utilization, adding more stripes to a shared file also helps when I/O is poorly controlled, because there are effectively more locks for the badly behaved writers to contend for.
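
      As a rough illustration of the argument above, here is a toy bandwidth model. It is a sketch only: the function name, the bandwidth figures, and the assumption of exactly one effective writer per stripe object are hypothetical simplifications, not measurements or Lustre internals.

```python
def shared_file_bandwidth(num_osts, stripes_per_ost, client_bw, ost_bw):
    """Toy model of aggregate write bandwidth for one shared file.

    Assumes one effective writer per stripe object (the LDLM limit
    described above) and that each OST saturates at ost_bw.
    """
    per_ost = min(ost_bw, stripes_per_ost * client_bw)
    return num_osts * per_ost


# Hypothetical figures in GB/s: each OST is 4x faster than one client.
one_per_ost = shared_file_bandwidth(
    num_osts=2, stripes_per_ost=1, client_bw=1.0, ost_bw=4.0)
four_per_ost = shared_file_bandwidth(
    num_osts=2, stripes_per_ost=4, client_bw=1.0, ost_bw=4.0)

print(one_per_ost, four_per_ost)
```

      Under these assumed numbers, one stripe per OST leaves each OST driven by a single client, while four stripes per OST lets four clients write to it at once.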

      So, in short, I think it would be very, very desirable if, in a controlled manner, we could ask for more than one stripe to be on a given OST. A simple example is something like "8 stripes but only on these 2 OSTs", giving 4 stripes per OST (and allowing 4 client writers per OST with no fancy locking work).
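
      The "8 stripes but only on these 2 OSTs" example can be sketched as follows. This is a minimal model, not the patch: the helper names are hypothetical, and round-robin placement over the allowed OSTs is an assumption made for illustration rather than a description of Lustre's allocator. The offset-to-stripe mapping is the usual RAID-0 style striping.

```python
from collections import Counter


def overstriped_layout(stripe_count, ost_indices):
    """Assign stripe i to an OST, cycling over the allowed OSTs.

    Round-robin is an illustrative assumption, not Lustre's allocator.
    """
    if stripe_count < 1 or not ost_indices:
        raise ValueError("need at least one stripe and one OST")
    return [ost_indices[i % len(ost_indices)] for i in range(stripe_count)]


def stripe_for_offset(offset, stripe_size, stripe_count):
    """RAID-0 style mapping from a file offset to a stripe index."""
    return (offset // stripe_size) % stripe_count


MIB = 1024 * 1024

layout = overstriped_layout(stripe_count=8, ost_indices=[0, 1])
per_ost = Counter(layout)  # number of stripes placed on each OST

# Which stripe, and therefore which OST, holds the byte at offset 9 MiB
# when the stripe size is 1 MiB?
stripe = stripe_for_offset(9 * MIB, stripe_size=MIB, stripe_count=8)
ost = layout[stripe]

print(layout, dict(per_ost), stripe, ost)
```

      Each of the 8 stripes is a separate object, so in this model 4 writers can target each of the 2 OSTs without contending for the same object.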

      A note about PFL:
      While PFL lets you create separate components using the same OST, that's not a viable solution for the case I'm looking at, which is simply > 1 stripe on a given OST. The idea is to write to all stripes in parallel, with > 1 writer pointed at each OST. PFL would require a never-ending series of components to do that.

      Patch to implement this follows.





              pfarrell Patrick Farrell (Inactive)
              paf Patrick Farrell (Inactive)