parallel VFS directory operation locking on client

Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0
    • 3

    Description

      The lack of parallel operations in a single directory from one client is becoming a limiting factor in some workloads, as the number of CPU cores on clients is growing into the hundreds (e.g. DGX H100 systems have 256 cores today). The IO500 mdtest-hard-write test exercises exactly this code path (multiple threads updating a single shared directory), both on a single client and on multiple clients. This benchmark is sometimes run with multiple mountpoints on a single client in order to work around the kernel VFS locking limitations, but the limitation still affects any multi-threaded workload that operates in a single directory. It would be best (both for the benchmark and for real applications) to fix the VFS locking properly in the kernel.
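
      For reference, the access pattern in question (many threads creating entries in one shared directory on one client) can be sketched as below. This is only a minimal stand-in, not mdtest itself: the function name, thread counts, and file names are made up for illustration, and it runs against a temporary local directory by default. To see the effect described here it would have to be pointed at a directory on a Lustre mountpoint, and Python's own per-call overhead means the absolute numbers are not comparable to mdtest results.

```python
import os
import tempfile
import threading
import time


def create_in_shared_dir(directory, num_threads, files_per_thread):
    """Create files from many threads in ONE directory.

    Returns (number_of_files_created, elapsed_seconds).
    """
    created = [0] * num_threads

    def worker(idx):
        for i in range(files_per_thread):
            path = os.path.join(directory, f"f.{idx}.{i}")
            # O_EXCL forces a real create; each create has to take the
            # parent directory's lock in the client VFS.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            os.close(fd)
            created[idx] += 1

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return sum(created), elapsed


if __name__ == "__main__":
    # Hypothetical driver: replace the temporary directory with a path on
    # the filesystem under test.
    with tempfile.TemporaryDirectory() as top:
        for n in (1, 4, 16):
            run_dir = os.path.join(top, f"run{n}")
            os.mkdir(run_dir)
            count, secs = create_in_shared_dir(run_dir, n, 200)
            print(f"{n:3d} threads: {count} creates in {secs:.3f}s")
```

      If the create rate stays flat as the thread count rises on a single mountpoint, that would be consistent with the serialization described above.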

      There was an RFC patch from neilb a couple of years ago (VFS: support parallel updates in the one directory) that added a prototype VFS parallel directory lock for NFS clients. After minor revisions, that patch showed significant performance improvements (400x) purely from the concurrency of requests over a high-latency network, even though the NFS server itself could not do parallel operations. Unfortunately, the patch never landed after the initial positive RFC.

      The MDS and ldiskfs can already handle parallel locking on a single directory from multiple different clients, but the kernel VFS directory locking does not currently allow more than one thread at a time to modify the directory contents in any way (create, unlink, rename). Removing this restriction should provide fairly substantial speedups to multi-threaded workloads on a single directory (up to the concurrency limit of the threads on the client and MDS, which could potentially be up to 100x faster).
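
      A toy model of why the speedup scales with thread count: if each directory update holds one exclusive per-directory lock across its network round trip, N threads take roughly N times as long as when the round trips can overlap. The sketch below is a userspace simulation only, under that single assumption. `dir_lock`, `rpc_latency`, and the function name are hypothetical, `time.sleep` stands in for the round trip to the server, and nothing here reflects how the kernel or the proposed patch actually implements locking.

```python
import threading
import time


def run_ops(num_threads, ops_per_thread, rpc_latency, hold_lock_across_rpc):
    """Return wall-clock seconds for all threads to finish their simulated ops."""
    # Stand-in for the exclusive lock on the parent directory.
    dir_lock = threading.Lock()

    def one_op():
        if hold_lock_across_rpc:
            # Serialized: the whole round trip happens under the lock.
            with dir_lock:
                time.sleep(rpc_latency)
        else:
            # Overlapped: the lock covers only brief local bookkeeping,
            # so round trips from different threads run concurrently.
            with dir_lock:
                pass
            time.sleep(rpc_latency)

    def worker():
        for _ in range(ops_per_thread):
            one_op()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    serial = run_ops(8, 5, 0.01, hold_lock_across_rpc=True)
    overlapped = run_ops(8, 5, 0.01, hold_lock_across_rpc=False)
    print(f"serialized: {serial:.3f}s  overlapped: {overlapped:.3f}s  "
          f"ratio: {serial / overlapped:.1f}x")
```

      In this model the ratio approaches the thread count as latency grows relative to local overhead; on a real system it would also be capped by how many requests the server can process concurrently.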

      It would be very useful to update Neil's patch for the latest kernels and then make the (hopefully minor) changes to the lustre/llite code under "#ifdef DCACHE_PAR_UPDATE" that would be needed to work with a kernel with this change, whether that is from patching the client kernel (which we haven't done in a long time) or because the patch is landed upstream.

          Activity

            [LU-17776] parallel VFS directory operation locking on client

            adilger Andreas Dilger added a comment -

            Status update: a more recent patch series by neilb was submitted to the linux-fsdevel mailing list:
            [PATCH 00/19 v7?] RFC: Allow concurrent and async changes in a directory

            This is making some forward progress with patches to clean up corners of the VFS that have anomalies in locking and other dcache behavior, but the core VFS parts of the patch series that change the locking are still a ways from landing, AFAICS.

            It would be interesting to see what kind of single-client performance improvement is available with this patch series and a multi-MDT Lustre filesystem. It is common to have multiple Lustre mountpoints on a single client (e.g. one per container) to avoid VFS locking bottlenecks for many threads doing concurrent operations in a single directory.

            People

              wc-triage WC Triage
              adilger Andreas Dilger
              Votes: 0
              Watchers: 8
