Lustre / LU-16680

add lowmem sync feature

Details

    • Improvement
    • Resolution: Unresolved
    • Minor

    Description

      Currently, when the system is low on memory, the kernel asks Lustre to release pages and forces them out of the page cache.  However, when we do a buffered write, we pin the pages until they are committed on the OST.  The kernel has no way to get us to do anything about those pages - they just sit there, taking up memory, until the OST does a commit and the client gets an RPC that updates last_committed.  This can take several seconds, so if the memory limit is low and/or write speed is high, tasks doing IO get OOM-killed for failing to free memory.
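
      To make the scale of the problem concrete, here is a rough back-of-the-envelope sketch. The helper names and any numbers fed to them are illustrative assumptions, not Lustre code or measurements: the idea is only that pinned memory grows roughly as write rate times the time until the next commit.

```c
#include <stdint.h>

/* Hypothetical helper (not a Lustre function): rough upper bound on the
 * bytes a client keeps pinned while it waits for the OST to commit,
 * assuming a steady write rate and no commit during the interval. */
uint64_t est_pinned_bytes(uint64_t write_bytes_per_sec,
                          uint64_t commit_interval_sec)
{
    return write_bytes_per_sec * commit_interval_sec;
}

/* Returns 1 if the estimated pinned memory exceeds the memory limit
 * (for example a cgroup limit), 0 otherwise. */
int exceeds_limit(uint64_t write_bytes_per_sec,
                  uint64_t commit_interval_sec,
                  uint64_t mem_limit_bytes)
{
    return est_pinned_bytes(write_bytes_per_sec, commit_interval_sec) >
           mem_limit_bytes;
}
```

      With an assumed 1 GB/s writer and a 5 second wait for the commit, the estimate is 5 GB of pinned pages, which would already exceed a 4 GB limit.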

      We've tried to solve this in the past by integrating NFS unstable pages tracking into Lustre, but this is fraught - it treats our uncommitted pages as dirty, which means we get rate limited on them.  The kernel's idea of an appropriate number of outstanding pages is based on local file systems, and isn't enough for us, so this causes performance issues.  The SOFT_SYNC feature we created to work with unstable pages also just asks the OST nicely to do a commit, and includes no way for the client to be notified quickly.

      This means it can't be responsive enough to avoid tasks getting OOM-killed.

      This ticket is to track a patch using a simpler approach:
      When we are doing IO and we detect memory pressure, force the client to do a sync RPC.  This does two things.  It pauses client IO while we're in severe memory pressure, because the user process stops accumulating new uncommitted pages while waiting for the sync RPC.  And waiting for that RPC guarantees we will get last_committed updated before we start adding new dirty data.
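
      The approach described here can be sketched as a toy model. This is not the patch itself: every name below is hypothetical, the sync RPC is replaced by a plain function call, and "memory pressure" is stood in for by a simple pinned-page threshold rather than a real signal from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy client state. All fields and names are hypothetical. */
struct sketch_client {
    uint64_t last_written;    /* highest transaction number written */
    uint64_t last_committed;  /* highest transaction known committed on the OST */
    uint64_t pinned_pages;    /* pages written but not yet known to be committed */
    uint64_t pressure_pages;  /* pinned-page count treated as memory pressure */
    uint64_t sync_count;      /* number of forced syncs issued */
};

/* Stand-in for detecting memory pressure during IO. */
bool sketch_under_pressure(const struct sketch_client *c)
{
    return c->pinned_pages >= c->pressure_pages;
}

/* Stand-in for a blocking sync RPC: when it returns, everything written
 * so far is committed, so last_committed catches up and the pinned pages
 * can be released. */
void sketch_force_sync(struct sketch_client *c)
{
    c->last_committed = c->last_written;
    c->pinned_pages = 0;
    c->sync_count++;
}

/* Buffered write path: under pressure, wait for a sync before adding any
 * new dirty data, so the writer cannot keep accumulating pinned pages. */
void sketch_buffered_write(struct sketch_client *c, uint64_t npages)
{
    if (sketch_under_pressure(c))
        sketch_force_sync(c);

    c->last_written++;
    c->pinned_pages += npages;
}
```

      In this model the writer is throttled only when the threshold is hit; below it, writes proceed without any extra RPC, which is the main difference from treating uncommitted pages as dirty and rate limiting on them all the time.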


            People

              paf0186 Patrick Farrell (Inactive)