Details

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major

    Description

      Metadata writeback cache is a highly desirable feature that should allow Lustre to overcome link latency for metadata-modifying operations, which is especially noticeable in interactive workloads.

      Examples include unpacking large archives and building software on a single node with no contention from other nodes.

      Another highly visible set of use cases is "file/directory-per-process" benchmarks such as mdtest.

        Issue Links

          1. WBC: Add cache aging to flush dirty cache in the background (Technical task, Open, Qian Yingjin)
          2. WBC: memory limits for caching (Technical task, Open, Qian Yingjin)
          3. WBC: fsync() support (Technical task, Open, Qian Yingjin)
          4. WBC: Reopen the file when WBC EX lock revoking (Technical task, Open, WC Triage)
          5. WBC2: Integrate PCC with Metadata Writeback Caching (Technical task, Open, Qian Yingjin)
          6. WBC: Rule based auto WBC (Technical task, Open, Qian Yingjin)
          7. WBC: lfs wbc and lctl wbc utils (Technical task, Open, Qian Yingjin)
          8. WBC1.5: Cached data reintegration after the client was evicted (Technical task, Open, WC Triage)
          9. WBC: different flush modes (Technical task, In Progress, Qian Yingjin)
          10. WBC3: Disconnected operation support (Technical task, Open, Qian Yingjin)
          11. WBC3: remove the whole subtree on MDT already deleted in the client WBC cache (Technical task, Open, WC Triage)
          12. WBC2: batch metadata update (Technical task, Open, Qian Yingjin)
          13. WBC2: lockless IO (Technical task, Open, Qian Yingjin)
          14. WBC: Basic framework for WBC (Technical task, In Progress, Qian Yingjin)
          15. WBC: Add symlink support (Technical task, Open, Qian Yingjin)
          16. WBC: Add hardlink support (Technical task, Open, Qian Yingjin)
          17. WBC: special readdir() handling for root WBC directory (Technical task, Open, Qian Yingjin)
          18. WBC: Reclaim mechanism for cached inodes and pages under limits in MemFS (Technical task, Open, Qian Yingjin)
          19. WBC: inode and space grant mechanism under WBC (Technical task, Open, WC Triage)
          20. WBC: implement mkdir() by using intent lock (Technical task, Open, Qian Yingjin)
          21. WBC: better DNE support (Technical task, Open, Qian Yingjin)
          22. WBC: Stripe directory not flushed to MDT when it grows large enough (Technical task, Open, Qian Yingjin)
          23. WBC: MemFS lookup fallback to Lustre lookup (Technical task, Open, Qian Yingjin)
          24. WBC: trigger flush on background when exceed the dirty inode threshold (Technical task, Open, Qian Yingjin)
          25. WBC: Handle WB_SYNC_NONE properly (Technical task, Open, Qian Yingjin)
          26. WBC: endless loop in balance_dirty_pages (Technical task, Open, Qian Yingjin)
          27. WBC2: it should not cause rename() failed when WBC is disabled (Technical task, Open, Qian Yingjin)
          28. WBC: Write hang with O_DSYNC file flag (Technical task, Open, Qian Yingjin)
          29. WBC: deadlock for buffered write in parallel when test with dcp/ior (Technical task, Open, Qian Yingjin)
          30. WBC: Recovery mechanism for batched RPC (Technical task, Open, WC Triage)
          31. WBC: replay recovery and layout instantiation (Technical task, Open, WC Triage)
          32. WBC: rename() support for WBC (Technical task, Open, Qian Yingjin)
          33. WBC: uncache a given file or directory from WBC (Technical task, Open, Qian Yingjin)
          34. WBC: Handling writeback errors and retry for recoverable errors (Technical task, Open, WC Triage)
          35. WBC: discard the cached subtrees under WBC when the client is evicted (Technical task, Open, Qian Yingjin)

          Activity

            [LU-10938] Metadata writeback cache support

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/37104
            Subject: LU-10938 wbc: add symlink support
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 96f7534209e0c7c81b891ea0cadcdb9e9e746237

            gerrit Gerrit Updater added the comment above.

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/36851
            Subject: LU-10938 wbc: Basic test scripts for WBC
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 97f122f36e995e251defe8f53a511d614e747ad2

            gerrit Gerrit Updater added the comment above.

            I would strongly prefer not merging them.  Having them broken out like this is really nice.

            pfarrell Patrick Farrell (Inactive) added the comment above.

            I will try to rebase all WBC patches against master branch.
            Should I merge the series of WBC patches into a big one patch?

            qian_wc Qian Yingjin added the comment above.

            After reading the code, I found that the current WBC still needs to solve the following file-reopen problem:

            1. A file is opened while WBC is valid.
            2. The EX layout is revoked and the file is flushed.
            3. At this point, the file still needs to be reopened on the MDS.
            qian_wc Qian Yingjin added the comment above.
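
The three-step reopen problem described in the comment above can be illustrated with a small model. This is only a sketch under stated assumptions: `FakeMDS`, `WBCFile`, and `on_ex_revoke` are hypothetical names invented for illustration and do not correspond to Lustre code, and replaying the cached opens after the flush is one possible approach, not necessarily the one the WBC patches take.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMDS:
    """Hypothetical stand-in for the metadata server; records what it is told."""
    files: set = field(default_factory=set)
    open_handles: dict = field(default_factory=dict)

    def create(self, path):
        self.files.add(path)

    def open(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        self.open_handles[path] = self.open_handles.get(path, 0) + 1


@dataclass
class WBCFile:
    """Hypothetical file created under WBC; client-only until it is flushed."""
    path: str
    flushed: bool = False
    local_opens: int = 0  # opens that the MDS has never seen

    def open(self, mds):
        if self.flushed:
            mds.open(self.path)    # normal path: the MDS tracks the open
        else:
            self.local_opens += 1  # step 1: open served from the client cache

    def on_ex_revoke(self, mds):
        # Step 2: the exclusive state is revoked, so the file is flushed.
        mds.create(self.path)
        self.flushed = True
        # Step 3: replay each cached open so the MDS learns about it.
        for _ in range(self.local_opens):
            mds.open(self.path)
        self.local_opens = 0
```

Without step 3 in this model, the MDS would hold a flushed file with no record of the handle the application still has open, which is the gap the comment points out.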

            Status of the prototype, per ISC

            • No cache/quota/space limits, no background flushing, no batching, ...
            • Early single-client tests (untar, make, …) show a 10-20x speedup
            nrutman Nathan Rutman added the comment above.

            The timestamps belong to the target that has the most recent ctime, but in any case since this optimization would be useful mostly for DoM files, the MDS would own all of the attributes including the size, as long as the later components were not initialized. Even if the later components were initialized, setting the timestamps, UID, GID, permission, etc. could be cached on the client as long as it has the DoM write lock on the MDT inode, since no other clients could modify/access these attributes.

            adilger Andreas Dilger added the comment above.
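
The rule stated in the comment above, that the timestamps belong to the target with the most recent ctime, can be sketched as a simple selection. `TargetAttrs` and `pick_timestamps` are hypothetical names used only for illustration, and the sketch ignores everything except the three timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetAttrs:
    """Hypothetical per-target view of a file's timestamps (seconds)."""
    name: str
    atime: int
    mtime: int
    ctime: int


def pick_timestamps(targets):
    """Return the attributes of the target with the most recent ctime."""
    if not targets:
        raise ValueError("need at least one target")
    return max(targets, key=lambda t: t.ctime)
```

For a DoM file whose later components are not initialized there is only one target in the list, so in this model the MDT's values are always the ones chosen.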

            I should perhaps know the answer to this question, but which Lustre attributes are exclusively on the MDS? Permissions are, xattrs are, and size is not (even in DoM, it's pulled from the data object - it's just that the object is on the MDS), but what about the various time properties?

            The issue that will come up, I think, is which bits the client has a write lock on.

            pfarrell Patrick Farrell (Inactive) added the comment above.

            One thought I had was whether it is possible to cache/merge setattr-like requests like utimensat(), fchmod(), fchown(), fchgrp(), etc. on the client when creating/writing DoM regular files on the client? The client should already have a write lock on the file for DoM write, and the MDS shouldn't care whether the time, mode, permission, and maybe UID/GID change before the file is written to the MDS, because these attributes should all be sent with the write request.

            adilger Andreas Dilger added the comment above.
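
The idea in the comment above, merging setattr-like requests on the client and sending them with the write, might look roughly like the following. This is a minimal sketch, assuming the client already holds the write lock; `PendingAttrs`, `flush_with_write`, and the `send` callback are hypothetical and are not Lustre interfaces.

```python
class PendingAttrs:
    """Hypothetical client-side buffer of attribute changes for one file."""

    def __init__(self):
        self._pending = {}
        self.rpcs_sent = 0

    def setattr(self, **changes):
        # A later change to the same attribute overwrites the earlier one,
        # so only the final value ever reaches the server.
        self._pending.update(changes)

    def flush_with_write(self, send):
        """Hand all merged attributes to `send` in a single request."""
        if not self._pending:
            return
        merged, self._pending = self._pending, {}
        send(merged)
        self.rpcs_sent += 1


# Usage: three attribute updates collapse into one request.
sent = []
attrs = PendingAttrs()
attrs.setattr(mode=0o600)
attrs.setattr(mtime=100, uid=1000)
attrs.setattr(mode=0o644)
attrs.flush_with_write(sent.append)
```

In this model the intermediate mode `0o600` is never sent, which is safe only because no other client can observe the attributes while the write lock is held.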

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32241/
            Subject: LU-10938 ptlrpc: Add WBC connect flag
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f024aabf8bbf797e101eec5ff054f7a6e3d253ed

            gerrit Gerrit Updater added the comment above.

            People

              qian_wc Qian Yingjin
              green Oleg Drokin
              Votes: 0
              Watchers: 22
