Lustre / LU-13643

FLR3: Immediate file write mirroring

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      FLR currently implements delayed file write mirroring, where the initial write is done to a single mirror, and an external tool eventually synchronizes the data to the other mirror(s) of the file. If the file is later modified, the mirror component(s) other than the one modified are marked stale and need to be resynchronized by the external tool. This mechanism saves client bandwidth and still provides data availability, unless the file is lost immediately after it is written, before the other mirrors have been synchronized.
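
      The delayed-mirroring state changes described above can be sketched in a few lines of C. This is only an illustrative model, not Lustre code: `struct mirror_state`, `delayed_write()` and `resync()` are hypothetical names, and the real layout tracks staleness per component rather than as one boolean per mirror.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mirror state; not the real Lustre layout structures. */
struct mirror_state {
    bool stale; /* true if this mirror no longer matches the file data */
};

/* Delayed mirroring: the write lands on one up-to-date mirror and
 * every other mirror is marked stale. */
static void delayed_write(struct mirror_state *mirrors, size_t count,
                          size_t target)
{
    assert(target < count);
    assert(!mirrors[target].stale);

    for (size_t i = 0; i < count; i++) {
        if (i != target)
            mirrors[i].stale = true;
    }
}

/* Stand-in for the external resync tool: clears the stale state and
 * returns how many mirrors had to be rewritten. */
static size_t resync(struct mirror_state *mirrors, size_t count)
{
    size_t rewritten = 0;

    for (size_t i = 0; i < count; i++) {
        if (mirrors[i].stale) {
            mirrors[i].stale = false;
            rewritten++;
        }
    }
    return rewritten;
}
```

      In this model a single write to a three-way mirrored file leaves two mirrors to be rewritten at the next resync, however small the write was.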

      However, if the file is modified afterward, resyncing the mirrors of the modified component may cause a large amount of write amplification, or the stale mirrors may never be resynced if the file is continuously being modified. It would be preferable to implement immediate file write mirroring, so that the client can submit the same page in multiple RPCs to different OST objects and keep the mirrors updated concurrently.
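
      As a rough userspace analogy for submitting the same page to several OST objects, the sketch below writes one buffer at the same offset to several file descriptors. It assumes ordinary POSIX files stand in for OST objects and uses a sequential loop where a real client would issue the RPCs in parallel; `mirror_pwrite()` is a hypothetical helper, not a Lustre function.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write the same buffer at the same offset to every descriptor in fds[].
 * Returns true only if every target accepted the full buffer. A target
 * that fails is skipped so the remaining targets are still attempted.
 * Interrupted writes (EINTR) are treated as failures in this sketch.
 */
static bool mirror_pwrite(const int *fds, size_t nfds,
                          const void *buf, size_t len, off_t off)
{
    bool all_ok = true;

    assert(nfds > 0);

    for (size_t i = 0; i < nfds; i++) {
        size_t done = 0;

        while (done < len) {
            ssize_t rc = pwrite(fds[i], (const char *)buf + done,
                                len - done, off + (off_t)done);
            if (rc <= 0) {
                all_ok = false;
                break;
            }
            done += (size_t)rc;
        }
    }
    return all_ok;
}
```

      A `false` return here corresponds to the case where the client could not keep every copy in sync, so the copies that missed the write would have to be treated as stale.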

      Immediate file write (IFW) mirroring may not be desired for all applications, so it should have an LCME_FL_IMMEDIATE flag stored in the component(s) indicating that clients should keep both copies up to date. Having per-component flags will allow configurations where e.g. two flash mirrors are written immediately, but a third/fourth disk mirror uses delayed resync for emergency recovery and/or cold storage (e.g. if the flash mirrors are only short-term hot copies of the file).
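
      One way to picture the per-component flag is the sketch below, which models the "two flash mirrors plus one disk mirror" configuration. The bit values are placeholders chosen for illustration (the ticket does not assign a value to LCME_FL_IMMEDIATE), and `struct mirror` and `ifw_apply_write()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define LCME_FL_STALE     0x00000001u
#define LCME_FL_IMMEDIATE 0x00010000u /* proposed flag, value assumed */

/* Hypothetical per-mirror flag word. */
struct mirror {
    uint32_t flags;
};

/*
 * Apply one client write in IFW mode. Mirrors flagged immediate and
 * not already stale receive the write; mirrors without the flag are
 * marked stale and left for delayed resync. Returns the number of
 * mirrors the client has to write.
 */
static size_t ifw_apply_write(struct mirror *mirrors, size_t count)
{
    size_t targets = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t f = mirrors[i].flags;

        if (!(f & LCME_FL_STALE) && (f & LCME_FL_IMMEDIATE))
            targets++;
    }
    /* At least one up-to-date immediate mirror must take the write. */
    assert(targets > 0);

    for (size_t i = 0; i < count; i++) {
        if (!(mirrors[i].flags & LCME_FL_IMMEDIATE))
            mirrors[i].flags |= LCME_FL_STALE;
    }
    return targets;
}
```

      With two immediate mirrors and one delayed mirror, a write goes to both immediate mirrors and only the delayed mirror becomes stale.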

      The IFW client will need to notify the MDS whether it can keep the mirrors in sync; otherwise it needs to maintain the current behavior of marking all but one mirror LCME_FL_STALE when the client writes to the file. At the coarsest grain, this means the client needs an OBD_CONNECT2_IMMEDIATE connection-time flag, but it may be able to indicate its intent on a per-file level (e.g. with the initial write intent) so that it can do this on a case-by-case basis (e.g. remote clients should probably not do double writes).
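
      The two levels of negotiation described above (connection-wide capability, then per-file intent) could combine roughly as follows. This is a sketch under assumptions: the bit chosen for OBD_CONNECT2_IMMEDIATE is a placeholder, and `struct write_request` and `choose_write_mode()` are hypothetical names rather than existing Lustre interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit; the ticket does not assign a value. */
#define OBD_CONNECT2_IMMEDIATE (UINT64_C(1) << 40)

enum write_mode {
    WRITE_DELAYED,   /* mark all but one mirror LCME_FL_STALE */
    WRITE_IMMEDIATE  /* client keeps the immediate mirrors in sync */
};

/* Hypothetical summary of what is known when a write intent arrives. */
struct write_request {
    uint64_t connect_flags2;       /* flags negotiated at connect time */
    bool     intent_immediate;     /* per-file intent from the client */
    bool     layout_has_immediate; /* some component is flagged immediate */
};

static enum write_mode choose_write_mode(const struct write_request *req)
{
    assert(req != NULL);

    /* Clients that never advertised the capability keep today's behavior. */
    if (!(req->connect_flags2 & OBD_CONNECT2_IMMEDIATE))
        return WRITE_DELAYED;

    /* Nothing to keep in sync if no component asks for it. */
    if (!req->layout_has_immediate)
        return WRITE_DELAYED;

    /* A capable client may still opt out per file, e.g. a remote client. */
    return req->intent_immediate ? WRITE_IMMEDIATE : WRITE_DELAYED;
}
```

      Falling back to `WRITE_DELAYED` in every uncertain case keeps the existing stale-marking behavior as the default.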

    People

    • Assignee: WC Triage (wc-triage)
    • Reporter: Andreas Dilger (adilger)