Lustre / LU-18457

Flashback for Lustre

Details

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Minor

    Description

      Combining the recycle bin feature with an extended Lustre changelog makes it possible to implement a flashback feature for Lustre, similar to Flashback in the Oracle database. With the flashback feature, a user can rewind the metadata of the whole file system to a target time, SCN, or restore point.

      Flashback undoes changes made by users. It can fix logical failures, but not physical failures. As a result, a user cannot use the flashback command to recover from disk failures, but can, in combination with the recycle bin feature, recover from the accidental deletion of data files or directories.

      The MDT undoes the changes by walking the changelog in reverse order. Some examples of how metadata operations are undone:

      CL_MKDIR lname => rmdir lname

      CL_CREATE lname => unlink lname

      CL_UNLINK lname => last unlink? create lname : link lname

      CL_RMDIR lname => mkdir lname

      ...
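      The undo rules above can be sketched as a small lookup over changelog records. This is only an illustration, not Lustre code: the Record class, its last_link field, and flashback_plan() are hypothetical names, and only the four record types listed above are handled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified changelog record (not the on-disk format)."""
    index: int               # position in the changelog, increasing over time
    op: str                  # changelog type, e.g. "CL_MKDIR"
    lname: str               # name the operation acted on
    last_link: bool = False  # for CL_UNLINK: True if the last link was removed

def undo_action(rec: Record) -> str:
    """Return the inverse operation for one changelog record."""
    if rec.op == "CL_MKDIR":
        return f"rmdir {rec.lname}"
    if rec.op == "CL_CREATE":
        return f"unlink {rec.lname}"
    if rec.op == "CL_UNLINK":
        # Removing the last link destroyed the file, so it must be re-created;
        # otherwise only the name has to be linked back.
        verb = "create" if rec.last_link else "link"
        return f"{verb} {rec.lname}"
    if rec.op == "CL_RMDIR":
        return f"mkdir {rec.lname}"
    raise ValueError(f"no undo rule for {rec.op}")

def flashback_plan(records, target_index):
    """Undo every record newer than target_index, newest first."""
    newer = [r for r in records if r.index > target_index]
    newer.sort(key=lambda r: r.index, reverse=True)
    return [undo_action(r) for r in newer]

log = [
    Record(1, "CL_MKDIR", "d"),
    Record(2, "CL_CREATE", "d/f"),
    Record(3, "CL_UNLINK", "d/f", last_link=True),
    Record(4, "CL_RMDIR", "d"),
]
plan = flashback_plan(log, target_index=2)
print(plan)
```

      Rewinding to record 2 undoes the rmdir before the unlink, so the directory exists again by the time the file is re-created inside it.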

      Flashback can even preserve the original FIDs of the restored files after the flashback recovery.

      To fully support flashback for Lustre, the following problems need to be solved.

      First, the Lustre changelog needs to be improved with some extensions. For metadata update operations such as setattr()/setxattr(), it needs to store the old values (the old attributes or the old XATTR value) in the changelog, so that the MDT can restore the old value during the flashback operation.
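      A minimal sketch of why the old value is needed: the record must capture the state before the update, otherwise there is nothing to restore. The names SetattrRecord, setattr_logged() and undo_setattr() are hypothetical, attributes and XATTRs are modelled as one plain dict per FID, and an old value of None is used here to mean "did not exist before".

```python
from dataclasses import dataclass

@dataclass
class SetattrRecord:
    """Hypothetical extended record: keeps both the new and the old values."""
    fid: str
    new: dict
    old: dict

def setattr_logged(inodes, changelog, fid, changes):
    """Apply an attribute/XATTR update and log the values it overwrote."""
    attrs = inodes[fid]
    old = {key: attrs.get(key) for key in changes}  # None: key was absent
    attrs.update(changes)
    changelog.append(SetattrRecord(fid, dict(changes), old))

def undo_setattr(inodes, rec):
    """Restore the values saved in the record."""
    attrs = inodes[rec.fid]
    for key, value in rec.old.items():
        if value is None:
            attrs.pop(key, None)  # the attribute did not exist before
        else:
            attrs[key] = value

inodes = {"fid-A": {"mode": 0o644}}
changelog = []
setattr_logged(inodes, changelog, "fid-A", {"mode": 0o600})
setattr_logged(inodes, changelog, "fid-A", {"user.tag": "x"})
after_updates = dict(inodes["fid-A"])

# Flashback: undo the records newest first.
for rec in reversed(changelog):
    undo_setattr(inodes, rec)
```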

      Second, it needs to support flashback for cross-MDT metadata operations and handle the recovery order carefully.

      Each MDT has a monotonically increasing sequence number, the transno, and each metadata update operation gets a unique transno. It can be used as a System Change Number (SCN) to establish the partial order between transactions. When an MDT receives a request or reply that carries a piggybacked SCN from another MDT, it updates its local SCN to the larger of the two values.
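      The SCN update rule described here behaves like a logical clock. The sketch below assumes a hypothetical MdtClock class; it only models the counter, not real transno allocation or RPC handling.

```python
class MdtClock:
    """Hypothetical per-MDT SCN counter (a logical clock)."""

    def __init__(self, name):
        self.name = name
        self.scn = 0

    def local_update(self):
        """A metadata update takes the next, unique SCN on this MDT."""
        self.scn += 1
        return self.scn

    def receive(self, remote_scn):
        """On a request/reply from another MDT, keep the larger SCN."""
        self.scn = max(self.scn, remote_scn)

mdt0 = MdtClock("MDT0000")
mdt1 = MdtClock("MDT0001")

for _ in range(3):
    mdt0.local_update()          # mdt0.scn is now 3
mdt1.local_update()              # mdt1.scn is now 1

# mdt0 sends a request to mdt1 with its SCN piggybacked.
mdt1.receive(mdt0.scn)
dependent_scn = mdt1.local_update()
```

      Because mdt1 adopts the larger value before assigning its next SCN, the operation that depends on mdt0's request is ordered after everything mdt0 had done when it sent that request.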

      A user can create a restore point by synchronizing the SCNs on all MDTs and writing the synced SCN into the changelog.
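      One way to picture restore point creation, under the assumption that "synchronizing" means raising every MDT to the largest SCN currently in use. The function name, the RESTORE_POINT record tag, and the dict-based state are all hypothetical.

```python
def create_restore_point(scns, changelogs, label):
    """Sync all MDT SCNs to the maximum and log it on every MDT.

    scns:       dict of MDT name -> current SCN
    changelogs: dict of MDT name -> list of changelog entries
    """
    synced = max(scns.values())
    for mdt in scns:
        scns[mdt] = synced
        # Hypothetical marker record carrying the synced SCN.
        changelogs[mdt].append(("RESTORE_POINT", label, synced))
    return synced

scns = {"MDT0000": 17, "MDT0001": 42, "MDT0002": 8}
changelogs = {name: [] for name in scns}
synced_scn = create_restore_point(scns, changelogs, "before-cleanup")
```

      After this call every MDT holds the same SCN and the same marker, so a later flashback to "before-cleanup" has one unambiguous cut point across all changelogs.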

      The changelog is extended to store the SCN in each changelog record. For a cross-MDT metadata update, the SCN can be used to resolve the partial order between two dependent metadata operations, which ensures the correct recovery order between MDTs.
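      As a sketch of how SCNs in the records could drive the cross-MDT recovery order: merge the per-MDT changelogs and undo records in descending SCN order. The undo_order() function and the (scn, operation) tuples are hypothetical, and breaking ties by MDT name is an arbitrary choice, since records with equal SCNs on different MDTs are not ordered with respect to each other.

```python
def undo_order(changelogs, target_scn):
    """Return (scn, mdt, op) for records newer than target_scn, newest first.

    changelogs: dict of MDT name -> list of (scn, operation) tuples
    """
    merged = [
        (scn, mdt, op)
        for mdt, log in changelogs.items()
        for scn, op in log
        if scn > target_scn
    ]
    # Descending SCN; the MDT name is only an arbitrary tie-breaker.
    merged.sort(key=lambda rec: (rec[0], rec[1]), reverse=True)
    return merged

changelogs = {
    "MDT0000": [(3, "CL_CREATE a"), (5, "CL_MKDIR d")],
    # The create on MDT0001 depends on the mkdir on MDT0000,
    # so it carries a larger SCN.
    "MDT0001": [(6, "CL_CREATE d/f")],
}
order = undo_order(changelogs, target_scn=4)
```

      The create of d/f is undone before the mkdir of d, even though the two records live in different changelogs.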

            People

              qian_wc Qian Yingjin