Lustre / LU-20047

backup last_rcvd file header for redundancy

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Medium

    Description

      The struct lr_server_data header of the last_rcvd file contains a number of configuration parameters for the target (UUID, feature flags, Object Index count, max_clients, etc.), along with transient data related to the client recovery state.

      While many of the values in lr_server_data can be recovered from other places (e.g. the filesystem label, mountdata, or the OI count from the actual OI files), some of them cannot. It would be convenient to have a read-mostly backup copy of struct lr_server_data that could be used if the last_rcvd file is lost for some reason (e.g. filesystem corruption, or removal due to recovery problems).

      This backup last_rcvd file would only rarely need to be modified, such as when a new parameter is written or feature flags are changed. It would not need to be updated when clients connect/disconnect or transactions are committed, since the rest of the last_rcvd file that actually stores the client state would also have been lost in this case.
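
      A minimal sketch of that update rule, in plain userspace C. The struct srv_header below is a hypothetical, simplified stand-in for struct lr_server_data (the field names, sizes, and the backup_needs_update() helper are assumptions for illustration, not the on-disk layout or existing Lustre code). It only shows the idea that configuration changes trigger a backup rewrite, while transient updates do not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct lr_server_data.
 * Field names and sizes are illustrative, not the real on-disk layout. */
struct srv_header {
	char     uuid[40];          /* target UUID (configuration) */
	uint32_t feature_compat;    /* feature flags (configuration) */
	uint32_t feature_incompat;  /* feature flags (configuration) */
	uint32_t oi_count;          /* Object Index count (configuration) */
	uint32_t max_clients;       /* client limit (configuration) */
	uint64_t last_transno;      /* transient: changes as transactions commit */
};

/* Return true only when a configuration field differs between the two
 * headers, i.e. when the backup copy would have to be rewritten.
 * Transient fields such as last_transno are deliberately ignored. */
static bool backup_needs_update(const struct srv_header *old_hdr,
				const struct srv_header *new_hdr)
{
	assert(old_hdr != NULL && new_hdr != NULL);

	return memcmp(old_hdr->uuid, new_hdr->uuid,
		      sizeof(old_hdr->uuid)) != 0 ||
	       old_hdr->feature_compat != new_hdr->feature_compat ||
	       old_hdr->feature_incompat != new_hdr->feature_incompat ||
	       old_hdr->oi_count != new_hdr->oi_count ||
	       old_hdr->max_clients != new_hdr->max_clients;
}
```

      With a check like this, a transaction commit that only advances last_transno would leave the backup untouched, while setting a new feature flag would cause it to be rewritten.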

      Storing the backup in a separate directory, like CONFIGS/last_rcvd.bak, would generally allocate its inode from a different inode table, store it in a different parent directory, and allocate its blocks from a different block group, so it is very likely to survive even substantial filesystem corruption. If the last_rcvd file is unavailable at mount, it would be possible to read from CONFIGS/last_rcvd.bak to recreate the last_rcvd file instead of creating it from scratch. In my home filesystem the MDT0000 last_rcvd is inode 12 and the files in CONFIGS/ have inode numbers in the 500k range. On the more-recently-formatted MDT0001 the inode numbers are 83 and 3M respectively.
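
      The mount-time fallback order described above could be sketched as follows. This is a userspace illustration using stdio, not OSD or kernel code. HDR_SIZE, read_header(), load_header(), and the enum are all hypothetical names chosen for the example, and a real implementation would also need to validate the header contents before trusting them:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HDR_SIZE 512	/* hypothetical header size for this sketch */

enum hdr_source { HDR_FROM_PRIMARY, HDR_FROM_BACKUP, HDR_FROM_SCRATCH };

/* Read exactly HDR_SIZE bytes from path into buf.
 * Returns 0 on success, -1 if the file is missing or too short. */
static int read_header(const char *path, unsigned char *buf)
{
	FILE *f = fopen(path, "rb");
	size_t n;

	if (f == NULL)
		return -1;
	n = fread(buf, 1, HDR_SIZE, f);
	fclose(f);
	return n == HDR_SIZE ? 0 : -1;
}

/* Try the primary file first, then the backup copy, and only fall back
 * to a zeroed (freshly created) header if neither can be read. */
static enum hdr_source load_header(const char *primary, const char *backup,
				   unsigned char *buf)
{
	assert(primary != NULL && backup != NULL && buf != NULL);

	if (read_header(primary, buf) == 0)
		return HDR_FROM_PRIMARY;
	if (read_header(backup, buf) == 0)
		return HDR_FROM_BACKUP;

	memset(buf, 0, HDR_SIZE);
	return HDR_FROM_SCRATCH;
}
```

      A caller would pass the last_rcvd and CONFIGS/last_rcvd.bak paths and use the returned value to decide whether the primary file has to be recreated from the backup contents.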

      On osd-zfs there is already ZFS-level mirroring of this file (LU-6218), but that would not help if the last_rcvd file is manually removed during recovery, so it is likely still worthwhile to keep a separate copy of this file.

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Andreas Dilger (adilger)