osd-zfs: increase redundancy for OST meta data

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.8.0
    • None
    • 17393

    Description

      A site had two last_rcvd files corrupted, one on each of two OSTs. They were able to truncate the files, and the OSTs then mounted OK. I wonder whether we could increase data redundancy for metadata such as the last_rcvd file, to make it harder to corrupt in the first place (or, more accurately, to make it easier for scrub to repair should it ever get corrupted).

      The OIs already get two copies of their data blocks because they are ZAPs, but other metadata such as last_rcvd gets only one copy of its data. The copies property can only be applied at per-file-system (dataset) granularity. We could put those files under a separate dataset, e.g. lustre-ost1/ost1/META, and set copies=2 on it, but that would complicate the code because there would then be two datasets per OST.
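
      The separate-dataset alternative mentioned above can be sketched as follows. This is only an illustration of that idea, not what the merged patch does. The helper name meta_dataset_commands is hypothetical, the META child dataset name is taken from the example in the description, and the sketch only builds the zfs command lines rather than running them.

```python
import shlex
from typing import List


def meta_dataset_commands(ost_dataset: str, copies: int = 2) -> List[List[str]]:
    """Build zfs commands for a dedicated META child dataset (hypothetical helper).

    Nothing is executed here; the caller decides whether to run the commands.
    """
    # ZFS only accepts 1, 2, or 3 for the copies property.
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2, or 3")

    # Child dataset name follows the example in the issue description.
    meta = f"{ost_dataset}/META"

    return [
        # Create the child dataset with the extra-copies property set at creation,
        # so every block written to it gets the additional copy.
        ["zfs", "create", "-o", f"copies={copies}", meta],
        # Read the property back to confirm it took effect.
        ["zfs", "get", "-H", "-o", "value", "copies", meta],
    ]


if __name__ == "__main__":
    for argv in meta_dataset_commands("lustre-ost1/ost1"):
        print(shlex.join(argv))
```

      Setting the property when the dataset is created matters because copies only affects blocks written after it is set. The drawback noted above still applies: the OSD would have to manage two datasets per OST.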

        Activity

          [LU-6218] osd-zfs: increase redundancy for OST meta data

          isaac Isaac Huang (Inactive) added a comment - The patch added one additional copy for data blocks of a small number of small files, e.g. last_rcvd. The added overhead is trivial compared to the OIs which already get an additional copy.
          pjones Peter Jones added a comment -

          Isaac

          Are you able to answer Chris's question about performance?

          Peter

          morrone Christopher Morrone (Inactive) added a comment - Are there any performance implications from this change? Performance is already a problem on MDTs. This redundancy applies there as well, yes? Is the impact reasonable enough to make this the default there?

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13741/
          Subject: LU-6218 osd-zfs: increase redundancy for meta data
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: d9e86108724c06e3e6d25081caaf5803abf4416c

          isaac Isaac Huang (Inactive) added a comment - adilger, do you happen to know what size fs_log_size() in test-framework.sh returns? I'm wondering whether I should double the size returned for osd-zfs, but I couldn't figure out what size fs_log_size() was actually returning.

          adilger Andreas Dilger added a comment - Great.

          People

            isaac Isaac Huang (Inactive)
            isaac Isaac Huang (Inactive)