
handle error due to file with "no stripe info" rewritten before lfsck is run

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • Lustre 2.7.0
    • None
    • 3

    Description

      This is a followup on the filesystem recovery efforts from LU-8071, in particular the comment:

      If you think that the layout LFSCK made wrong decision when re-generated the
      "nagtest.toobig.stripes" LOV EA, we need to make new patch to recover it. 
      

      More than just making a wrong decision, lfsck can actually corrupt files when it is run. The problematic case is when the MDT loses the stripe information for a file, the file is then rewritten (or appended to?), and lfsck is run afterwards.

      In general, it would be good if lfsck can handle "conflicts" more gracefully. I understand that it may not know which object is the right one, but it should not pick them arbitrarily since that can result in a mixed-data file. Additionally, at the time when lfsck is run, it has information about what file an object is associated with, and that could be exposed to the user in the name of the file placed in lost+found.
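The conflict handling requested above can be illustrated with a minimal Python sketch. It is not code from LFSCK or from the patch on this ticket: the `OstObject` record, the `resolve_stripes` and `lost_found_name` helpers, and the lost+found naming scheme are all hypothetical and exist only to show the idea. A stripe slot is accepted only when exactly one object claims it; contested slots are reported rather than resolved arbitrarily, and each orphan can be named after the file it claims to belong to.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OstObject:
    """Hypothetical record for one OST object seen during a scan."""
    object_id: str     # illustrative object identifier, not a real Lustre format
    parent_fid: str    # FID of the MDT file the object claims to belong to
    stripe_index: int  # stripe slot the object claims within that file


def resolve_stripes(objects):
    """Accept a stripe slot only when a single object claims it.

    Returns (accepted, conflicts). Slots claimed by several objects are
    left for the administrator instead of being picked arbitrarily.
    """
    by_slot = defaultdict(list)
    for obj in objects:
        by_slot[(obj.parent_fid, obj.stripe_index)].append(obj)

    accepted, conflicts = {}, {}
    for slot, candidates in by_slot.items():
        if len(candidates) == 1:
            accepted[slot] = candidates[0]
        else:
            conflicts[slot] = sorted(candidates, key=lambda o: o.object_id)
    return accepted, conflicts


def lost_found_name(obj):
    """Hypothetical lost+found name that exposes the claimed parent file."""
    return f"{obj.parent_fid}-stripe{obj.stripe_index}-{obj.object_id}"
```

With a scheme like this, a user looking in lost+found could tell which file and stripe an orphan object was associated with, even when the tool cannot decide which candidate holds the right data.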

          Activity

            [LU-8288] handle error due to file with "no stripe info" rewritten before lfsck is run
            mdiep Minh Diep added a comment -

            Landed for 2.10

            jaylan Jay Lan (Inactive) added a comment - - edited

            Could you port this patch to b2_7_fe and land to b2_9_fe? Thanks!

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/21562/
            Subject: LU-8288 lfsck: handle dangling LOV EA reference
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 17cc912fd5b40965d14a89a268cbf2d63b2fe21b

            adilger Andreas Dilger added a comment -

            Per earlier discussion in this ticket, it would be worthwhile to backport the PFL patches to increase the MDT and OST inode size, as well as the patch to improve the fid xattr to store the total stripe count and stripe size on each OST object. That would allow LFSCK to reconstruct the layout properly, even in the case where some OST objects are totally missing. Having clients send this information with each write will ensure that this information is stored on each OST object for later use if needed.
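To illustrate why storing the stripe count and stripe size on every OST object helps, here is a minimal Python sketch of rebuilding a layout from the surviving objects. It is an assumption-laden illustration, not the LFSCK implementation: the `ObjectRecord` type, its field names, and `rebuild_layout` are hypothetical, and the real fid xattr format is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ObjectRecord:
    """Hypothetical per-object metadata, as if read back from each OST object."""
    object_id: str
    stripe_index: int  # slot of this object within the file layout
    stripe_count: int  # total number of stripes in the file
    stripe_size: int   # stripe size in bytes


def rebuild_layout(records: List[ObjectRecord]) -> Tuple[int, int, List[Optional[str]]]:
    """Rebuild (stripe_count, stripe_size, slots) from surviving objects.

    Slots whose object is missing stay None, so gaps remain visible
    instead of shifting later objects into the wrong position.
    """
    if not records:
        raise ValueError("no surviving objects to rebuild from")

    counts = {r.stripe_count for r in records}
    sizes = {r.stripe_size for r in records}
    if len(counts) != 1 or len(sizes) != 1:
        raise ValueError("objects disagree on stripe count or stripe size")

    stripe_count, stripe_size = counts.pop(), sizes.pop()
    slots: List[Optional[str]] = [None] * stripe_count
    for rec in records:
        if not 0 <= rec.stripe_index < stripe_count:
            raise ValueError(f"stripe index {rec.stripe_index} out of range")
        if slots[rec.stripe_index] is not None:
            raise ValueError(f"two objects claim stripe {rec.stripe_index}")
        slots[rec.stripe_index] = rec.object_id
    return stripe_count, stripe_size, slots
```

Because every object carries the total stripe count, a single surviving object is enough to know how many slots the file had, which is what makes recovery possible when some objects are totally missing.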
            pjones Peter Jones added a comment -

            That matches my understanding Nathan

            ndauchy Nathan Dauchy (Inactive) added a comment -

            It looks like there were many iterations on this patch, but it is ready for final review and then landing. Please confirm.

            Also, once the patch is finalized, we will need a backport to the 2.7 FE branch as well as master. Thanks!

            gerrit Gerrit Updater added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/21562
            Subject: LU-8288 lfsck: handle dangling LOV EA reference
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 61dc2ac65258fceb30bf0549e76b8ff7eace2d29

            People

              yong.fan nasf (Inactive)
              ndauchy Nathan Dauchy (Inactive)