[LU-15645] gap in recovery llog should not be a fatal error

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.15.0, Lustre 2.12.10
    • Affects Version/s: Lustre 2.14.0

    Description

      A gap in the MDT recovery llog (of unknown origin) was hit during recovery.

      log_process_thread()) lfs02-MDT001e-osp-MDT0000: [0x3:0x1b70:0x4] Invalid record: index 16123 but expected 16122
      

      and this was later confirmed with llog_reader:

      rec #15221 type=106a0000 len=1160 offset 17231040
      rec #16097 type=106a0000 len=1160 offset 18220168
      rec #16098 type=106a0000 len=1160 offset 18221328
      rec #16099 type=106a0000 len=1160 offset 18222488
      rec #16100 type=106a0000 len=1160 offset 18223648
      Previous index is 16121, current 16123, offset 18249168
      rec #18718 type=106a0000 len=1160 offset 21180888
      rec #20278 type=106a0000 len=1160 offset 22943400
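
For triage, the gap line in the llog_reader output above can be picked out mechanically. The following is a minimal sketch that assumes the line format is exactly as shown in this ticket; `GAP_RE` and `find_gaps` are hypothetical names for illustration and are not part of llog_reader or Lustre.

```python
import re

# Matches the gap line as it appears in the llog_reader output quoted in this ticket.
GAP_RE = re.compile(r"Previous index is (\d+), current (\d+), offset (\d+)")


def find_gaps(lines):
    """Return one (previous, current, offset, missing) tuple per reported gap.

    `missing` is the number of record indices skipped between the two.
    """
    gaps = []
    for line in lines:
        match = GAP_RE.search(line)
        if match:
            prev, cur, offset = (int(group) for group in match.groups())
            gaps.append((prev, cur, offset, cur - prev - 1))
    return gaps


if __name__ == "__main__":
    sample = [
        "rec #16100 type=106a0000 len=1160 offset 18223648",
        "Previous index is 16121, current 16123, offset 18249168",
        "rec #18718 type=106a0000 len=1160 offset 21180888",
    ]
    for prev, cur, offset, missing in find_gaps(sample):
        print(f"gap after index {prev}: {missing} record(s) missing before {cur} at offset {offset}")
```

For the output quoted here, this would report a single missing record (index 16122), which agrees with the "index 16123 but expected 16122" message from recovery.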
      

      This caused the MDT recovery to fail and all of the clients were evicted from that MDT. It isn't clear whether the global eviction is necessary or whether this should be handled more gracefully. Other MDTs likely have a copy of that operation for replay, and if not then it would be lost.

      What is more problematic is that this recovery llog error is persistent, and the same problem happens on every recovery for that MDT. If the clients (and MDTs?) are evicted from recovery, the llog records should at a minimum be cancelled, or the llog file should be cleared. Better yet would be to not treat this gap as a fatal error, since I don't think there is anything that can be done about it at this point.
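
The difference between treating the gap as fatal and tolerating it can be illustrated with a small model. This is only a sketch under stated assumptions: records are modeled as `(index, payload)` pairs, and `process_records`, `handler`, and `strict` are hypothetical names that do not correspond to the Lustre llog API.

```python
def process_records(records, handler, strict=False):
    """Process (index, payload) records in order, checking index continuity.

    With strict=True an index mismatch raises, which models the fatal
    behavior described above. With strict=False the mismatch is recorded
    and processing continues with the record that was found.
    Returns a list of (expected, found) pairs, one per gap.
    """
    gaps = []
    expected = None
    for index, payload in records:
        if expected is not None and index != expected:
            if strict:
                raise ValueError(
                    f"Invalid record: index {index} but expected {expected}"
                )
            gaps.append((expected, index))
        handler(index, payload)
        expected = index + 1
    return gaps


if __name__ == "__main__":
    seen = []
    records = [(16121, "a"), (16123, "b"), (16124, "c")]
    gaps = process_records(records, lambda i, p: seen.append(i))
    print("processed:", seen)
    print("gaps:", gaps)
```

In the tolerant mode, the records on both sides of the gap are still handed to the handler, and the caller can decide afterwards what to do about the reported gaps.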

          Activity

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47011/
            Subject: LU-15645 obdclass: llog to handle gaps
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 4a4e38a2769089ddf2430983c2d607683cd12986

            pjones Peter Jones added a comment -

            Landed for 2.15


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46837/
            Subject: LU-15645 obdclass: llog to handle gaps
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 903f2f663956fef380b9f383e73a05b7beb0baa5


            "Mike Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47011
            Subject: LU-15645 obdclass: llog to handle gaps
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 9e8fd74dc5d3e60163884cf51ad27dc6dba7c72f


            adilger Andreas Dilger added a comment -

            Etienne, I don't think anything is done to rewrite the llog with the gap; the gap is just skipped without causing the recovery to fail.

            eaujames Etienne Aujames added a comment -

            Hello,
            What is the behavior when the corrupted llog block is rewritten?
            Is there a risk that a write to the missing record overwrites an existing one?

            "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46837
            Subject: LU-15645 obdclass: llog to handle gaps
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7fa151797bbba83d231828963fa66a288ede1de0


            adilger Andreas Dilger added a comment -

            I was wondering about the potential sources of a gap in the recovery llog. As you wrote, if there was an actual gap in the updates applied to the MDT objects, then that should be caught by VBR.

            I think this is a gap in the numbering of the OUT records in the llog, which seems different. That might be caused by the llog header being written non-atomically with the llog body, which I recall was a bug that was fixed by Mike a while ago. However, it isn't clear whether this gap in the llog numbering is a "real" problem or not. If there are clients waiting on the recovery of this transaction, wouldn't they have it pending replay in their own recovery logs also?

            In either case, if the clients are evicted, then the recovery log definitely needs to be cleaned up so that this gap does not cause future problems.

            bzzz Alex Zhuravlev added a comment -

            I think that VBR checks should ensure that there is no real gap in the transactions (otherwise a recovery abort is unavoidable). So there are two major scenarios here:
            1) There is a gap in one or a few llogs, but the corresponding transaction is duplicated in another llog. In this case, skipping such a gap should be transparent.
            2) The gap is "global" (i.e. the corresponding transaction is missing in all the llogs). Then we have to abort recovery and cancel all subsequent llog records so they don't cause any problem on the next mount.
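
The two scenarios above can be modeled as a simple decision over the set of update llogs. This is a sketch only: it assumes each llog can be summarized as a set of transaction numbers, and `classify_missing` and `records_to_cancel` are hypothetical helpers, not functions from the Lustre source or from the landed patch.

```python
def classify_missing(transno, llogs):
    """Decide how to handle a transaction missing from one llog.

    `llogs` maps an llog name to the set of transaction numbers it holds.
    Returns ("skip", holders) when at least one llog still has the
    transaction (scenario 1), or ("abort", []) when no llog has it
    (scenario 2, the "global" gap).
    """
    holders = sorted(name for name, txns in llogs.items() if transno in txns)
    if holders:
        return ("skip", holders)
    return ("abort", [])


def records_to_cancel(transno, llogs):
    """For a global gap, list the transactions after the missing one, per llog."""
    return {
        name: sorted(t for t in txns if t > transno)
        for name, txns in llogs.items()
    }


if __name__ == "__main__":
    llogs = {
        "llog-A": {100, 101, 103},       # 102 is missing here
        "llog-B": {100, 101, 102, 103},  # but still present here
    }
    print(classify_missing(102, llogs))   # scenario 1: another llog has it
    print(classify_missing(104, llogs))   # scenario 2: no llog has it
    print(records_to_cancel(101, llogs))
```

The model leaves out the VBR checks mentioned in the comment; it only shows the "is the transaction available anywhere else" test that separates a transparent skip from a recovery abort.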

            People

              bzzz Alex Zhuravlev
              adilger Andreas Dilger