
e2fsck looping "Inode NNN block BBB conflicts with critical metadata"

Details

    • Improvement
    • Resolution: Fixed
    • Minor

    Description

      In some cases it appears that e2fsck can become stuck in pass1 block checking with messages similar to:

      Inode 8304551 block 32881 conflicts with critical metadata, skipping block checks.
      Inode 8304551 block 32881 conflicts with critical metadata, skipping block checks.
      Inode 8304551 block 32881 conflicts with critical metadata, skipping block checks.
      Inode 8304551 block 32881 conflicts with critical metadata, skipping block checks.
      

      These repeat hundreds or thousands of times for a single inode, but eventually finish and the inode is marked as having too many errors and is cleared. It doesn't seem to be skipping the block checks at all, unless it is one or more blocks full of the same (bad) 32-bit block numbers and it is checking and ignoring all of them.

      It would be better if the handling of these errors short-circuited the thousands of lines of output and just cleared the inode (or at least the parent indirect block) immediately, since it will happen in the end anyway. This should probably be part of the inode badness functionality.
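
      The short-circuit suggested above can be illustrated with a toy model. This is a minimal sketch and not e2fsprogs code: `BADNESS_THRESHOLD`, `scan_indirect_block` and the set of critical blocks are hypothetical names and values chosen for illustration. It assumes a 4 KiB indirect block holding 1024 32-bit block numbers that all point at the same metadata block, which is one way a single inode could produce a long run of identical messages.

```python
# Toy model only; names and the threshold value are assumptions, not e2fsck internals.
BADNESS_THRESHOLD = 12          # hypothetical cutoff
ENTRIES_PER_BLOCK = 4096 // 4   # 32-bit block numbers in a 4 KiB indirect block


def scan_indirect_block(entries, critical, threshold=None):
    """Return (messages_emitted, inode_cleared) for one indirect block.

    Each entry that points at critical metadata counts as one conflict
    message. With a threshold, scanning stops as soon as the number of
    conflicts exceeds it.
    """
    messages = 0
    for blk in entries:
        if blk not in critical:
            continue
        messages += 1
        if threshold is not None and messages > threshold:
            return messages, True   # give up early and clear the inode
    # Without a threshold every entry is visited before a decision is made;
    # this model simply treats any conflict as "cleared in the end".
    return messages, messages > 0


critical_blocks = {32881}
corrupt = [32881] * ENTRIES_PER_BLOCK

print(scan_indirect_block(corrupt, critical_blocks))                     # (1024, True)
print(scan_indirect_block(corrupt, critical_blocks, BADNESS_THRESHOLD))  # (13, True)
```

      In this model the outcome for the inode is the same either way; only the number of messages and the amount of work before the decision differ.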

          Activity

            [LU-15985] e2fsck looping "Inode NNN block BBB conflicts with critical metadata"

            adilger Andreas Dilger added a comment - Patch will be included in 1.47.7-wc2

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/55802/
            Subject: LU-15985 e2fsck: skip block scanning if the inode is too bad
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set:
            Commit: cd423df181fba2de46dc632fcebda754a1a4f683

            gerrit Gerrit Updater added a comment -

            "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/55802
            Subject: LU-15985 e2fsck: skip block scanning if the inode is too bad.
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set: 1
            Commit: 09dbf0091cbf0c1950aa895c33254d7e86b4c8ec

            dongyang Dongyang Li added a comment -

            I've been investigating NCP-62, and it looks like the "Inode NNN block BBB conflicts with critical metadata" error can repeat hundreds of times for the same block of the same inode. I found examples of ~250, ~500 and ~700 repeats, but I could not find the exact reason for so many. My guess is that it is because we call scan_extent_node() recursively and the data is badly corrupted.

            Having said that, even if we only eliminated the repeated errors for the same block, an inode whose extent tree is this bad could still have many different blocks hitting the same issue. I think for now we can skip the extent scan for the inode as soon as its badness rises above the threshold.
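
            The idea of abandoning the extent scan once badness crosses a threshold can be sketched with a simplified recursive walk. This is an illustration only, not the merged patch: `ExtentNode`, `ScanState`, `scan_node` and the threshold of 12 are hypothetical stand-ins, and the real scan_extent_node() works on on-disk extent headers rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class ExtentNode:
    """Simplified extent tree node: referenced blocks plus child nodes."""
    blocks: list = field(default_factory=list)     # physical blocks referenced here
    children: list = field(default_factory=list)   # lower-level ExtentNode objects


@dataclass
class ScanState:
    badness: int = 0
    threshold: int = 12   # hypothetical value
    visited: int = 0      # nodes actually scanned


def scan_node(node, critical, state):
    """Scan one node; return False once the inode is judged too bad."""
    if state.badness > state.threshold:
        return False
    state.visited += 1
    for blk in node.blocks:
        if blk in critical:
            state.badness += 1
            if state.badness > state.threshold:
                return False
    for child in node.children:
        if not scan_node(child, critical, state):
            return False   # propagate the abort up the recursion
    return True


# A corrupt tree: 100 children, each referencing critical block 64 ten times.
root = ExtentNode(children=[ExtentNode(blocks=[64] * 10) for _ in range(100)])
state = ScanState()
ok = scan_node(root, {64}, state)
print(ok, state.visited, state.badness)   # False 3 13
```

            Only 3 of the 101 nodes are visited before the walk stops, whereas a scan with no threshold would report all 1000 conflicts.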


            adilger Andreas Dilger added a comment -

            Hit this again with e2fsck 1.47.1-wc1 spewing the same bad blocks for thousands of lines:

            [Thread 11] Inode 4057107 block 259 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1792 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 335544832 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 2147483648 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 57 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 57 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 3 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 3 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 402653184 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1694498816 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 69 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1791 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 512 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1536 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1536 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 3072 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 800 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 2415919104 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 69 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 805306368 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 1216348231 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 30 conflicts with critical metadata, skipping block checks.
            [Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
            :
            :

            There should be some feedback from these bad block conflicts to the inode badness counter, so that e2fsck pre-emptively stops processing this inode and clears it instead of looping for a long time (reportedly hours).
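
            To gauge how often each inode and block repeats in a saved e2fsck log, a small script along these lines may help. It is a sketch rather than an existing tool: the `summarize` helper is hypothetical, and the regular expression assumes the message wording shown in the excerpts in this ticket, with or without the "[Thread N]" prefix.

```python
import re
from collections import Counter

# Matches the message text quoted in this ticket; the thread prefix is optional.
PATTERN = re.compile(
    r"Inode (\d+) block (\d+) conflicts with critical metadata"
)


def summarize(lines):
    """Count conflict messages per inode and per (inode, block) pair."""
    per_inode = Counter()
    per_block = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            inode, block = int(m.group(1)), int(m.group(2))
            per_inode[inode] += 1
            per_block[(inode, block)] += 1
    return per_inode, per_block


sample = [
    "[Thread 11] Inode 4057107 block 259 conflicts with critical metadata, skipping block checks.",
    "[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.",
    "[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.",
    "Inode 8304551 block 32881 conflicts with critical metadata, skipping block checks.",
]
per_inode, per_block = summarize(sample)
print(per_inode.most_common())    # [(4057107, 3), (8304551, 1)]
print(per_block.most_common(1))   # [((4057107, 256), 2)]
```

            Feeding it the lines of a real log file instead of `sample` would show which inodes dominate the output and how many times each block number recurs.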

            adilger Andreas Dilger added a comment - I think patch https://review.whamcloud.com/48620 "LU-16171 e2fsck: improve pass1b bad inode handling" may also fix this issue, but I'm not sure since I don't have a test case yet.

            People

              dongyang Dongyang Li
              adilger Andreas Dilger