These repeat hundreds or thousands of times for a single inode, but eventually finish and the inode is marked as having too many errors and is cleared. It doesn't seem to be skipping the block checks at all, unless it is one or more blocks full of the same (bad) 32-bit block numbers and it is checking and ignoring all of them.
It would be better if the handling of these errors short-circuited the thousands of lines of output and just cleared the inode (or at least the parent indirect block) immediately, since that will happen in the end anyway. This should probably be part of the inode badness functionality.
Issue Links
duplicates
LU-16171 e2fsck should handle multiply-claimed blocks better
Gerrit Updater
added a comment - "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/55802/
Subject: LU-15985 e2fsck: skip block scanning if the inode is too bad
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set:
Commit: cd423df181fba2de46dc632fcebda754a1a4f683
Gerrit Updater
added a comment - "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/55802
Subject: LU-15985 e2fsck: skip block scanning if the inode is too bad.
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: 09dbf0091cbf0c1950aa895c33254d7e86b4c8ec
Dongyang Li
added a comment - I've been investigating NCP-62, and it looks like, for a single block of the same inode, the
"Inode NNN block BBB conflicts with critical metadata" error can repeat hundreds of times.
I found examples of ~250, ~500 and ~700 repeats, but I could not find the exact reason why there
are so many. My guess is that it is because we call scan_extent_node() recursively and the data is so badly corrupted.
Having said that, even if we only manage to stop the error repeating for the same block, an inode whose extent tree is this bad could still have many blocks suffering from the same issue. I think for now we can skip the extent scan for the inode as soon as we see that its badness is above the threshold.
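The repeat counts described above can be measured from a saved e2fsck log with a short script. This is only a sketch: it assumes the message format quoted in this ticket, and the helper names (count_conflicts, worst_offenders) are made up for illustration and are not part of e2fsprogs.

```python
import re
from collections import Counter

# Message format as quoted in this ticket. Any "[Thread N]" prefix is ignored
# because search() matches anywhere in the line.
CONFLICT_RE = re.compile(
    r"Inode (\d+) block (\d+) conflicts with critical metadata"
)


def count_conflicts(lines):
    """Return a Counter keyed by (inode, block) for each conflict message."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            inode, block = int(match.group(1)), int(match.group(2))
            counts[(inode, block)] += 1
    return counts


def worst_offenders(lines, top=5):
    """Return the `top` most frequently repeated (inode, block) pairs."""
    return count_conflicts(lines).most_common(top)
```

Passing an open log file to worst_offenders() lists the (inode, block) pairs that repeat most often, which makes it easy to see whether one block accounts for most of the output.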
Andreas Dilger
added a comment - Hit this again with e2fsck 1.47.1-wc1 spewing the same bad blocks for thousands of lines:
[Thread 11] Inode 4057107 block 259 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1792 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 335544832 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 2147483648 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 57 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 57 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 3 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 3 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 402653184 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 64 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1694498816 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 69 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1791 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 512 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1536 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1536 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 3072 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 800 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 2415919104 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 69 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 805306368 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 65 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 1216348231 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 30 conflicts with critical metadata, skipping block checks.
[Thread 11] Inode 4057107 block 256 conflicts with critical metadata, skipping block checks.
:
:
There should be some feedback from these bad block conflicts to the inode badness counter, so that e2fsck pre-emptively stops processing this inode and clears it instead of looping for a long time (reportedly hours).
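The feedback idea can be modelled in a few lines. This is a toy simulation, not e2fsck code: the threshold value, the ScanResult type and scan_inode_blocks() are all assumptions made for illustration, and the real check lives in e2fsck's pass 1 block processing.

```python
from dataclasses import dataclass

# Arbitrary illustrative value; not taken from e2fsck.
BADNESS_THRESHOLD = 8


@dataclass
class ScanResult:
    conflicts_reported: int  # conflict messages that would have been printed
    blocks_examined: int     # block references looked at before stopping
    clear_inode: bool        # True if the scan gave up and the inode is cleared


def scan_inode_blocks(block_refs, critical_blocks, threshold=BADNESS_THRESHOLD):
    """Walk an inode's block references, feeding conflicts into a badness count.

    As soon as the badness exceeds the threshold, stop scanning and report
    that the inode should be cleared, instead of examining every reference.
    """
    badness = 0
    examined = 0
    for blk in block_refs:
        examined += 1
        if blk in critical_blocks:
            badness += 1
            if badness > threshold:
                return ScanResult(badness, examined, True)
    return ScanResult(badness, examined, False)
```

With an indirect block that holds 1000 copies of the same bad block number, this model stops after threshold + 1 conflicts rather than reporting all 1000.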
Andreas Dilger
added a comment - I think patch: https://review.whamcloud.com/48620 "LU-16171 e2fsck: improve pass1b bad inode handling" may also fix this issue, but I'm not sure since I don't have a test case yet.
Patch will be included in 1.47.7-wc2