[LU-16171] e2fsck should handle multiply-claimed blocks better - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Fixed
Priority: Major
Fix Version/s: None
Affects Version/s: None
Labels: None

Description

Running e2fsck on a filesystem with a large number of multiply-claimed blocks can take many hours, or possibly days. In many such cases, the multiply-claimed blocks are caused by a corrupted inode or indirect block that makes one bad inode overlap with many good inodes. The problem is worse on a large filesystem (16TB or more) because random 32-bit numbers in the inode->i_block[] array are always "valid" block numbers (on smaller filesystems the random block numbers would be detected as an error). Garbage triple/double/single indirect blocks will point to random "valid" blocks that are themselves interpreted as containing further 32-bit block numbers, multiplying the number of duplicate blocks exponentially.
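The 16TB threshold follows from simple arithmetic, sketched below under the assumption of a 4 KiB block size (the ticket does not state one). The helpers `fs_blocks()` and `block_in_range()` are hypothetical names used only for illustration and are not part of e2fsprogs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of blocks in a filesystem of fs_bytes bytes. */
uint64_t fs_blocks(uint64_t fs_bytes, uint32_t block_size)
{
    return fs_bytes / block_size;
}

/*
 * Hypothetical helper: the kind of range check that can reject a garbage
 * 32-bit block number. It only helps when blocks_count is below 2^32.
 */
bool block_in_range(uint32_t blk, uint64_t first_data_block,
                    uint64_t blocks_count)
{
    return blk >= first_data_block && blk < blocks_count;
}
```

With 4 KiB blocks, a 16 TiB filesystem has exactly 2^32 blocks, so every possible 32-bit value passes the range check. On a 1 TiB filesystem (2^28 blocks) only about 1 in 16 random values would pass, so most garbage is caught early.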

Rather than cloning all of those blocks, or possibly deleting/zeroing all such inodes (as suggested in LU-13446), it would be better to find the "bad" inode(s) causing the most problems and clear only those. However, care should be taken to avoid spuriously clearing inodes that share blocks with only a small number of peers, since in that case it is difficult to know for sure which inode is the bad one.

An added difficulty in implementing this is that the full list of inodes sharing a given block is only available in pass1d, by which point e2fsck is already starting to clone the shared blocks. Some work might be possible in pass1b, by tracking which inodes have the most shared blocks, but it isn't yet clear whether just counting the shared blocks is sufficient (divided by a factor like 4096 to avoid penalizing inodes that just have a bad indirect/index block), or whether it is better to count only the shared inodes.
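As a rough illustration of the two candidate metrics, the sketch below scores each inode by its shared-block count divided by 4096 and also requires a minimum number of sharing peers. Combining the two is an assumption made purely for the example, since the ticket leaves open which metric is sufficient. `struct dup_stat`, `pick_suspect()` and the `MIN_PEERS` threshold are hypothetical and do not exist in e2fsck.

```c
#include <stddef.h>
#include <stdint.h>

#define SHARE_DIVISOR 4096  /* the "factor like 4096" from the description */
#define MIN_PEERS     24    /* hypothetical cutoff for "dozens of inodes" */

/* Hypothetical per-inode statistics gathered while scanning for duplicates. */
struct dup_stat {
    uint32_t ino;            /* inode number */
    uint64_t shared_blocks;  /* blocks this inode shares with others */
    uint32_t shared_peers;   /* distinct inodes it shares blocks with */
};

/*
 * Return the index of the most suspicious inode, or -1 if no inode is
 * clearly bad. Inodes with few peers are never selected, because it is
 * hard to tell which side of a small overlap is the corrupt one.
 */
int pick_suspect(const struct dup_stat *st, size_t n)
{
    int best = -1;
    uint64_t best_score = 0;

    for (size_t i = 0; i < n; i++) {
        /* Integer division: under 4096 shared blocks scores zero. */
        uint64_t score = st[i].shared_blocks / SHARE_DIVISOR;

        if (score == 0 || st[i].shared_peers < MIN_PEERS)
            continue;
        if (best < 0 || score > best_score) {
            best = (int)i;
            best_score = score;
        }
    }
    return best;
}
```

An inode that shares only 1024 blocks (for example through a single bad indirect block) scores zero and is left alone, while an inode sharing millions of blocks with dozens of peers is selected.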

Further complicating the solution is that the "dict" code in e2fsck only adds duplicate inodes with shared clusters to the list and never removes anything from the dict (the removal code is even #ifdef'd out in the library), so additional development will be needed to get dict removal working correctly. Once that works, the goal is that if a particularly bad inode is found (sharing blocks with dozens of other inodes), it is removed from the inode and cluster dictionaries, and hopefully the processing of all later inodes then becomes trivial since they no longer share any blocks.
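The intended effect of removing a bad inode can be shown with a toy model. The sketch below deliberately does not use the real e2fsck dict API. It stands in a flat array for the cluster dictionary, and `struct cluster_entry`, `MAX_OWNERS` and `drop_inode()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_OWNERS 8  /* toy limit; the real structures are dynamic */

/* Toy stand-in for one cluster dictionary entry. */
struct cluster_entry {
    uint64_t cluster;             /* shared cluster number */
    uint32_t owners[MAX_OWNERS];  /* inodes claiming this cluster */
    size_t   num_owners;
};

/*
 * Remove bad_ino from every entry's owner list and return how many
 * clusters are still claimed by more than one inode afterwards.
 */
size_t drop_inode(struct cluster_entry *tbl, size_t n, uint32_t bad_ino)
{
    size_t still_shared = 0;

    for (size_t i = 0; i < n; i++) {
        size_t kept = 0;

        /* Compact the owner list in place, skipping the bad inode. */
        for (size_t j = 0; j < tbl[i].num_owners; j++)
            if (tbl[i].owners[j] != bad_ino)
                tbl[i].owners[kept++] = tbl[i].owners[j];
        tbl[i].num_owners = kept;

        if (kept > 1)
            still_shared++;
    }
    return still_shared;
}
```

If inode 99 overlaps inodes 12, 13, 14 and 15 across three clusters, dropping it leaves only the cluster that 14 and 15 genuinely share, so the remaining pass1d work shrinks to a single conflict.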

Failing the "delete bad inode from dict" approach, it would be possible to clear the bad inode and restart e2fsck, but this might need a few restarts (full pass1 repeat) if there are multiple bad inodes (which is likely). However, that may still be preferable and faster (a couple of hours) than running pass1d for a very long time.

Issue Links

is duplicated by

LU-15985 e2fsck looping "Inode NNN block BBB conflicts with critical metadata"

Resolved

is related to

LU-11577 fsck Multiply-claimed block(s)

Resolved

LU-4102 lots of multiply-claimed blocks in e2fsck

Resolved

LU-13446 Security hole in default e2fsck behavior for duplicate blocks

Resolved

LU-12913 fsck found > 1M multiply-claimed blocks

Resolved

People

Assignee: Andreas Dilger

Reporter: Andreas Dilger

Dates

Created: 19/Sep/22 7:44 PM

Updated: 28/Nov/22 6:39 AM

Resolved: 13/Oct/22 2:21 PM