[LU-5949] Verify e2fsck inode badness is working for duplicate blocks Created: 23/Nov/14 Updated: 09/Feb/21 Resolved: 08/Feb/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Andreas Dilger | Assignee: | Andreas Dilger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | e2fsck | ||
| Issue Links: |
|
||||||||||||||||
| Severity: | 3 | ||||||||||||||||
| Rank (Obsolete): | 16606 | ||||||||||||||||
| Description |
|
A number of cases of corrupt inodes have lead to large numbers of duplicate blocks during e2fsck runs. Corrupt inodes should be detected as such during e2fsck scanning and be cleared instead of invoking duplicate block resolution in pass1b. This may be complicated by running "e2fsck -fn" that does not clear these corrupt inodes during pass1. Need to verify that inode badness is working properly, and possibly figure a way that "e2fsck -fn" |
| Comments |
| Comment by Andreas Dilger [ 05/Dec/19 ] |
|
It is clear from several reports that the e2fsck inode badness is not working correctly for duplicate blocks. It is intended functionality is that if some inode is reporting duplicate blocks, but also had other errors such as bad flags, size, blocks, mode, etc. then it is likely that this inode is actually corrupt/garbage and should be cleared instead of spending a long time to clone the file blocks just to have a garbage inode with some other file's data. Avoiding the e2fsck pass1b/1c/1d process can save a very significant amount of time when running on a large/corrupt filesystem, and would be preferable to trying to repair such inodes. It has also been suggested that instead of the "inode badness" feature (which is not yet landed upstream because it touches a lot of different code) that instead e2fsck check all of the inodes in an itable block together, and if there is corruption across multiple inodes in that block block that this be used as the "badness" for the inode, and consider those inodes as targets for deletion if they are also found to be sharing blocks. I think it is potentially dangerous to always clear a block of inodes that are all showing corruption, in case this corruption is systematic (e.g. clearing some feature flag, updating the checksum after filesystem UUID was changed, etc). |
| Comment by Dongyang Li [ 11/Dec/19 ] |
|
I believe inode_badness patch just checks the inode fields without looking at the dup blocks at the moment, and offers to delete the inode in pass2 and pass4 if the badness is over the threshold. dup blocks are handled independently by e2fsck: allow deleting or zeroing shared blocks. I think we could check if the badness is over before pass1b, and skip/delete the inode before scanning for dup blocks. |
| Comment by Andreas Dilger [ 11/Dec/19 ] |
|
One difficulty with checking the badness before doing the duplicate scanning is that the outcome depends on the order the inodes were processed in. The e2fsck block bitmap only stores if a block is used for pass1, assuming that no duplicate blocks exist in the filesystem. If e2fsck finds any duplicate blocks it has to re-scan the whole filesystem to find the inodes that are sharing blocks. It is possible that we could avoid pass1 in some cases by clearing the bad inode rather than adding the blocks into the bitmap, assuming that the good inode is processed first and adds its blocks to the bitmap, and then the bad inode is found later and verified and does not add which would avoid pass1b sometimes. The other proposal that Ted Ts'o had was to check all inodes in a block at the same time. That makes it easier to detect if a whole block is corrupted. |
| Comment by Gerrit Updater [ 08/Feb/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/41328/ |
| Comment by Gerrit Updater [ 09/Feb/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41450 |
| Comment by Gerrit Updater [ 09/Feb/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/41450/ |