Details
Improvement
Resolution: Done
Minor
Description
Currently, we may hit corrupted inodes/bitmaps in two ways:
1. Sanity checks fail, for example when system-reserved bitmaps are freed; this may be caused by unknown kernel bugs.
2. Hardware errors; we did hit such errors in our corruption tests.
Either way, the filesystem becomes read-only by default and is then unusable. See the corresponding bug report in LU-1026.
Here are the suggestions from Andreas Dilger:
I seem to recall something similar in the upstream kernel. It looks like patches with a similar goal were already pushed for the 3.12 kernel, so you might consider backporting those instead:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=48d9eb97dc74d2446bcc3630c8e51d2afc9b951d
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=dbde0abed8c6e9e938c2194675ce63f5769b0d37
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=163a203ddb36c36d4a1c942aececda0cc8d06aa7
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=87a39389be3e3b007d341be510a7e4a0542bdf05
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=bdfb6ff4a255dcebeb09a901250e13a97eff75af
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=2746f7a17062d3526116f7ae7f91d88b19c2464e
These patches don't prevent the filesystem from being marked read-only, however, so you may still want to change the ext4_error() call to ext4_warning(). The first patch also contains an important fix for the caller of this function, ensuring that it doesn't continue to use the bad bitmap if there is an error. The last patch is also important because it avoids freeing blocks in this group that might get reallocated later.
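
To make the intended behaviour concrete, here is a minimal userspace sketch of the idea: when a sanity check on a group's bitmap fails, warn and quarantine that one group instead of turning the whole filesystem read-only, and then refuse both to allocate from it and to free blocks into it. This is not the kernel code and not what the upstream patches literally do; `struct toy_fs`, `mark_group_corrupt()`, `alloc_block()` and `free_block()` are hypothetical names invented for illustration, and the "double free" check merely stands in for whatever sanity check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NGROUPS          4
#define BLOCKS_PER_GROUP 8

/* Hypothetical per-group state; NOT the real ext4 structures. */
struct group_info {
    uint8_t  bitmap;          /* one bit per block, 1 = in use */
    bool     bitmap_corrupt;  /* set once a sanity check has failed */
    unsigned free_blocks;
};

struct toy_fs {
    struct group_info groups[NGROUPS];
    bool read_only;           /* never set here: the point of the sketch */
};

static void toy_fs_init(struct toy_fs *fs)
{
    for (unsigned i = 0; i < NGROUPS; i++) {
        fs->groups[i].bitmap = 0;
        fs->groups[i].bitmap_corrupt = false;
        fs->groups[i].free_blocks = BLOCKS_PER_GROUP;
    }
    fs->read_only = false;
}

/* Warning-style handling: log, quarantine the group, stay read-write. */
static void mark_group_corrupt(struct toy_fs *fs, unsigned group)
{
    assert(group < NGROUPS);
    struct group_info *g = &fs->groups[group];

    if (g->bitmap_corrupt)
        return;
    fprintf(stderr, "warning: group %u bitmap corrupt, skipping it\n", group);
    g->bitmap_corrupt = true;
    g->free_blocks = 0;       /* stop advertising its blocks as free */
}

/* Returns a global block number, or -1 if no usable group has space. */
static long alloc_block(struct toy_fs *fs)
{
    for (unsigned grp = 0; grp < NGROUPS; grp++) {
        struct group_info *g = &fs->groups[grp];

        if (g->bitmap_corrupt)
            continue;         /* caller must not keep using a bad bitmap */
        for (unsigned bit = 0; bit < BLOCKS_PER_GROUP; bit++) {
            if (!(g->bitmap & (1u << bit))) {
                g->bitmap |= (uint8_t)(1u << bit);
                g->free_blocks--;
                return (long)grp * BLOCKS_PER_GROUP + bit;
            }
        }
    }
    return -1;
}

/* Returns true if the block was freed, false if the request was skipped. */
static bool free_block(struct toy_fs *fs, long block)
{
    assert(block >= 0 && block < (long)NGROUPS * BLOCKS_PER_GROUP);
    unsigned grp = (unsigned)(block / BLOCKS_PER_GROUP);
    unsigned bit = (unsigned)(block % BLOCKS_PER_GROUP);
    struct group_info *g = &fs->groups[grp];

    if (g->bitmap_corrupt)
        return false;         /* never free into a quarantined group */
    if (!(g->bitmap & (1u << bit))) {
        /* Freeing a block that is already free: sanity check failed. */
        mark_group_corrupt(fs, grp);
        return false;
    }
    g->bitmap &= (uint8_t)~(1u << bit);
    g->free_blocks++;
    return true;
}
```

In this sketch a failed check costs only the space of the affected group: later allocations fall through to the next healthy group, and `read_only` is never set. The trade-off is that the quarantined group stays unusable until an offline check repairs its bitmap.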