[LU-1026] ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 23828 corrupted Created: 24/Jan/12 Updated: 14/Jun/18 Resolved: 04/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.x (1.8.0 - 1.8.5) |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | p4hc, p4j |
| Environment: | lustre-1.8.4 |
| Severity: | 3 |
| Rank (Obsolete): | 10118 |
| Description |
|
Last week, one of our customers got corruption messages from ldiskfs, and the OSS then remounted that OST read-only.
Jan 19 18:25:46 lustre-oss-0-0 kernel: [8145936.472484] LDISKFS-fs error (device dm-5): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 23828corrupted: 4190 blocks free in bitmap, 4189 - in gd |
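For reference, the message above compares two numbers for one block group: the free (zero) bits counted in the on-disk block bitmap, and the free-block count stored in the group descriptor ("gd"). The following is a minimal user-space Python sketch of that comparison. It is only a model of the idea, not the ldiskfs code; the function names and the synthetic bitmap are made up for illustration, and 32768 blocks per group is an assumption.

```python
def count_free_blocks(bitmap: bytes, blocks_per_group: int) -> int:
    """Count zero bits (free blocks) among the first blocks_per_group bits."""
    free = 0
    for bit in range(blocks_per_group):
        # Bit n of the bitmap lives in byte n // 8, at bit position n % 8.
        if not (bitmap[bit // 8] >> (bit % 8)) & 1:
            free += 1
    return free


def check_ondisk_bitmap(bitmap: bytes, blocks_per_group: int, gd_free: int) -> bool:
    """Return True when the bitmap and the group descriptor agree."""
    return count_free_blocks(bitmap, blocks_per_group) == gd_free


# Synthetic group (assumed 32768 blocks) whose last 4190 blocks are free.
BLOCKS_PER_GROUP = 32768
bits = bytearray(b"\xff" * (BLOCKS_PER_GROUP // 8))
for bit in range(BLOCKS_PER_GROUP - 4190, BLOCKS_PER_GROUP):
    bits[bit // 8] &= ~(1 << (bit % 8)) & 0xFF

# A descriptor claiming 4189 free blocks disagrees with the bitmap by one.
print(check_ondisk_bitmap(bytes(bits), BLOCKS_PER_GROUP, 4189))
```

With the numbers from the log (4190 counted in the bitmap, 4189 in the descriptor), the comparison fails, which is the condition that triggers the error and the read-only remount.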
| Comments |
| Comment by Cliff White (Inactive) [ 24/Jan/12 ] |
|
Were you able to repair this with fsck? |
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
Yes, and it was fixed, but we still don't know why this problem happened. |
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
another OSS's log file. |
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
In addition, they got the same messages on two OSTs. (I will attach another /var/log/messages later.) |
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
e2fsck's log for OST0000 |
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
e2fsck's log for OST0005. |
| Comment by Peter Jones [ 24/Jan/12 ] |
|
Hi Andreas,
Could you please have a look at this one?
Thanks,
Peter |
| Comment by Andreas Dilger [ 24/Jan/12 ] |
|
The error message in messages-another-oss.gz reports that group 41776 was corrupted (blocks 1368915968-1368948735), and the e2fsck output reports block 1368942246, which falls in that range, so these are related.
I recall seeing bugs like this with Lustre 1.8 that were fixed by adding proper locking for the group descriptors and by fixing the extent cache code, but I'm not sure whether those patches are relevant to your version of ldiskfs.
What kernel version is being used? Is this ext3-based ldiskfs or ext4-based ldiskfs?
The useful error messages from the log, for future reference.
messages.gz:
messages-another-oss.gz: |
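The group-to-block-range arithmetic used in this comparison can be checked with a short Python sketch. It assumes 32768 blocks per group (one 4 KiB bitmap block holds 8 * 4096 bits) and a first data block of 0; neither value is stated in the ticket, but both reproduce the range quoted above. The helper names are illustrative only.

```python
BLOCKS_PER_GROUP = 32768  # assumption: 4 KiB blocks, 8 * 4096 bits per bitmap block
FIRST_DATA_BLOCK = 0      # assumption: usual value for 4 KiB-block filesystems


def group_block_range(group: int) -> tuple:
    """Return (first_block, last_block) covered by a block group."""
    first = FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP
    return first, first + BLOCKS_PER_GROUP - 1


def block_to_group(block: int) -> int:
    """Return the block group that contains a given block number."""
    return (block - FIRST_DATA_BLOCK) // BLOCKS_PER_GROUP


print(group_block_range(41776))    # range of the group named in the kernel log
print(block_to_group(1368942246))  # group of the block named by e2fsck
```

Under these assumptions, the block reported by e2fsck maps back to group 41776, the same group named in the kernel message.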
| Comment by Shuichi Ihara (Inactive) [ 24/Jan/12 ] |
|
Hi Andreas,
They are using lustre-1.8.4 with ext4-based ldiskfs. The patches landed in 1.8.1.1. Another similar problem was also filed in Bugzilla as #22091, but that was also fixed in 1.8.2. If |
| Comment by Thomas Roth [ 30/Jan/12 ] |
|
We hit this today on a lustre-1.8.4, ext3-based ldiskfs OST:
Running e2fsck 1.41.10.sun2 on that OST shredded the file system, though. Perhaps specifying '-yf' on the command line was not a good idea, but I wouldn't know what else to use. The OST now doesn't mount (also not as '-t ldiskfs') with |
| Comment by Thomas Roth [ 30/Jan/12 ] |
|
Kernel log messages |
| Comment by Thomas Roth [ 30/Jan/12 ] |
|
e2fsck output |
| Comment by Blake Caldwell [ 15/Feb/13 ] |
|
We just experienced a similar issue with Lustre 1.8.8 on RHEL5.9 (2.6.18-308.4.1.el5). Could you comment on whether this is related to this issue?
The message in the kernel log:
On running e2fsck (e2fsprogs-1.42.3.wc1-0redhat) we had the following:
Free blocks count wrong for group #19519 (14336, counted=3547).
Free blocks count wrong (2687836218, counted=2687825429). |
| Comment by Blake Caldwell [ 16/Feb/13 ] |
|
So ldiskfs just tripped up on the bitmap from group #19519, which is exactly what e2fsck said. It looks like e2fsck fixed the free block accounting and we are back to a consistent state. I don't see any relation to the duplicate messages reported by the reporter, so this was unrelated. |
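As a quick cross-check of the e2fsck numbers quoted in the previous comment, the correction to the filesystem-wide free block count equals the correction for group #19519, which is consistent with the accounting error being confined to that one group. A small Python sketch of the arithmetic (the variable and function names are illustrative):

```python
# Values copied from the e2fsck output quoted above.
GROUP_RECORDED, GROUP_COUNTED = 14336, 3547
TOTAL_RECORDED, TOTAL_COUNTED = 2687836218, 2687825429


def correction(recorded: int, counted: int) -> int:
    """Number of blocks by which the recorded free count was too high."""
    return recorded - counted


group_delta = correction(GROUP_RECORDED, GROUP_COUNTED)
total_delta = correction(TOTAL_RECORDED, TOTAL_COUNTED)
print(group_delta, total_delta)
```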
| Comment by Gerrit Updater [ 12/Dec/14 ] |
|
Shilong Wang (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/13043 |
| Comment by Wang Shilong (Inactive) [ 12/Dec/14 ] |
|
We hit several bugs like these under RHEL5:
Nov 10 10:44:37 t2s007011 kernel: LDISKFS-fs error (device dm-10): ldiskfs_valid_block_bitmap: Invalid block bitmap - block_group = 100095, block = 3279912962
I think the following upstream commit might be related:
ext4: fix race when setting bitmap_uptodate flag
In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
Note that if we don't completely load the bitmap from disk, we might get a garbage block bitmap.
Could someone comment on this? The same issue was reported on:
Best Regards, |
| Comment by Peter Jones [ 12/Dec/14 ] |
|
Bobijam/Lai,
Could you please review this proposed fix?
Thanks,
Peter |
| Comment by Gerrit Updater [ 16/Dec/14 ] |
|
Shilong Wang (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/13080 |
| Comment by Shuichi Ihara (Inactive) [ 27/Feb/15 ] |
|
We hit the same issue again. This time the debug patch (http://review.whamcloud.com/#change,1107) was enabled, and we got the following additional messages:
Feb 14 18:33:51 t2s007001 kernel: ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>LDISKFS-fs error (device dm-0): ldiskfs_valid_block_bitmap: Invalid block bitmap - group_first_block = 3412328448, block_bitmap = 3412328448, inode_bitmap = 3412328449 inode_table_bitmap = 3412328450, inode_table_block_per_group =512, next_zero_bit = 512, block_group = 104136, block = 3412328450
Feb 14 18:33:51 t2s007001 kernel: Aborting journal on device dm-0-8.
Feb 14 18:33:51 t2s007001 kernel: LDISKFS-fs (dm-0): Remounting filesystem read-only
We also applied patch http://review.whamcloud.com/13080, but it didn't help. |
| Comment by Peter Jones [ 27/Feb/15 ] |
|
Hongchao is looking into this issue |
| Comment by Hongchao Zhang [ 27/Feb/15 ] |
|
Hi Shuichi,
According to the additional debug messages, it is the part of the block bitmap covering the inode tables that is corrupted: the 512th bit (counting from zero) in the block bitmap is zero!
The debug patch also prints the content of the block bitmap. Could you please post it here? Thanks!
+ printk(KERN_CRIT"block bitmap of block_group %d : \n", block_group);
+ for (i = 0; i < (sb->s_blocksize >> 3); i++) {
+ 	printk(KERN_CRIT"%016lx ", *(((long int*)bh->b_data) + i));
+ 	if (i && ((i % 4) == 0))
+ 		printk(KERN_CRIT"\n");
+ } |
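The inode-table part of the bitmap validity check can be modeled in user space as follows. This is a hedged Python sketch, not the ldiskfs source: `find_next_zero_bit` here is a plain reimplementation of the kernel helper's contract (return the first zero bit in `[start, size)`, or `size` if none), and `inode_table_bits_valid` is a made-up name. The block numbers are the ones from the debug message above.

```python
def find_next_zero_bit(bitmap: bytes, size: int, start: int) -> int:
    """Return the first zero bit in [start, size), or size if there is none."""
    for bit in range(start, size):
        if not (bitmap[bit // 8] >> (bit % 8)) & 1:
            return bit
    return size


def inode_table_bits_valid(bitmap: bytes, group_first_block: int,
                           inode_table_block: int, itb_per_group: int) -> bool:
    """Every block occupied by the inode table must be marked in use."""
    offset = inode_table_block - group_first_block
    end = offset + itb_per_group
    return find_next_zero_bit(bitmap, end, offset) >= end


# Values from the debug message: the inode table starts 2 blocks into the group.
GROUP_FIRST_BLOCK = 3412328448
INODE_TABLE_BLOCK = 3412328450
ITB_PER_GROUP = 512

good = bytearray(b"\xff" * 128)   # first 1024 bits all in use (synthetic)
bad = bytearray(good)
bad[512 // 8] &= ~(1 << (512 % 8)) & 0xFF  # clear bit 512, as in next_zero_bit = 512

print(inode_table_bits_valid(bytes(good), GROUP_FIRST_BLOCK, INODE_TABLE_BLOCK, ITB_PER_GROUP))
print(inode_table_bits_valid(bytes(bad), GROUP_FIRST_BLOCK, INODE_TABLE_BLOCK, ITB_PER_GROUP))
```

With an offset of 2 and 512 inode-table blocks, bits 2 through 513 must all be set, so a zero at bit 512 fails the check.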
| Comment by Shuichi Ihara (Inactive) [ 27/Feb/15 ] |
|
Yes, I found the following messages in the log. The full log is attached.
Feb 14 18:33:51 t2s007001 kernel: block bitmap of block_group 104136 |
| Comment by Andreas Dilger [ 27/Feb/15 ] |
|
It looks like the on-disk bitmap has the right bit set:
Feb 14 18:33:51 t2s007001 kernel: block bitmap of block_group 104136 :
Feb 14 18:33:51 t2s007001 kernel: ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>
Feb 14 18:33:51 t2s007001 kernel: ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>ffffffffffffffff <2>
Feb 14 18:33:51 t2s007001 kernel: ffffffffffffffff <2>ffffffffffffffff <2>00007fffffffffff <2>ffffffffffffffff <2>
Feb 14 18:33:51 t2s007001 kernel: LDISKFS-fs error (device dm-0): ldiskfs_valid_block_bitmap: Invalid block bitmap - group_first_block = 3412328448, block_bitmap = 3412328448, inode_bitmap = 3412328449 inode_table_bitmap = 3412328450, inode_table_block_per_group = 512, next_zero_bit = 512, block_group = 104136, block = 3412328450
Each block of ffff is 64 bits. The output is badly formatted, since there appear to be 5x 64 bits on the first line, but the code matches this, so it looks correct. |
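Both observations can be reproduced with a small Python model of the dump. It is only a sketch: the `words` list holds just the 13 words visible in the excerpt above (a full bitmap block has more), the helper names are invented, and the bit numbering assumes a little-endian 64-bit machine, where bitmap bit n is bit n % 64 of word n // 64.

```python
def dump_lines(words):
    """Group words as the debug loop does: newline after index i when i and i % 4 == 0."""
    lines, current = [], []
    for i, word in enumerate(words):
        current.append(word)
        if i and i % 4 == 0:
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines


def bit_is_set(words, bit: int) -> bool:
    """Test bitmap bit `bit`, assuming little-endian 64-bit words."""
    return ((words[bit // 64] >> (bit % 64)) & 1) == 1


# The 13 words visible in the excerpt: word 11 is 00007fffffffffff.
ALL_ONES = 0xFFFFFFFFFFFFFFFF
words = [ALL_ONES] * 11 + [0x00007FFFFFFFFFFF, ALL_ONES]

print([len(line) for line in dump_lines(words)])  # words per printed line
print(bit_is_set(words, 512))                     # the bit the check complained about
```

Because the newline is only emitted when `i` is non-zero and divisible by 4, the first line carries words 0 to 4 and every later line carries four. Bit 512 is bit 0 of word 8, which reads as all ones in the dump, so the dumped buffer does have that bit set.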
| Comment by Andreas Dilger [ 27/Feb/15 ] |
|
I guess one option to mitigate this bug, if the root cause cannot be found, is to just stop allocating from this group by setting the free block count to zero and to move on to another group. It may even be that this is done in the upstream kernel? The error would need to be changed to a warning so that the filesystem is not remounted read-only. |
| Comment by Wang Shilong (Inactive) [ 28/Feb/15 ] |
|
Hi Andreas,
Looking at the upstream kernel, there is the following commit:
commit 163a203ddb36c36d4a1c942aececda0cc8d06aa7
ext4: mark block group as corrupt on block bitmap error
This commit lets us avoid further allocation/deallocation in a corrupt block group. |
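The mitigation discussed here (flag the group, warn, and let the allocator move on instead of aborting) can be illustrated with a toy Python allocator. This is a sketch of the idea only; the `Group` class, `mark_corrupt`, and `pick_group` are hypothetical names and do not correspond to functions in ext4 or ldiskfs.

```python
from dataclasses import dataclass


@dataclass
class Group:
    free_blocks: int
    corrupt: bool = False


def mark_corrupt(groups, index, warnings):
    """Flag a group as corrupt and log a warning instead of raising a fatal error."""
    groups[index].corrupt = True
    warnings.append(f"group {index}: block bitmap corrupt, skipping for allocation")


def pick_group(groups, needed):
    """Return the index of the first usable group with enough free blocks, or None."""
    for index, group in enumerate(groups):
        if not group.corrupt and group.free_blocks >= needed:
            return index
    return None


groups = [Group(free_blocks=100), Group(free_blocks=50)]
warnings = []
mark_corrupt(groups, 0, warnings)
print(pick_group(groups, 10))  # the allocator skips group 0
```

The point of the design is that one bad group costs some capacity until the next fsck, rather than taking the whole filesystem read-only.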
| Comment by Wang Shilong (Inactive) [ 28/Feb/15 ] |
|
We got the following messages when running fsck:
e2fsck 1.42.9.wc1 (24-Feb-2014)
Free inodes count wrong (937631607, counted=937484879).
work0-OST0000: ***** FILE SYSTEM WAS MODIFIED *****
Notice that the free blocks counts differ here, and it seems fsck could not fix the bitmap errors (no?), it seems |
| Comment by Hongchao Zhang [ 28/Feb/15 ] |
|
There is a ticket to track the problem of the wrong "Free blocks" and "Free inodes" counts in the upstream kernel patch
https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/commit/?id=163a203ddb36c36d4a1c942aececda0cc8d06aa7
I wonder whether this issue is caused by a temporary failure somewhere or by a race (since the buffer isn't locked) for the following content printed
--- linux-2.6.18-348.1.1.el5.x86_64.orig/fs/ext4/balloc.c 2014-11-24 05:32:30.894982225 +0800
+++ linux-2.6.18-348.1.1.el5.x86_64/fs/ext4/balloc.c 2014-11-24 05:50:11.287001297 +0800
@@ -275,6 +275,13 @@ static int ext4_valid_block_bitmap(struc
 		/* good bitmap for inode tables */
 		return 1;
 
+	smp_mb();
+	next_zero_bit = ext4_find_next_zero_bit(bh->b_data,
+				offset + EXT4_SB(sb)->s_itb_per_group,
+				offset);
+	if (next_zero_bit >= offset + EXT4_SB(sb)->s_itb_per_group)
+		return 1;
+
 err_out:
 	ext4_error(sb, "Invalid block bitmap - block_group = %d, block = %llu",
 			block_group, bitmap_blk);
@@ -350,7 +357,11 @@ ext4_read_block_bitmap(struct super_bloc
 				block_group, bitmap_blk);
 		return NULL;
 	}
+
+	lock_buffer(bh);
 	ext4_valid_block_bitmap(sb, desc, block_group, bh);
+	unlock_buffer(bh);
+
 	/*
 	 * file system mounted not to panic on error,
 	 * continue with corrupt bitmap |
| Comment by Andreas Dilger [ 01/Mar/15 ] |
|
Note that the block and inode summary in the superblock is only updated during a clean unmount. It is not updated during normal usage since the per-group totals are used instead. |
| Comment by Wang Shilong (Inactive) [ 05/Mar/15 ] |
|
Hi Hongchao,
Could you provide a formal patch (with your suggestion)? |
| Comment by Gerrit Updater [ 06/Mar/15 ] |
|
Shilong Wang (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/13991 |
| Comment by Gerrit Updater [ 06/Mar/15 ] |
|
Shilong Wang (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/13992 |
| Comment by Hongchao Zhang [ 07/Mar/15 ] |
|
Hi Shilong,
Sorry for the delayed response, and thank you very much for creating the corresponding patch! |
| Comment by Hongchao Zhang [ 10/Mar/15 ] |
|
Hi Shilong,
Did you manage to test with the new patch, and what was the result? Thanks! |
| Comment by Wang Shilong (Inactive) [ 10/Mar/15 ] |
|
Hi Hongchao,
This bug was hard to reproduce; it happened in customers' machine several months, so we are going to apply the patch,
| Comment by Gerrit Updater [ 30/Sep/15 ] |
|
Wang Shilong (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/16679 |
| Comment by Gerrit Updater [ 04/Dec/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16679/ |
| Comment by Joseph Gmitter (Inactive) [ 04/Dec/15 ] |
|
Landed for 2.8 |
| Comment by Andreas Dilger [ 16/Jun/16 ] |
|
Shilong,
ext4: make bitmap corruption not fatal
There can be occasional reasons for bitmap problems, which are
detected by ext4_mb_check_ondisk_bitmap() and cause the
filesystem to be remounted read-only due to ext4_error():
EXT4-fs error (device /dev/dm-6-8): ext4_mb_generate_buddy:755:
group 294, block 0: block bitmap and bg descriptor inconsistent:
20180 vs 20181 free clusters
Aborting journal on device dm-6-8.
EXT4-fs (dm-6): Remounting filesystem read-only
This might be caused by some ext4 internal bugs, which are addressed
separately. This patch makes ext4 more robust by the following changes:
- ext4_read_block_bitmap() printed error, so do not call ext4_error() again
- mark all bits in bitmap used so that it will not be used for allocation
- mark block group corrupt, use ext4_warning() instead of ext4_error()
Tested by following script:
TEST_DEV="/dev/sdb"
TEST_MNT="/mnt/ext4"
mkdir -p $TEST_MNT
mkfs.ext4 -F $TEST_DEV
mount -t ext4 $TEST_DEV $TEST_MNT
dd if=/dev/zero of=$TEST_MNT/largefile oflag=direct bs=10485760 count=200
umount $TEST_MNT
dd if=/dev/zero of=$TEST_DEV oflag=direct bs=4096 seek=641 count=10
mount -t ext4 $TEST_DEV $TEST_MNT
rm -f $TEST_MNT/largefile
dd if=/dev/zero of=$TEST_MNT/largefile oflag=direct bs=10485760 count=200 &&
echo "FILESYSTEM still usable after bitmaps corrupts happen"
umount $TEST_MNT
e2fsck $TEST_DEV -y
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1026
Reviewed-on: http://review.whamcloud.com/16679
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
|
| Comment by Mahmoud Hanafi [ 05/Apr/17 ] |
|
We may have hit this bug after a large amount of data was deleted from a nearly full filesystem and data was then written again. We are going to see if we can reproduce it on our test filesystem.
Can we get a backport to 2.7.2fe?
Thanks, |