Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Version: Lustre 2.7.0
-
None
-
3
-
9223372036854775807
Description
We had two OSS nodes and three different OSTs crash with "bitmap corrupted" messages.
Apr 3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659 corrupted: 32768 blocks free in bitmap, 0 - in gd
Apr 3 18:38:16 nbp1-oss6 kernel:
Apr 3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.
Apr 3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs (dm-42): Remounting filesystem read-only
Apr 3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660 corrupted: 32768 blocks free in bitmap, 0 - in gd
These errors were on two different backend RAID devices. Noteworthy items:
1. The filesystem was more than 90% full, and about half of the data had been deleted.
2. The OSTs are formatted with "-E packed_meta_blocks=1".
Attachments
- bt.2017-07-26-02.48.00 (765 kB)
- bt.2017-07-26-12.08.43 (808 kB)
- foreach.out (736 kB)
- mballoc.c (145 kB)
- ost258.dumpe2fs.after.fsck.gz (34.46 MB)
- syslog.gp270808.error.gz (13.37 MB)
- vmcore-dmesg.txt (512 kB)
Activity
Patch 28550 will take effect before 28566, so if 28550 is applied, then 28566 is meaningless. But 28550 may do more than the necessary fixes, and I am afraid of potential side effects.
It is interesting to know that. Because 28489 is just a debug patch, I cannot imagine how it can resolve your issue. It may be because your system has jumped over the groups with the "BLOCK_UNINIT" flag and zero free blocks in the GDP. If that is true, then applying 28566 will not show more benefit. Since your system is running stably, you can replace the patches with 28566 when it gets 'corrupted' next time.
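For illustration of the "jumped over" behaviour described above, here is a minimal, self-contained C sketch. It is not the ldiskfs/mballoc source; the struct and names are invented. The point is only that an mballoc-style allocator can skip a group whose descriptor claims zero free blocks before the bitmap is ever read, so a stale zero count in an uninitialized group stays hidden:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct group_desc {
	uint16_t bg_free_blocks_count;	/* free blocks recorded in the GDP */
	uint16_t bg_flags;		/* e.g. BLOCK_UNINIT would be set here */
};

/* Pre-check done before the bitmap is read: a group advertising zero
 * free blocks is skipped, so the bitmap/GDP mismatch is never noticed
 * until something (reading mb_groups, a fuller filesystem) forces the
 * bitmap to be loaded and verified. */
static bool group_worth_scanning(const struct group_desc *gd)
{
	return gd->bg_free_blocks_count != 0;
}

int main(void)
{
	/* The situation described above: uninitialized group, stale zero count. */
	struct group_desc gd = { .bg_free_blocks_count = 0, .bg_flags = 0x2 };
	printf("group: %s\n", group_worth_scanning(&gd) ? "scanned" : "skipped");
	return 0;
}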
The filesystem is stable with the workaround patch (28489). Can we run with this patch for some time without any underlying filesystem issues? Or should we replace it with 28566 ASAP?
I did a build with #28566 and #28550 yesterday. For testing purposes, do these two conflict?
I will undo #28550, but if these two do not conflict, we can do testing with the builds I did yesterday.
Never mind. I just did another build with #28550 pulled out.
mhanafi, I have to say that this issue may be related to improper bitmap consistency verification in our ldiskfs patch, which does not handle the flex_bg case. I made a patch https://review.whamcloud.com/28566 to handle the related issues. Would you please try it (no other former patches are needed)? Thanks!
Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/28566
Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8332a30959750c603bc572db1fcde8bc92f82a40
Here is part of dmesg. The high rate of messages caused the root drive's SCSI device to reset, but all but one server recovered. I had to turn the printk log level down to get the last one to recover.
LDISKFS-fs warning (device dm-33): ldiskfs_init_block_bitmap: Set free blocks as 32768 for group 262310
LDISKFS-fs warning (device dm-33): ldiskfs_init_block_bitmap: Set free blocks as 32768 for group 262311
LDISKFS-fs warning (device dm-33): ldiskfs_init_block_bitmap: Set free blocks as 32768 for group 262312
LDISKFS-fs warning (device dm-33): ldiskfs_init_block_bitmap: Set free blocks as 32768 for group 262313
LDISKFS-fs warning (device dm-33): ldiskfs_init_block_bitmap: Set free blocks as 32768 for group 262314
LNet: 12178:0:(lib-move.c:1487:lnet_parse_put()) Dropping PUT from 12345-10.149.2.156@o2ib313 portal 28 match 1575300167923792 offset 0 length 520: 4
LNet: 12178:0:(lib-move.c:1487:lnet_parse_put()) Skipped 978380 previous similar messages
sd 0:0:1:0: attempting task abort! scmd(ffff880af433e0c0)
sd 0:0:1:0: [sdb] CDB: Write(10): 2a 00 00 a0 08 08 00 00 08 00
scsi target0:0:1: handle(0x000a), sas_address(0x4433221102000000), phy(2)
scsi target0:0:1: enclosure_logical_id(0x50030480198f7e01), slot(2)
scsi target0:0:1: enclosure level(0x0000), connector name( ^C)
sd 0:0:1:0: task abort: SUCCESS scmd(ffff880af433e0c0)
sd 0:0:1:0: attempting task abort! scmd(ffff880a64ab46c0)
sd 0:0:1:0: [sdb] CDB: Write(10): 2a 00 00 e0 08 08 00 00 08 00
scsi target0:0:1: handle(0x000a), sas_address(0x4433221102000000), phy(2)
scsi target0:0:1: enclosure_logical_id(0x50030480198f7e01), slot(2)
scsi target0:0:1: enclosure level(0x0000), connector name( ^C)
sd 0:0:1:0: task abort: SUCCESS scmd(ffff880a64ab46c0)
sd 0:0:1:0: attempting task abort! scmd(ffff880b21cec180)
sd 0:0:1:0: [sdb] CDB: Write(10): 2a 00 00 c0 08 08 00 00 08 00
scsi target0:0:1: handle(0x000a), sas_address(0x4433221102000000), phy(2)
LDISKFS-fs (dm-23): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-34): mounted filesystem with ordered data mode. quota=on. Opts:
mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-29): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-18): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: nbp2-OST0081: Not available for connect from 10.151.43.107@o2ib (not set up)
Lustre: Skipped 3 previous similar messages
Lustre: nbp2-OST0081: Not available for connect from 10.151.29.130@o2ib (not set up)
Lustre: Skipped 113 previous similar messages
Lustre: nbp2-OST0081: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
Lustre: nbp2-OST0081: Will be in recovery for at least 2:30, or until 14441 clients reconnect
Lustre: nbp2-OST0081: Denying connection for new client 35b99837-9505-fc4d-270f-f2d1ca30372d (at 10.151.30.176@o2ib), waiting for all 14441 known clients (44 recovered, 1 in progress, and 0 evicted) to recover in 5:10
Here is /var/log/messages
Aug 11 17:58:25 nbp2-oss10 kernel: LNet: 12075:0:(lib-move.c:1487:lnet_parse_put()) Dropping PUT from 12345-10.151.30.120@o2ib portal 28 match 1575477031778096 offset 0 length 520: 4
Aug 11 17:58:25 nbp2-oss10 kernel: LNet: 12075:0:(lib-move.c:1487:lnet_parse_put()) Skipped 1037319 previous similar messages
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-30):
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-28): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-31): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-18): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-21): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-19): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-22): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-20): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-26): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-33): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-23): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:35 nbp2-oss10 kernel: LDISKFS-fs (dm-32): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:40 nbp2-oss10 kernel: LDISKFS-fs (dm-34): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:40 nbp2-oss10 kernel: LDISKFS-fs (dm-24): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:40 nbp2-oss10 kernel: LDISKFS-fs (dm-25): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:40 nbp2-oss10 kernel:
Aug 11 18:03:41 nbp2-oss10 kernel: LDISKFS-fs (dm-29):
Aug 11 18:03:41 nbp2-oss10 kernel: LDISKFS-fs (dm-35): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:41 nbp2-oss10 kernel: mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:49 nbp2-oss10 kernel: LDISKFS-fs (dm-27): mounted filesystem with ordered data mode. quota=on. Opts:
Aug 11 18:03:50 nbp2-oss10 kernel: LustreError: 137-5: nbp2-OST0009_UUID: not available for connect from 10.151.50.143@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
Aug 11 18:03:50 nbp2-oss10 kernel: LustreError: Skipped 314 previous similar messages
Aug 11 18:03:51 nbp2-oss10 kernel: Lustre: nbp2-OST00d1: Not available for connect from 10.151.9.177@o2ib (not set up)
Aug 11 18:03:51 nbp2-oss10 kernel: Lustre: Skipped 11 previous similar messages
Aug 11 18:03:51 nbp2-oss10 kernel: LustreError: 137-5: nbp2-OST0009_UUID: not available for connect from 10.151.8.85@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
Aug 11 18:03:51 nbp2-oss10 kernel: LustreError: Skipped 3632 previous similar messages
Aug 11 18:03:51 nbp2-oss10 kernel: Lustre: nbp2-OST00d1: Not available for connect from 10.151.50.241@o2ib (not set up)
Aug 11 18:03:51 nbp2-oss10 kernel: Lustre: Skipped 180 previous similar messages
Aug 11 18:03:52 nbp2-oss10 kernel: LustreError: 137-5: nbp2-OST0135_UUID: not available for connect from 10.151.48.113@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
Aug 11 18:03:52 nbp2-oss10 kernel: LustreError: Skipped 6273 previous similar messages
Aug 11 18:03:52 nbp2-oss10 kernel: Lustre: nbp2-OST00d1: Not available for connect from 10.151.7.158@o2ib (not set up)
Aug 11 18:03:52 nbp2-oss10 kernel: Lustre: Skipped 402 previous similar messages
Aug 11 18:03:52 nbp2-oss10 kernel: Lustre: nbp2-OST00d1: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
Aug 11 18:03:52 nbp2-oss10 kernel: Lustre: nbp2-OST00d1: Will be in recovery for at least 2:30, or until 14452 clients reconnect
mhanafi, it looks different from the original one. Would you please show me more logs (dmesg, /var/log/messages) about the latest corruption? Is the system still accessible after the above warning?
Applied the new patch. After a full fsck, mounting the OSTs resulted in this many block groups getting corrected:
---------------- service603 ----------------
4549 dm-33):
---------------- service604 ----------------
4425 dm-32):
---------------- service606 ----------------
4658 dm-29):
---------------- service610 ----------------
4631 dm-33):
---------------- service611 ----------------
4616 dm-28):
---------------- service616 ----------------
4652 dm-35):
---------------- service617 ----------------
4501 dm-21):
---------------- service619 ----------------
4657 dm-25):
We need to rate limit the warnings.
I used systemtap to catch one of these bad groups and dump out the ldiskfs_group_desc struct.
mballoc.c:826: first_group: 274007 bg_free_blocks_count_hi: 0 bg_block_bitmap_hi: 0 bg_free_blocks_count_lo: 0
mballoc.c:826: $desc {.bg_block_bitmap_lo=328727, .bg_inode_bitmap_lo=930551, .bg_inode_table_lo=3450424, .bg_free_blocks_count_lo=0, .bg_free_inodes_count_lo=128, .bg_used_dirs_count_lo=0, .bg_flags=7, .bg_reserved=[...], .bg_itable_unused_lo=128, .bg_checksum=55256, .bg_block_bitmap_hi=0, .bg_inode_bitmap_hi=0, .bg_inode_table_hi=0, .bg_free_blocks_count_hi=0, .bg_free_inodes_count_hi=0, .bg_used_dirs_count_hi=0, .bg_itable_unused_hi=0, .bg_reserved2=[...]}
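To help read the dump, here is a minimal, self-contained userspace sketch of how the lo/hi halves combine into the descriptor's free-block count and how bg_flags=7 decodes. The struct is trimmed down, and the flag values are the conventional ext4/ldiskfs ones, stated here as an assumption rather than quoted from the tree:

#include <stdint.h>
#include <stdio.h>

#define BG_INODE_UNINIT 0x0001	/* conventional ext4/ldiskfs values (assumed) */
#define BG_BLOCK_UNINIT 0x0002
#define BG_INODE_ZEROED 0x0004

struct group_desc_dump {	/* subset of the fields printed above */
	uint16_t bg_free_blocks_count_lo;
	uint16_t bg_free_blocks_count_hi;	/* only meaningful with 64-bit descriptors */
	uint16_t bg_flags;
};

static uint32_t free_blocks(const struct group_desc_dump *d, int is_64bit)
{
	uint32_t n = d->bg_free_blocks_count_lo;
	if (is_64bit)
		n |= (uint32_t)d->bg_free_blocks_count_hi << 16;
	return n;
}

int main(void)
{
	/* Values from the dump of group 274007 above. */
	struct group_desc_dump d = { .bg_free_blocks_count_lo = 0,
				     .bg_free_blocks_count_hi = 0,
				     .bg_flags = 7 };
	printf("free blocks per GDP: %u\n", free_blocks(&d, 1));
	printf("flags: %s%s%s\n",
	       (d.bg_flags & BG_INODE_UNINIT) ? "INODE_UNINIT " : "",
	       (d.bg_flags & BG_BLOCK_UNINIT) ? "BLOCK_UNINIT " : "",
	       (d.bg_flags & BG_INODE_ZEROED) ? "INODE_ZEROED" : "");
	return 0;
}

In other words, the descriptor claims zero free blocks while the group is still marked BLOCK_UNINIT, which is exactly the mismatch the on-disk bitmap check later trips over.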
It also seems odd that dumpe2fs can produce different results for unused block groups. Sometimes it will show block_bitmap != free_blocks, and other times it will be OK.
---
In ldiskfs_valid_block_bitmap() I don't understand this:
if (LDISKFS_HAS_INCOMPAT_FEATURE(sb, LDISKFS_FEATURE_INCOMPAT_FLEX_BG)) {
	/* with FLEX_BG, the inode/block bitmaps and itable
	 * blocks may not be in the group at all
	 * so the bitmap validation will be skipped for those groups
	 * or it has to also read the block group where the bitmaps
	 * are located to verify they are set.
	 */
	return 1;
}
We have flex_bg enabled; would this apply to us?
For the OSTs that are prone to the bitmap errors, running "cat /proc/fs/ldiskfs/dm*/mb_groups" will reproduce the errors.
Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/balloc.c, 179): ldiskfs_init_block_bitmap: #24877: init the group 270808 of total groups 583584: group_blocks 32768, free_blocks 32768, free_blocks_in_gdp 0, ret 32768
The logs show that ldiskfs_init_block_bitmap() initialized the bitmap, but the free blocks count in the group descriptor is still zero, which caused the subsequent ldiskfs_mb_check_ondisk_bitmap() failure. Currently, I cannot say it is corruption; it looks more like a logic issue. The patch will set the free block count based on the real free bits in the bitmap. It may not be the perfect solution, but we can try it and see whether it resolves your trouble or not.
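As a rough illustration of the approach described above (a self-contained userspace sketch, not the actual ldiskfs patch; BLOCKS_PER_GROUP and the helper are made up for the example): after generating the bitmap for an uninitialized group, count the real free bits and write that back over the stale zero in the descriptor, so the later consistency check agrees with the bitmap:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCKS_PER_GROUP 32768u

static uint32_t count_free_bits(const uint8_t *bitmap, uint32_t nbits)
{
	uint32_t i, nfree = 0;
	for (i = 0; i < nbits; i++)
		if (!(bitmap[i >> 3] & (1u << (i & 7))))
			nfree++;
	return nfree;
}

int main(void)
{
	uint8_t bitmap[BLOCKS_PER_GROUP / 8];
	uint32_t free_in_gdp = 0;		/* stale value from the group descriptor */

	memset(bitmap, 0, sizeof(bitmap));	/* freshly generated bitmap: all blocks free */

	uint32_t real_free = count_free_bits(bitmap, BLOCKS_PER_GROUP);
	if (real_free != free_in_gdp) {
		printf("resync: bitmap says %u free, gd says %u -> updating gd\n",
		       real_free, free_in_gdp);
		free_in_gdp = real_free;	/* per the description above */
	}
	printf("gd now records %u free blocks\n", free_in_gdp);
	return 0;
}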
Sorry, I mistyped the patch number. I wanted to say it is stable with 28550.