Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.1, Lustre 2.11.0
    • Lustre 2.7.0
    • None
    • 3

    Description

      We had 2 OSSes and 3 different OSTs crash with "bitmap corrupted" messages.

      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: 32768 blocks free in bitmap, 0 - in gd
      Apr  3 18:38:16 nbp1-oss6 kernel: 
      Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs (dm-42): Remounting filesystem read-only
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: 32768 blocks free in bitmap, 0 - in gd
      
      
      

      These errors were on 2 different backend RAID devices. Noteworthy items:
      1. The filesystem was more than 90% full, and half of the data was deleted.
      2. OSTs are formatted with " -E packed_meta_blocks=1 "

      Attachments

        1. bt.2017-07-26-02.48.00
          765 kB
        2. bt.2017-07-26-12.08.43
          808 kB
        3. foreach.out
          736 kB
        4. mballoc.c
          145 kB
        5. ost258.dumpe2fs.after.fsck.gz
          34.46 MB
        6. ost258.dumpe2fs.after.readonly.gz
          34.44 MB
        7. syslog.gp270808.error.gz
          13.37 MB
        8. vmcore-dmesg.txt
          512 kB

        Issue Links

          Activity

            [LU-9410] on-disk bitmap corrupted

            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/balloc.c, 179): ldiskfs_init_block_bitmap: #24877: init the group 270808 of total groups 583584: group_blocks 32768, free_blocks 32768, free_blocks_in_gdp 0, ret 32768

            The logs show that ldiskfs_init_block_bitmap() initialized the bitmap, but the free block count in the group descriptor is still zero, which caused the subsequent ldiskfs_mb_check_ondisk_bitmap() failure. For now I cannot say it is corruption; it looks more like a logic issue. The patch will set the free block count based on the actual free bits in the bitmap. It may not be the perfect solution, but we can try it and see whether it resolves your trouble.

            yong.fan nasf (Inactive) added a comment
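To illustrate the mismatch described in the comment above, here is a minimal shell sketch that scans a syslog file for the ldiskfs_init_block_bitmap debug lines and lists every group whose bitmap free count differs from the count in the group descriptor. The function name find_gd_mismatch is hypothetical, and the pattern assumes the debug-patch line format exactly as quoted in this ticket; other builds may log differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): report groups where the freshly
# initialized bitmap and the group descriptor disagree on the free count.
# Assumes the debug line format quoted in this ticket:
#   ... ldiskfs_init_block_bitmap: #N: init the group G of total groups T:
#       group_blocks B, free_blocks F, free_blocks_in_gdp D, ret R
find_gd_mismatch() {
    # $1: syslog file to scan
    sed -n 's/.*ldiskfs_init_block_bitmap: .*init the group \([0-9]*\) of total groups [0-9]*: group_blocks [0-9]*, free_blocks \([0-9]*\), free_blocks_in_gdp \([0-9]*\),.*/\1 \2 \3/p' "$1" |
        awk '$2 != $3 { print "group " $1 ": bitmap=" $2 " gd=" $3 }' |
        sort -u
}
```

Run against the attached syslog after decompressing it, this should print one line per affected group, which makes it easier to see whether the zero count in the group descriptor is confined to a range of groups.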

            Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/28550
            Subject: LU-9410 ldiskfs: handle unmatched bitmap
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0a4199ad21c5ac23a4a4e7e07847610ad8ec7994

            gerrit Gerrit Updater added a comment

            Got block group debug logs showing the corruption. The block group is #270808. I will attach the full log file to the case: syslog.gp270808.error.gz

            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: load group 270808
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1032): ldiskfs_mb_init_group: init group 270808
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/balloc.c, 179): ldiskfs_init_block_bitmap: #24877: init the group 270808 of total groups 583584: group_blocks 32768, free_blocks 32768, free_blocks_in_gdp 0, ret 32768
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 927): ldiskfs_mb_init_cache: put bitmap for group 270808 in page 541616/0
            Aug 14 18:37:14 nbp2-oss20 kernel: on-disk bitmap for group 270808 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 14 18:37:14 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 270808
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: load group 270808
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1032): ldiskfs_mb_init_group: init group 270808
            Aug 14 18:37:14 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 927): ldiskfs_mb_init_cache: put bitmap for group 270808 in page 541616/0
            Aug 14 18:37:14 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 270808 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 14 18:37:14 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: load group 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1032): ldiskfs_mb_init_group: init group 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 927): ldiskfs_mb_init_cache: put bitmap for group 270808 in page 541616/0
            Aug 14 18:37:15 nbp2-oss20 kernel: on-disk bitmap for group 270808 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 14 18:37:15 nbp2-oss20 kernel: Error in loading buddy information for 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: load group 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1032): ldiskfs_mb_init_group: init group 270808
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 927): ldiskfs_mb_init_cache: put bitmap for group 270808 in page 541616/0
            Aug 14 18:37:15 nbp2-oss20 kernel: on-disk bitmap for group 270808 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 14 18:37:15 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: Error in loading buddy information for 270808
            Aug 14 18:37:17 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1103): ldiskfs_mb_load_buddy: load group 270808
            Aug 14 18:37:17 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 1032): ldiskfs_mb_init_group: init group 270808
            Aug 14 18:37:17 nbp2-oss20 kernel: (/tmp/rpmbuild-lustre-jlan-PYDDD1xV/BUILD/lustre-2.7.3/ldiskfs/mballoc.c, 927): ldiskfs_mb_init_cache: put bitmap for group 270808 in page 541616/0
            
            

             

             

            mhanafi Mahmoud Hanafi added a comment - edited

            With the new build, are we supposed to have mballoc-debug in /proc or /sys? The find command doesn't find anything.

            Never mind, I figured this out. We need to mount debugfs for it to show up.

            mhanafi Mahmoud Hanafi added a comment - edited
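For readers who hit the same empty find result, a small shell sketch of the debugfs check follows. The function name debugfs_mounted is hypothetical; /sys/kernel/debug is the conventional debugfs mount point, and the exact location of mballoc-debug underneath it is not stated in this ticket, so the usage comment searches for it rather than assuming a path.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a debugfs instance appears in a mounts table.
# $1: mounts file to inspect; defaults to /proc/mounts, where the third
#     whitespace-separated field is the filesystem type.
debugfs_mounted() {
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Usage (as root):
#   debugfs_mounted || mount -t debugfs none /sys/kernel/debug
#   find /sys/kernel/debug -name mballoc-debug
```

Checking the mount table first avoids mounting debugfs a second time on hosts where it is already available.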

            LU-7114 allows the system to continue without failing immediately when it finds a corrupted bitmap, but the corruption is still there. I suggest applying the patch https://review.whamcloud.com/#/c/28489/; it will give us more information via the mb operations trace.

            yong.fan nasf (Inactive) added a comment

            We haven't put the debug patch 28489 in place yet, but we are now running with the "LU-7114" patch. It has already found bitmap errors.

            Aug 12 01:05:43 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:43 nbp2-oss20 kernel: 
            Aug 12 01:05:43 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:43 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:43 nbp2-oss20 kernel: 
            Aug 12 01:05:43 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:44 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:45 nbp2-oss20 kernel: 
            Aug 12 01:05:45 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:45 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:45 nbp2-oss20 kernel: 
            Aug 12 01:05:45 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:46 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:47 nbp2-oss20 kernel: 
            Aug 12 01:05:47 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:47 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:47 nbp2-oss20 kernel: 
            Aug 12 01:05:47 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:49 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:50 nbp2-oss20 kernel: 
            Aug 12 01:05:50 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:50 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:50 nbp2-oss20 kernel: 
            Aug 12 01:05:50 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:53 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:54 nbp2-oss20 kernel: 
            Aug 12 01:05:54 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:54 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:54 nbp2-oss20 kernel: 
            Aug 12 01:05:54 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:59 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:59 nbp2-oss20 kernel: 
            Aug 12 01:05:59 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:05:59 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:05:59 nbp2-oss20 kernel: 
            Aug 12 01:05:59 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:06:05 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:06:05 nbp2-oss20 kernel: 
            Aug 12 01:06:05 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:06:05 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 01:06:05 nbp2-oss20 kernel: 
            Aug 12 01:06:05 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_load_buddy: Error in loading buddy information for 275790
            Aug 12 01:06:12 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 275790 corrupted: 32768 blocks free in bitmap, 0 - in gd
            
            
            
            
            

            Some time later

            Aug 12 04:05:12 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 276684 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:05:12 nbp2-oss20 kernel: 
            Aug 12 04:05:12 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 276685 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:07:56 nbp2-oss20 pcp-pmie[5801]: High 1-minute load average 354load@nbp2-oss20
            Aug 12 04:07:56 nbp2-oss20 - in gd
            Aug 12 04:07:56 nbp2-oss20 kernel: 
            Aug 12 04:07:56 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 304861 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:07:56 nbp2-oss20 kernel: 
            Aug 12 04:07:56 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 304862 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:07:56 nbp2-oss20 kernel: 
            Aug 12 04:07:56 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 304863 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:07:56 nbp2-oss20 kernel: 
            Aug 12 04:07:56 nbp2-oss20 kernel: LDISKFS-fs warning (device dm-21): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 304864 corrupted: 32768 blocks free in bitmap, 0 - in gd
            Aug 12 04:07:56 nbp2-oss20 kernel: 
            .....
            

            It has marked 6727 unique groups as bad for dm-21 (ost319).

             

            mhanafi Mahmoud Hanafi added a comment
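A count of unique bad groups like the one reported above can be reproduced from syslog with standard tools. The sketch below is one possible way, not necessarily how the figure in the comment was obtained; the function name count_bad_groups is hypothetical, and it only matches lines that carry the "(device ...)" prefix, so warnings logged without that prefix are not counted.

```shell
#!/bin/sh
# Hypothetical helper: count distinct block groups that
# ldiskfs_mb_check_ondisk_bitmap flagged for one device.
# $1: syslog file, $2: device name as it appears in the log (e.g. dm-21)
count_bad_groups() {
    grep -F "(device $2): ldiskfs_mb_check_ondisk_bitmap:" "$1" |
        # " *" also accepts messages with no space before "corrupted"
        sed -n 's/.*on-disk bitmap for group \([0-9]*\) *corrupted.*/\1/p' |
        sort -u |
        wc -l
}

# Usage:
#   count_bad_groups /var/log/messages dm-21
```

Dropping the final wc -l stage prints the group numbers themselves, which helps when checking whether the flagged groups are contiguous.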

            https://review.whamcloud.com/28489 is refreshed, please try again. Thanks!

            yong.fan nasf (Inactive) added a comment

            mballoc.c attached.

            jaylan Jay Lan (Inactive) added a comment

            Please attach the source file ldiskfs/mballoc.c; you can find it in your compile directory. Thanks!

            yong.fan nasf (Inactive) added a comment

            find /proc /sys -name mballoc-debug

            has no output.

            mhanafi Mahmoud Hanafi added a comment

            What is the output of the following command once ldiskfs.ko has been loaded with insmod?

            find /proc /sys -name mballoc-debug
            
            yong.fan nasf (Inactive) added a comment

            People

              yong.fan nasf (Inactive)
              mhanafi Mahmoud Hanafi