  1. Lustre
  2. LU-10837

no bitmap check if block bitmap is uninitialized


    Description

      The following commit:

      LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled

      pointed out that we need to be careful when the block bitmap is uninitialized.

      However, it only skips the check when @free_gdp is 0. At a customer site, we observed cases like the following:

      Mar 13 13:59:29 oss2c105 kernel: LDISKFS-fs error (device sfa0002): ldiskfs_mb_check_ondisk_bitmap:3605: comm ll_ost_io01_057: on-disk bitmap for group 420934corrupted: 32768 blocks free in bitmap, 559 - in gd\x0a
      Mar 13 13:59:29 oss2c105 kernel: LDISKFS-fs error (device sfa0002): ldiskfs_mb_check_ondisk_bitmap:3605: comm ll_ost_io00_056: on-disk bitmap for group 420938corrupted: 32768 blocks free in bitmap, 680 - in gd\x0a
      
      

      I also did a simple reformat and ran dumpe2fs on the OST:

      ...
      Group 158: (Blocks 5177344-5210111) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
        Checksum 0x0617, unused inodes 1920
        Block bitmap at 1182 (bg #0 + 1182), Inode bitmap at 1342 (bg #0 + 1342)
        Inode table at 40512-40751 (bg #1 + 7744)
        32768 free blocks, 1920 free inodes, 0 directories, 1920 unused inodes
        Free blocks: 5177344-5210111
        Free inodes: 303361-305280
      Group 159: (Blocks 5210112-5242
      ....
      
      

      On further checking, I did not see any group with a free-block counter of 0.

      See the comment for ext4_free_clusters_after_init:

          /* Return the number of free blocks in a block group.  It is used when
           * the block bitmap is uninitialized, so we can't just count the bits
           * in the bitmap. */

          So the extra check we added here is wrong when the block group's
          bitmap is uninitialized, since it only looks at the bitmap.
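
To illustrate why counting bits is misleading here, the following is a minimal user-space sketch, not the ldiskfs code. It assumes that the buffer for an uninitialized block bitmap reads as all zeros; `count_free_bits` and `bitmap_matches_gd` are hypothetical helper names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the clear (free) bits among the first nbits of a block bitmap. */
static unsigned int count_free_bits(const uint8_t *bitmap, unsigned int nbits)
{
    unsigned int free_bits = 0;

    assert(bitmap != NULL);
    for (unsigned int i = 0; i < nbits; i++) {
        /* A clear bit means the block is free. */
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            free_bits++;
    }
    return free_bits;
}

/*
 * Hypothetical helper: returns 1 if the count derived from the bitmap
 * agrees with the free count recorded in the group descriptor.
 */
static int bitmap_matches_gd(const uint8_t *bitmap, unsigned int nbits,
                             unsigned int free_in_gd)
{
    return count_free_bits(bitmap, nbits) == free_in_gd;
}
```

Under that assumption, a zero-filled bitmap for a 32768-block group counts as 32768 free blocks, so comparing it against a descriptor count such as 559 reports a mismatch even though nothing on disk is damaged.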

       

          Further, looking at the code that clears EXT4_BG_BLOCK_UNINIT, the
          kernel reinitializes free_clusters_count when it clears the flag, so
          an extra check for uninitialized block bitmaps doesn't make much sense.

       

          Let's skip the block bitmap check if EXT4_BG_BLOCK_UNINIT is set;
          whatever free count the group descriptor records in that state
          cannot be trusted.

            People

              wangshilong Wang Shilong (Inactive)