Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.10.1, Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      We had 2 OSSes and 3 different OSTs crash with "bitmap corrupted" messages.

      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: 32768 blocks free in bitmap, 0 - in gd
      Apr  3 18:38:16 nbp1-oss6 kernel: 
      Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs (dm-42): Remounting filesystem read-only
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: 32768 blocks free in bitmap, 0 - in gd
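
When there are many of these messages in a syslog, it can help to extract the device, group number, and the two free-block counts from each one. The following is a minimal sketch, assuming the message text is formatted exactly as in the excerpt above. The function name `parse_bitmap_errors` and the regular expression are illustrative and are not part of Lustre or ldiskfs.

```python
import re

# Pattern mirrors the message text in the excerpt above. In that excerpt the
# group number is printed with no space before "corrupted", so the whitespace
# there is matched as optional.
BITMAP_ERR = re.compile(
    r"LDISKFS-fs error \(device (?P<device>[^)]+)\): "
    r"ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group "
    r"(?P<group>\d+)\s*corrupted: "
    r"(?P<bitmap_free>\d+) blocks free in bitmap, "
    r"(?P<gd_free>\d+) - in gd"
)


def parse_bitmap_errors(lines):
    """Return one dict per bitmap-corruption message found in `lines`."""
    found = []
    for line in lines:
        match = BITMAP_ERR.search(line)
        if match is None:
            continue  # not a bitmap-corruption message
        found.append({
            "device": match.group("device"),
            "group": int(match.group("group")),
            "bitmap_free": int(match.group("bitmap_free")),
            "gd_free": int(match.group("gd_free")),
        })
    return found


# The two error lines from the excerpt above, copied verbatim.
sample = [
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
]

errors = parse_bitmap_errors(sample)
for err in errors:
    print(err["device"], err["group"], err["bitmap_free"], err["gd_free"])
```

For both messages the bitmap reports 32768 free blocks while the group descriptor ("gd") reports 0, on consecutive groups 245659 and 245660 of device dm-42.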
      
      
      

      These errors were on 2 different backend RAID devices. Noteworthy items:
      1. The filesystem was over 90% full, and half of the data was deleted.
      2. The OSTs are formatted with "-E packed_meta_blocks=1".

      Attachments

        1. foreach.out
          736 kB
          Mahmoud Hanafi
        2. vmcore-dmesg.txt
          512 kB
          Mahmoud Hanafi
        3. bt.2017-07-26-02.48.00
          765 kB
          Mahmoud Hanafi
        4. bt.2017-07-26-12.08.43
          808 kB
          Mahmoud Hanafi
        5. ost258.dumpe2fs.after.readonly.gz
          34.44 MB
          Mahmoud Hanafi
        6. ost258.dumpe2fs.after.fsck.gz
          34.46 MB
          Mahmoud Hanafi
        7. mballoc.c
          145 kB
          Jay Lan
        8. syslog.gp270808.error.gz
          13.37 MB
          Mahmoud Hanafi

            People

              Assignee: yong.fan nasf (Inactive)
              Reporter: mhanafi Mahmoud Hanafi
              Votes: 0
              Watchers: 13
