Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.10.1, Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • None
    • 3

    Description

      Two OSSes crashed, involving three different OSTs, with "bitmap corrupted" messages.

      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: 32768 blocks free in bitmap, 0 - in gd
      Apr  3 18:38:16 nbp1-oss6 kernel: 
      Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs (dm-42): Remounting filesystem read-only
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: 32768 blocks free in bitmap, 0 - in gd
      
      
      

      These errors occurred on 2 different backend RAID devices. Noteworthy items:
      1. The filesystem was over 90% full, and half of the data was deleted.
      2. OSTs are formatted with "-E packed_meta_blocks=1".
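
      When the same message shows up on several OSTs, it can help to pull the device, group number, and the two free-block counts out of syslog for comparison. The following is a minimal sketch, not part of the original report: the regular expression is inferred only from the two log lines quoted in the description, and the function name parse_bitmap_errors is hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in the description. The kernel
# message prints no space between the group number and "corrupted", so the
# whitespace there is optional.
BITMAP_ERR = re.compile(
    r"LDISKFS-fs error \(device (?P<dev>[^)]+)\): "
    r"ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group "
    r"(?P<group>\d+)\s*corrupted: (?P<bitmap_free>\d+) blocks free in bitmap, "
    r"(?P<gd_free>\d+) - in gd"
)


def parse_bitmap_errors(lines):
    """Return {device: [(group, free_in_bitmap, free_in_gd), ...]}."""
    found = defaultdict(list)
    for line in lines:
        m = BITMAP_ERR.search(line)
        if m:
            found[m["dev"]].append(
                (int(m["group"]), int(m["bitmap_free"]), int(m["gd_free"]))
            )
    return dict(found)


# Sample input copied from the description (one unrelated line included).
sample = [
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
    "Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.",
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
]

for dev, hits in parse_bitmap_errors(sample).items():
    for group, bitmap_free, gd_free in hits:
        print(f"{dev}: group {group}: bitmap={bitmap_free} gd={gd_free}")
```

      Grouping by device makes it easier to see whether the affected groups are adjacent, as they are in the two lines above.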

      Attachments

        1. bt.2017-07-26-02.48.00
          765 kB
          Mahmoud Hanafi
        2. bt.2017-07-26-12.08.43
          808 kB
          Mahmoud Hanafi
        3. foreach.out
          736 kB
          Mahmoud Hanafi
        4. mballoc.c
          145 kB
          Jay Lan
        5. ost258.dumpe2fs.after.fsck.gz
          34.46 MB
          Mahmoud Hanafi
        6. ost258.dumpe2fs.after.readonly.gz
          34.44 MB
          Mahmoud Hanafi
        7. syslog.gp270808.error.gz
          13.37 MB
          Mahmoud Hanafi
        8. vmcore-dmesg.txt
          512 kB
          Mahmoud Hanafi

          Activity

            [LU-9410] on-disk bitmap corrupted

            gerrit Gerrit Updater added a comment -
            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28765/
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: 27f5b8b16416b04a561d0b0121860e2a5188be4a
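
            To illustrate what the reported message compares, here is a rough sketch of a free-count consistency check, together with the skip that the patch subject describes. It is based only on the wording of the error message ("blocks free in bitmap" versus "in gd") and on the subject line; it is not the ldiskfs code, and the function names, the flex_bg parameter, and the 32768-block group size taken from the log are assumptions made for illustration.

```python
def free_blocks_in_bitmap(bitmap: bytes, blocks_per_group: int) -> int:
    """Count clear bits (free blocks) among the first blocks_per_group bits."""
    used = 0
    for i in range(blocks_per_group):
        # A set bit marks the block as in use.
        if (bitmap[i // 8] >> (i % 8)) & 1:
            used += 1
    return blocks_per_group - used


def bitmap_matches_gd(bitmap: bytes, gd_free: int, blocks_per_group: int,
                      flex_bg: bool = False) -> bool:
    """Return True when the bitmap's free count agrees with the descriptor.

    Assumption: with flex_bg set, the comparison is skipped entirely, which
    is how the patch subject reads.
    """
    if flex_bg:
        return True
    return free_blocks_in_bitmap(bitmap, blocks_per_group) == gd_free


# An all-zero bitmap (every block free) against a descriptor claiming 0 free:
# the same shape of mismatch as in the log in the description.
empty = bytes(32768 // 8)
print(bitmap_matches_gd(empty, 0, 32768))                # False
print(bitmap_matches_gd(empty, 0, 32768, flex_bg=True))  # True
```
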

            gerrit Gerrit Updater added a comment -
            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28765
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 53d836b1e5d255558639fe8e4eae78a87a176d04

            gerrit Gerrit Updater added a comment -
            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28566/
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5506c15a65b3eebb9f15000105e6eb7c02742a10

            yong.fan nasf (Inactive) added a comment - Yes, master also needs the patch 28566.

            jaylan Jay Lan (Inactive) added a comment - Do I need this patch for 2.10.0?

            yong.fan nasf (Inactive) added a comment - I think there may be something that can be improved in mke2fs, not e2fsck.

            mhanafi Mahmoud Hanafi added a comment - Does this patch require any changes to e2fsck?

            yong.fan nasf (Inactive) added a comment - mhanafi Thanks for the update.

            mhanafi Mahmoud Hanafi added a comment - Update: we applied https://review.whamcloud.com/28566 on Friday and the filesystem has been stable.

            yong.fan nasf (Inactive) added a comment -
            "Sorry I typed the patch number. I wanted to say it is stable with 28550."
            Then it is reasonable. As I explained above, 28550 may do more than the necessary fixes. But since it runs stably, you can keep it until the next 'corruption'.

            People

              yong.fan nasf (Inactive)
              mhanafi Mahmoud Hanafi
              Votes: 0
              Watchers: 13
