Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.10.1, Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Severity: 3

    Description

      We had 2 OSSes and 3 different OSTs crash with "bitmap corrupted" messages.

      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: 32768 blocks free in bitmap, 0 - in gd
      Apr  3 18:38:16 nbp1-oss6 kernel: 
      Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs (dm-42): Remounting filesystem read-only
      Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: 32768 blocks free in bitmap, 0 - in gd
      
      
      

      These errors were on 2 different backend RAID devices. Noteworthy items:
      1. The filesystem was over 90% full and half of the data was deleted.
      2. The OSTs are formatted with "-E packed_meta_blocks=1".
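
      When triaging a large syslog for this failure, it can help to pull out the device, group number, and the two free-block counts from each message. The following is a minimal sketch, not part of the ticket: the regular expression is written against the two lines quoted above only, and the names BITMAP_ERR, BitmapError, and parse_bitmap_error are hypothetical. Other kernel or ldiskfs versions may format the message differently.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the two messages quoted in this ticket (an assumption,
# not an official format). The quoted lines show no space between the group
# number and "corrupted"; \s* tolerates either form.
BITMAP_ERR = re.compile(
    r"LDISKFS-fs error \(device (?P<device>[^)]+)\): "
    r"ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group "
    r"(?P<group>\d+)\s*corrupted: "
    r"(?P<bitmap_free>\d+) blocks free in bitmap, "
    r"(?P<gd_free>\d+) - in gd"
)


class BitmapError(NamedTuple):
    device: str       # device-mapper name, e.g. "dm-42"
    group: int        # block group number
    bitmap_free: int  # free blocks counted from the on-disk bitmap
    gd_free: int      # free blocks recorded in the group descriptor


def parse_bitmap_error(line: str) -> Optional[BitmapError]:
    """Return the parsed fields, or None if the line is not a bitmap error."""
    m = BITMAP_ERR.search(line)
    if m is None:
        return None
    return BitmapError(
        device=m["device"],
        group=int(m["group"]),
        bitmap_free=int(m["bitmap_free"]),
        gd_free=int(m["gd_free"]),
    )


# Lines copied from the description above.
SAMPLE = [
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245659corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
    "Apr  3 18:38:16 nbp1-oss6 kernel: Aborting journal on device dm-3.",
    "Apr  3 18:38:16 nbp1-oss6 kernel: LDISKFS-fs error (device dm-42): "
    "ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 245660corrupted: "
    "32768 blocks free in bitmap, 0 - in gd",
]

errors = [e for e in map(parse_bitmap_error, SAMPLE) if e is not None]
for e in errors:
    print(f"{e.device} group {e.group}: bitmap={e.bitmap_free} gd={e.gd_free}")
```

      Lines that do not match (such as the journal abort message) are skipped, so the same function can be mapped over a whole syslog file to list the affected groups per device.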

      Attachments

        1. bt.2017-07-26-02.48.00
          765 kB
        2. bt.2017-07-26-12.08.43
          808 kB
        3. foreach.out
          736 kB
        4. mballoc.c
          145 kB
        5. ost258.dumpe2fs.after.fsck.gz
          34.46 MB
        6. ost258.dumpe2fs.after.readonly.gz
          34.44 MB
        7. syslog.gp270808.error.gz
          13.37 MB
        8. vmcore-dmesg.txt
          512 kB

          Activity

            [LU-9410] on-disk bitmap corrupted

            gerrit Gerrit Updater added a comment:
            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28765/
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: 27f5b8b16416b04a561d0b0121860e2a5188be4a

            gerrit Gerrit Updater added a comment:
            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28765
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 53d836b1e5d255558639fe8e4eae78a87a176d04

            gerrit Gerrit Updater added a comment:
            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28566/
            Subject: LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5506c15a65b3eebb9f15000105e6eb7c02742a10

            yong.fan nasf (Inactive) added a comment:
            Yes, master also needs the patch 28566.

            jaylan Jay Lan (Inactive) added a comment:
            Do I need this patch for 2.10.0?

            yong.fan nasf (Inactive) added a comment:
            I think there may be something that can be improved in mke2fs, not e2fsck.

            mhanafi Mahmoud Hanafi added a comment:
            Does this patch require any changes to e2fsck?

            yong.fan nasf (Inactive) added a comment:
            mhanafi, thanks for the update.

            mhanafi Mahmoud Hanafi added a comment:
            Update: we applied https://review.whamcloud.com/28566 on Friday and the filesystem has been stable.

            yong.fan nasf (Inactive) added a comment:
            Sorry, I mistyped the patch number. I wanted to say it is stable with 28550.

            Then it is reasonable. As I explained above, 28550 may do more than the necessary fixes. But since it runs stably, you can keep it until the next 'corruption'.

            People

              Assignee: yong.fan nasf (Inactive)
              Reporter: mhanafi Mahmoud Hanafi
              Votes: 0
              Watchers: 13
