Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.13.0, Lustre 2.12.3
    • Upstream
    • Red Hat 7.7 on VMware
      Red Hat 7.7 on HPE ProLiant DL380 Gen10
      Red Hat 7.7 on HPE Synergy 480 Gen10

    Description

      After successfully creating packages for Red Hat 7.7 (e.g. lustre-2.12.57_35_g55a7e2d-1.el7.x86_64.rpm), I get CPU soft lockups when trying to create an MGS with an LDISKFS backend.

      NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [mkfs.lustre:31220]
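
      For triage across several nodes, the watchdog line above can be picked apart mechanically. The following is only a minimal sketch: the regular expression is modelled on the single message quoted in this report, and the names SOFT_LOCKUP_RE and parse_soft_lockup are hypothetical helpers, not part of Lustre or the kernel.

```python
import re

# Modelled on the one watchdog message quoted in this report; other kernel
# versions may word the message differently (assumption).
SOFT_LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return CPU, stuck time, command and PID from a soft lockup line, or None."""
    match = SOFT_LOCKUP_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match.group("cpu")),
        "seconds": int(match.group("secs")),
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
    }


line = "NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [mkfs.lustre:31220]"
info = parse_soft_lockup(line)
print(info)
```

      The command name is what ties the lockup to mkfs.lustre rather than to an unrelated task on the same CPU.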

      More details from log:

      Sep  6 10:41:00 mgs1 kernel: Call Trace:
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd73365>] queued_spin_lock_slowpath+0xb/0xf
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd81ad0>] _raw_spin_lock+0x20/0x30
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60
      Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
      Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06c091d>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b13d>] generic_shutdown_super+0x6d/0x100
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b5b7>] kill_block_super+0x27/0x70
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b91e>] deactivate_locked_super+0x4e/0x70
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84c0a6>] deactivate_super+0x46/0x60
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86abff>] cleanup_mnt+0x3f/0x80
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86ac92>] __cleanup_mnt+0x12/0x20
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b6c1c0b>] task_work_run+0xbb/0xe0
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b62cc65>] do_notify_resume+0xa5/0xc0
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd8c23b>] int_signal+0x12/0x17
      Sep  6 10:41:00 mgs1 kernel: Code: 47 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 66 90 b9 01 00 00 00 8b 17 85 d2 74 0d 83 fa 03 74 08 f3 90 <8b> 17 85 d2 75 f3 89 d0 f0 0f b1 0f 39 c2 75 e3 5d 66 90 c3 0f
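
      Read from the innermost frame outwards, the trace shows the superblock shutdown path (ldiskfs_put_super calling ldiskfs_quota_off) spinning on a lock taken in igrab. A small sketch like the one below can list the frames and the module each belongs to. It assumes the "[<address>] symbol+0xoffset/0xsize [module]" layout used in this log; FRAME_RE and parse_frames are hypothetical names, and the "kernel" label for frames without a module tag is a convention chosen for this example.

```python
import re

# Frame layout as it appears in this log: "[<addr>] symbol+0xoff/0xsize [module]".
# The trailing "[module]" tag is absent for symbols built into the core kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (function, module) pairs in the order they appear in the log."""
    return [
        (m.group("func"), m.group("module") or "kernel")
        for m in FRAME_RE.finditer(log_text)
    ]


# Four frames copied from the trace in this report.
SAMPLE = """\
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd81ad0>] _raw_spin_lock+0x20/0x30
Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06c091d>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
"""

frames = parse_frames(SAMPLE)
for func, module in frames:
    print(f"{module:8s} {func}")
```

      Filtering on the module column makes it easy to see that the only non-core frames in the stuck path come from ldiskfs.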

      I also tried to set up an MDS/MGS pair on the DL380, but mkfs.lustre got stuck the same way as seen on VMware.

        Activity

          [LU-12755] CPU soft lockup on mkfs.lustre
          pjones Peter Jones added a comment -

          Great. Thanks for confirming kazinczy

          kazinczy Tamas Kazinczy (Inactive) added a comment -

          I can confirm that mkfs.lustre for patchless LDISKFS from b2_12 works fine for me now.

          pjones Peter Jones made changes -
          Link Original: This issue is related to JFC-27 [ JFC-27 ]
          pjones Peter Jones made changes -
          Labels Original: LTS12 hang mkfs.lustre New: hang mkfs.lustre
          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.12.3 [ 14418 ]

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36270/
          Subject: LU-12755 ldiskfs: fix project quota unpon unpatched kernel
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set:
          Commit: 820e374624a584ec0c0a326ec96bf0abeb50cf40

          pjones Peter Jones made changes -
          Labels Original: hang mkfs.lustre New: LTS12 hang mkfs.lustre
          pjones Peter Jones made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          pjones Peter Jones added a comment -

          Landed for 2.13

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36203/
          Subject: LU-12755 ldiskfs: fix project quota unpon unpatched kernel
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: d780f15a2d63c8bde5ae6345aed85b4b44904fb5

          People

            Assignee: yujian Jian Yu
            Reporter: kazinczy Tamas Kazinczy (Inactive)
            Votes: 0
            Watchers: 10
