Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-12755

CPU soft lockup on mkfs.lustre

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Upstream
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Labels:
    • Environment:
      Red Hat 7.7 on VMware
      Red Hat 7.7 on HPE ProLiant DL380 Gen10
      Red Hat 7.7 on HPE Synergy 480 Gen10

      Description

      After successfully creating packages for Red Hat 7.7

      (e.g. lustre-2.12.57_35_g55a7e2d-1.el7.x86_64.rpm)

      I get CPU soft lockups when trying to create an MGS with LDISKFS backend.

      NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [mkfs.lustre:31220]

      More details from log:

      Sep  6 10:41:00 mgs1 kernel: Call Trace:Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd73365>] queued_spin_lock_slowpath+0xb/0xf
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd81ad0>] _raw_spin_lock+0x20/0x30
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b865e2e>] igrab+0x1e/0x60
      Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06bd88b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
      Sep  6 10:41:00 mgs1 kernel: [<ffffffffc06c091d>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b13d>] generic_shutdown_super+0x6d/0x100
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b5b7>] kill_block_super+0x27/0x70
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84b91e>] deactivate_locked_super+0x4e/0x70
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b84c0a6>] deactivate_super+0x46/0x60
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86abff>] cleanup_mnt+0x3f/0x80
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b86ac92>] __cleanup_mnt+0x12/0x20
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b6c1c0b>] task_work_run+0xbb/0xe0
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9b62cc65>] do_notify_resume+0xa5/0xc0
      Sep  6 10:41:00 mgs1 kernel: [<ffffffff9bd8c23b>] int_signal+0x12/0x17
      Sep  6 10:41:00 mgs1 kernel: Code: 47 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 66 90 b9 01 00 00 00 8b 17 85 d2 74 0d 83 fa 03 74 08 f3 90 <8b> 17 85 d2 75 f3 89 d0 f0 0f b1 0f 39 c2 75 e3 5d 66 90 c3 0f

      I also tried to go for an MDS/MGS pair on the DL380 but mkfs.lustre got stuck the same way 

      as seen on VMware.

        Attachments

          Activity

            People

            • Assignee:
              yujian Jian Yu
              Reporter:
              kazinczy Tamas Kazinczy
            • Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: