Lustre / LU-4410

sanityn test 40a: BUG: soft lockup - CPU#0 stuck for 67s! [ptlrpcd_0:2892]


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • None
    • Lustre 2.6.0, Lustre 2.4.2, Lustre 2.5.2, Lustre 2.5.3

    • Environment: Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2 RC2)
      Distro/Arch: RHEL6.4/x86_64
      FSTYPE=zfs
    • 3
    • 12104

    Description

      sanityn test 40a hung and hit the following failure on one client:

      21:36:52:Lustre: DEBUG MARKER: == sanityn test 40a: pdirops: create vs others ================ 21:34:49 (1387604089)
      21:36:53:BUG: soft lockup - CPU#0 stuck for 67s! [ptlrpcd_0:2892]
      21:36:53:Modules linked in: lustre(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode 8139too 8139cp mii virtio_balloon i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      21:36:53:CPU 0 
      21:36:53:Modules linked in: lustre(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U)
      21:36:53:BUG: soft lockup - CPU#1 stuck for 67s! [ll_sa_4070:4079]
      21:36:53:Modules linked in: lustre(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode 8139too 8139cp mii virtio_balloon i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      21:36:53:CPU 1 
      21:36:53:Modules linked in: lustre(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode 8139too 8139cp mii virtio_balloon i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      21:36:53:
      21:36:53:Pid: 4079, comm: ll_sa_4070 Not tainted 2.6.32-358.23.2.el6.x86_64 #1 Red Hat KVM
      21:36:53:RIP: 0010:[<ffffffff81510aae>]  [<ffffffff81510aae>] _spin_lock+0x1e/0x30
      21:36:53:RSP: 0018:ffff88006c26bda0  EFLAGS: 00000206
      21:36:53:RAX: 0000000000000002 RBX: ffff88006c26bda0 RCX: ffff88007cfd8800
      21:36:54:RDX: 0000000000000000 RSI: ffff88006c25fec0 RDI: ffff88007a737ec0
      21:36:54:RBP: ffffffff8100bb8e R08: ffff88007d860e68 R09: 00000000fffffffe
      21:36:54:R10: 0000000000000000 R11: 0000000000000001 R12: ffff88006c26bd80
      21:36:54:R13: ffff88006d6c9000 R14: 0000000000001000 R15: 0000000000000000
      21:36:54:FS:  00007fb227702700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
      21:36:54:CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      21:36:54:CR2: 00007f7bbff64000 CR3: 000000006c183000 CR4: 00000000000006e0
      21:36:54:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      21:36:54:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      21:36:54:Process ll_sa_4070 (pid: 4079, threadinfo ffff88006c26a000, task ffff88006bd25500)
      21:36:54:Stack:
      21:36:54: ffff88006c26be10 ffffffffa0abb680 ffff88007a737bf8 ffff88006e9501c8
      21:36:54:<d> 0000000000000000 ffff88007a737b00 ffff88007caa01c0 ffff88006bf57200
      21:36:54:<d> ffff88006c26bdf0 ffff88007a7ba800 ffff88007a7ba970 ffff88007a737e80
      21:36:54:Call Trace:
      21:36:54: [<ffffffffa0abb680>] ? ll_post_statahead+0x50/0xa80 [lustre]
      21:36:55: [<ffffffffa0abf8c8>] ? ll_statahead_thread+0x268/0xfa0 [lustre]
      21:36:55: [<ffffffff81063990>] ? default_wake_function+0x0/0x20
      21:36:55: [<ffffffffa0abf660>] ? ll_statahead_thread+0x0/0xfa0 [lustre]
      21:36:55: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      21:36:55: [<ffffffffa0abf660>] ? ll_statahead_thread+0x0/0xfa0 [lustre]
      21:36:55: [<ffffffffa0abf660>] ? ll_statahead_thread+0x0/0xfa0 [lustre]
      21:36:55: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      

      Maloo report: https://maloo.whamcloud.com/test_sets/7cca784a-6b4b-11e3-99ba-52540035b04c
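
When triaging a console log like the one above, it can help to pull out just the soft-lockup reports (CPU, stall duration, thread name, PID). The following is a minimal sketch, not part of the Lustre test suite: the function name `find_soft_lockups` and the regular expression are assumptions based only on the line format visible in this log.

```python
import re

# Matches console lines of the form seen in this report, e.g.
#   21:36:53:BUG: soft lockup - CPU#0 stuck for 67s! [ptlrpcd_0:2892]
# The pattern is an assumption derived from this log, not a kernel-defined grammar.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup report found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return events


# Lines copied from the console log in this report.
sample = """\
21:36:53:BUG: soft lockup - CPU#0 stuck for 67s! [ptlrpcd_0:2892]
21:36:53:CPU 0
21:36:53:BUG: soft lockup - CPU#1 stuck for 67s! [ll_sa_4070:4079]
"""

for event in find_soft_lockups(sample):
    print(event)
```

Applied to this log, the sketch would report two stalled threads, `ptlrpcd_0` on CPU#0 and `ll_sa_4070` on CPU#1. Only the second has its register dump and call trace (spinning in `_spin_lock` under `ll_post_statahead`) captured above.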


            People

              Assignee: wc-triage WC Triage
              Reporter: yujian Jian Yu