Lustre / LU-7265

replay-single test_70b timeout: NULL pointer dereference in __mutex_lock_slowpath

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.8.0
    • Labels: None
    • Environment: review-dne-part-2 in autotest
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      replay-single test_70b hangs after an MDS console reports a kernel oops:

      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      

      Logs are at https://testing.hpdd.intel.com/test_sets/5b20699c-6c8f-11e5-87fb-5254006e85c2

      From the MDS2, MDS3, MDS4 console:

      14:57:08:BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      14:57:08:IP: [<ffffffff8153a534>] __mutex_lock_slowpath+0x64/0x210
      14:57:08:PGD 0 
      14:57:08:Oops: 0000 [#1] SMP 
      14:57:08:last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/virtio0/block/vda/queue/scheduler
      14:57:09:CPU 1 
      14:57:09:Modules linked in: osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) ldiskfs(U) jbd2 nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs autofs4 ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 microcode serio_raw virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      14:57:09:
      14:57:09:Pid: 20902, comm: ldlm_bl_02 Not tainted 2.6.32-573.3.1.el6_lustre.g00880a0.x86_64 #1 Red Hat KVM
      14:57:09:RIP: 0010:[<ffffffff8153a534>]  [<ffffffff8153a534>] __mutex_lock_slowpath+0x64/0x210
      14:57:09:RSP: 0000:ffff88007958bbe0  EFLAGS: 00010213
      14:57:09:RAX: 00000000ffffffff RBX: ffff88007ac82730 RCX: 0000000000000000
      14:57:09:RDX: 0000000000000000 RSI: ffff88007ac82740 RDI: ffff88007ac82734
      14:57:09:RBP: ffff88007958bc40 R08: 000000000000000a R09: 00000000ffffffff
      14:57:09:R10: 00000000ffffffff R11: 00000000ffffffff R12: ffff88006fc92ab0
      14:57:09:R13: ffff88007ac82734 R14: ffff88007958bbf0 R15: ffff88007ac82738
      14:57:09:FS:  0000000000000000(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
      14:57:09:CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      14:57:09:CR2: 0000000000000008 CR3: 0000000001a8d000 CR4: 00000000000006e0
      14:57:10:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      14:57:10:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      14:57:10:Process ldlm_bl_02 (pid: 20902, threadinfo ffff880079588000, task ffff88006fc92ab0)
      14:57:10:Stack:
      14:57:10: ffff88007958bfd8 ffff88007ac82740 ffff88007958bea0 0000000000000000
      14:57:10:<d> ffff88007958bc60 ffffffffa04b2b61 0000657200000010 ffff88007ac82730
      14:57:10:<d> ffff88007ac82680 0011408400000000 ffff88007958bea0 0000000000000000
      14:57:10:Call Trace:
      14:57:10: [<ffffffffa04b2b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      14:57:10: [<ffffffff8153a08b>] mutex_lock+0x2b/0x50
      14:57:10: [<ffffffffa0da7846>] mgc_blocking_ast+0x536/0x810 [mgc]
      14:57:10: [<ffffffffa07d2b57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
      14:57:10: [<ffffffffa07f148a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
      14:57:10: [<ffffffffa07f60fc>] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
      14:57:10: [<ffffffffa0da73fb>] mgc_blocking_ast+0xeb/0x810 [mgc]
      14:57:10: [<ffffffffa0da7310>] ? mgc_blocking_ast+0x0/0x810 [mgc]
      14:57:10: [<ffffffffa07fa7e0>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
      14:57:10: [<ffffffffa07fb6f4>] ldlm_bl_thread_main+0x484/0x700 [ptlrpc]
      14:57:10: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
      14:57:10: [<ffffffffa07fb270>] ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
      14:57:10: [<ffffffff810a101e>] kthread+0x9e/0xc0
      14:57:11: [<ffffffff8100c28a>] child_rip+0xa/0x20
      14:57:11: [<ffffffff810a0f80>] ? kthread+0x0/0xc0
      14:57:11: [<ffffffff8100c280>] ? child_rip+0x0/0x20
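
The call trace above can be summarized mechanically when comparing it against other autotest failures. The following is a minimal sketch, not part of the original report: the regular expression, the helper names `parse_frames` and `repeated_reliable`, and the convention of labelling frames without a module as "kernel" are all assumptions made for illustration. It treats frames prefixed with `?` as unreliable, as the kernel marks them.

```python
import re
from collections import Counter

# Matches frames such as:
#   [<ffffffffa0da7846>] mgc_blocking_ast+0x536/0x810 [mgc]
#   [<ffffffffa04b2b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# A few frames copied verbatim from the console log in this report.
SAMPLE = """\
14:57:10: [<ffffffffa04b2b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
14:57:10: [<ffffffff8153a08b>] mutex_lock+0x2b/0x50
14:57:10: [<ffffffffa0da7846>] mgc_blocking_ast+0x536/0x810 [mgc]
14:57:10: [<ffffffffa07d2b57>] ldlm_cancel_callback+0x87/0x280 [ptlrpc]
14:57:10: [<ffffffffa07f148a>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
14:57:10: [<ffffffffa07f60fc>] ldlm_cli_cancel+0x9c/0x3e0 [ptlrpc]
14:57:10: [<ffffffffa0da73fb>] mgc_blocking_ast+0xeb/0x810 [mgc]
"""


def parse_frames(text):
    """Return one dict per stack frame found in the console text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line (registers, stack dump, etc.)
        frames.append({
            "func": m.group("func"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            # Assumption: frames without a [module] tag belong to the core kernel.
            "module": m.group("module") or "kernel",
            # Frames printed with a leading "?" are not guaranteed to be real callers.
            "reliable": m.group("unreliable") is None,
        })
    return frames


def repeated_reliable(frames):
    """Names of functions that occur more than once among reliable frames."""
    counts = Counter(f["func"] for f in frames if f["reliable"])
    return [name for name, n in counts.items() if n > 1]


if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        marker = " " if frame["reliable"] else "?"
        print(f'{marker} {frame["module"]:8s} {frame["func"]}+{frame["offset"]:#x}')
    print("re-entered:", repeated_reliable(parse_frames(SAMPLE)))
```

Run against the frames above, the sketch reports `mgc_blocking_ast` as occurring twice among the reliable frames, which matches the trace: the blocking AST is entered from `ldlm_handle_bl_callback` and again through `ldlm_cli_cancel` before `mutex_lock` faults. This only restates what the log shows and does not by itself identify the root cause.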
      

            People

              Assignee: wc-triage WC Triage
              Reporter: jamesanunez James Nunez (Inactive)