Lustre / LU-8592

MDS crashed with ASSERTION( atomic_read(&o->lo_header->loh_ref) > 0 )

    Description

      The error occurred during soak testing of build '20160902' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160902).
      The configuration is:
      • 4 MDS with 1 MDT per MDS, formatted with ldiskfs and configured pairwise in an active-active HA configuration
      • 6 OSS with 4 OSTs per OSS, formatted with ldiskfs and configured pairwise in an active-active HA configuration
      • DNE is enabled

      Sequence of events

      • 2016-09-06 02:51:28,201:fsmgmt.fsmgmt:INFO triggering fault mds_failover ( lola-8 (mdt-0) --> lola-9)
      • 2016-09-06 03:41:33,479:fsmgmt.fsmgmt:INFO mds_failover just completed (mdt-0 failed back to lola-8)
      • 2016-09-06 03:41:17 MDS lola-11 crashed with error message:
        <0>LustreError: 6666:0:(lu_object.h:716:lu_object_get()) ASSERTION( atomic_read(&o->lo_header->loh_ref) > 0 ) failed: 
        <0>LustreError: 6666:0:(lu_object.h:716:lu_object_get()) LBUG
        <4>Pid: 6666, comm: mdt03_002
        <4>
        <4>Call Trace:
        <4> [<ffffffffa07f0875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
        <4> [<ffffffffa07f0e77>] lbug_with_loc+0x47/0xb0 [libcfs]
        <4> [<ffffffffa1203071>] mdt_remote_object_lock+0x491/0x4a0 [mdt]
        <4> [<ffffffffa12298a0>] mdt_reint_open+0x2b90/0x3180 [mdt]
        <4> [<ffffffffa1211ead>] mdt_reint_rec+0x5d/0x200 [mdt]
        <4> [<ffffffffa11fd5db>] mdt_reint_internal+0x62b/0xa50 [mdt]
        <4> [<ffffffffa11fdbf6>] mdt_intent_reint+0x1f6/0x440 [mdt]
        <4> [<ffffffffa11fb8be>] mdt_intent_policy+0x4be/0xc70 [mdt]
        <4> [<ffffffffa0ab77c7>] ldlm_lock_enqueue+0x127/0x990 [ptlrpc]
        <4> [<ffffffffa0ae2c27>] ldlm_handle_enqueue0+0x807/0x14d0 [ptlrpc]
        <4> [<ffffffffa0b68b21>] tgt_enqueue+0x61/0x230 [ptlrpc]
        <4> [<ffffffffa0b69ccc>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
        <4> [<ffffffffa0b16501>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
        <4> [<ffffffffa0b157d0>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
        <4> [<ffffffff810a138e>] kthread+0x9e/0xc0
        <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
        <4> [<ffffffff810a12f0>] ? kthread+0x0/0xc0
        <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
        <4>
        <0>Kernel panic - not syncing: LBUG
        <4>Pid: 6666, comm: mdt03_002 Tainted: P           -- ------------    2.6.32-573.26.1.el6_lustre.x86_64 #1
        <4>Call Trace:
        <4> [<ffffffff81539407>] ? panic+0xa7/0x16f
        <4> [<ffffffffa07f0ecb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
        <4> [<ffffffffa1203071>] ? mdt_remote_object_lock+0x491/0x4a0 [mdt]
        <4> [<ffffffffa12298a0>] ? mdt_reint_open+0x2b90/0x3180 [mdt]
        <4> [<ffffffffa1211ead>] ? mdt_reint_rec+0x5d/0x200 [mdt]
        <4> [<ffffffffa11fd5db>] ? mdt_reint_internal+0x62b/0xa50 [mdt]
        <4> [<ffffffffa11fdbf6>] ? mdt_intent_reint+0x1f6/0x440 [mdt]
        <4> [<ffffffffa11fb8be>] ? mdt_intent_policy+0x4be/0xc70 [mdt]
        <4> [<ffffffffa0ab77c7>] ? ldlm_lock_enqueue+0x127/0x990 [ptlrpc]
        <4> [<ffffffffa0ae2c27>] ? ldlm_handle_enqueue0+0x807/0x14d0 [ptlrpc]
        <4> [<ffffffffa0b68b21>] ? tgt_enqueue+0x61/0x230 [ptlrpc]
        <4> [<ffffffffa0b69ccc>] ? tgt_request_handle+0x8ec/0x1440 [ptlrpc]
        <4> [<ffffffffa0b16501>] ? ptlrpc_main+0xd31/0x1800 [ptlrpc]
        <4> [<ffffffffa0b157d0>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
        <4> [<ffffffff810a138e>] ? kthread+0x9e/0xc0
        <4> [<ffffffff8100c28a>] ? child_rip+0xa/0x20
        <4> [<ffffffff810a12f0>] ? kthread+0x0/0xc0
        <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
        

        Attached files:
        Console and message logs of all MDS nodes; vmcore-dmesg.txt of lola-11.
        A crash dump is available.
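
The assertion text in the LBUG message above checks that an object's reference count is already positive when another reference is taken. The following is a minimal user-space sketch of that invariant, not the Lustre code: the names `obj_header`, `obj_get`, `obj_put` and `demo_stale_get` are hypothetical, and returning `false` stands in for where the kernel code would hit the assertion. It only illustrates the kind of sequence (a get after the final put) that violates such a check; it makes no claim about the root cause of this crash.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object header that carries a refcount. */
struct obj_header {
    atomic_int ref;
};

/* Take an extra reference. The caller is expected to already hold one,
 * so the count must be positive on entry. Returning false marks the
 * point where an assertion of the form "ref > 0" would fail. */
static bool obj_get(struct obj_header *h)
{
    if (atomic_load(&h->ref) <= 0)
        return false;
    atomic_fetch_add(&h->ref, 1);
    return true;
}

/* Drop a reference. Returns true when the last reference was dropped. */
static bool obj_put(struct obj_header *h)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(&h->ref, 1) == 1;
}

/* Returns 1 if a get on a live object succeeds and a get after the
 * final put is rejected, 0 otherwise. */
static int demo_stale_get(void)
{
    struct obj_header h;

    atomic_init(&h.ref, 1);          /* creator holds one reference */
    bool live_get = obj_get(&h);     /* count goes 1 -> 2           */
    obj_put(&h);                     /* count goes 2 -> 1           */
    bool last_put = obj_put(&h);     /* count goes 1 -> 0           */
    bool stale_get = obj_get(&h);    /* count is 0: invariant broken */

    return (live_get && last_put && !stale_get) ? 1 : 0;
}
```

Calling `demo_stale_get()` from a small `main` exercises both the valid path and the stale-reference path.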

      Attachments

        1. console-lola-10.log.bz2 (54 kB)
        2. console-lola-11.log.bz2 (54 kB)
        3. console-lola-8.log.bz2 (94 kB)
        4. console-lola-9.log.bz2 (63 kB)
        5. lola-11-vmcore-dmesg.txt.bz2 (29 kB)
        6. message-lola-10.log.bz2 (188 kB)
        7. message-lola-11.log.bz2 (183 kB)
        8. message-lola-8.log.bz2 (398 kB)
        9. message-lola-9.log.bz2 (405 kB)

          People

            yong.fan nasf (Inactive)
            heckes Frank Heckes (Inactive)
            Votes: 0
            Watchers: 8