LU-3474: MDS LBUG on unlink?


    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.4.1, Lustre 2.5.0
    • Affects Version/s: Lustre 2.4.0, Lustre 2.5.0
    • 3
    • 8703

      Hi,

      We have been testing v2.4 and have hit this LBUG, which we never experienced in v1.8.x under similar workloads. It appears to be triggered by an rm/unlink on certain files. I had to abort recovery and stop the ongoing file deletion to keep the MDS from repeatedly crashing with the same LBUG. We can supply more debug info if you need it.

      Cheers,

      Daire

      <0>LustreError: 6274:0:(linkea.c:169:linkea_links_find()) ASSERTION( ldata->ld_leh != ((void *)0) ) failed:
      <0>LustreError: 6274:0:(linkea.c:169:linkea_links_find()) LBUG
      <4>Pid: 6274, comm: mdt01_004
      <4>
      <4>Call Trace:
      <4> [<ffffffffa044b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa044be97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa05b47d6>] linkea_links_find+0x186/0x190 [obdclass]
      <4> [<ffffffffa0b87206>] ? mdo_xattr_get+0x26/0x30 [mdd]
      <4> [<ffffffffa0b8a645>] mdd_linkea_prepare+0x95/0x430 [mdd]
      <4> [<ffffffffa0b8ab01>] mdd_links_rename+0x121/0x540 [mdd]
      <4> [<ffffffffa0b8eae6>] mdd_unlink+0xb86/0xe30 [mdd]
      <4> [<ffffffffa0e0db98>] mdo_unlink+0x18/0x50 [mdt]
      <4> [<ffffffffa0e10f40>] mdt_reint_unlink+0x820/0x1010 [mdt]
      <4> [<ffffffffa0e0d891>] mdt_reint_rec+0x41/0xe0 [mdt]
      <4> [<ffffffffa0df2b03>] mdt_reint_internal+0x4c3/0x780 [mdt]
      <4> [<ffffffffa0df2e04>] mdt_reint+0x44/0xe0 [mdt]
      <4> [<ffffffffa0df7ab8>] mdt_handle_common+0x648/0x1660 [mdt]
      <4> [<ffffffffa0e31165>] mds_regular_handle+0x15/0x20 [mdt]
      <4> [<ffffffffa0730388>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
      <4> [<ffffffffa044c5de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      <4> [<ffffffffa045dd8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      <4> [<ffffffffa07276e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      <4> [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
      <4> [<ffffffffa073171e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      <4>Pid: 6274, comm: mdt01_004 Tainted: G --------------- T 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
      <4>Call Trace:
      <4> [<ffffffff8150d878>] ? panic+0xa7/0x16f
      <4> [<ffffffffa044beeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      <4> [<ffffffffa05b47d6>] ? linkea_links_find+0x186/0x190 [obdclass]
      <4> [<ffffffffa0b87206>] ? mdo_xattr_get+0x26/0x30 [mdd]
      <4> [<ffffffffa0b8a645>] ? mdd_linkea_prepare+0x95/0x430 [mdd]
      <4> [<ffffffffa0b8ab01>] ? mdd_links_rename+0x121/0x540 [mdd]
      <4> [<ffffffffa0b8eae6>] ? mdd_unlink+0xb86/0xe30 [mdd]
      <4> [<ffffffffa0e0db98>] ? mdo_unlink+0x18/0x50 [mdt]
      <4> [<ffffffffa0e10f40>] ? mdt_reint_unlink+0x820/0x1010 [mdt]
      <4> [<ffffffffa0e0d891>] ? mdt_reint_rec+0x41/0xe0 [mdt]
      <4> [<ffffffffa0df2b03>] ? mdt_reint_internal+0x4c3/0x780 [mdt]
      <4> [<ffffffffa0df2e04>] ? mdt_reint+0x44/0xe0 [mdt]
      <4> [<ffffffffa0df7ab8>] ? mdt_handle_common+0x648/0x1660 [mdt]
      <4> [<ffffffffa0e31165>] ? mds_regular_handle+0x15/0x20 [mdt]
      <4> [<ffffffffa0730388>] ? ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
      <4> [<ffffffffa044c5de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      <4> [<ffffffffa045dd8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      <4> [<ffffffffa07276e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      <4> [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
      <4> [<ffffffffa073171e>] ? ptlrpc_main+0xace/0x1700 [ptlrpc]
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffffa0730c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      <4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
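
      For triage, the first call trace above can be reduced to its reliable frames. On x86 Linux kernels a leading "?" marks an address that was found on the stack but not confirmed by the unwinder, so dropping those frames leaves the likely call path. The following is only a minimal sketch: the names FRAME_RE, parse_frames and reliable_path are hypothetical helpers, not part of Lustre or any kernel tool, and the regular expression assumes the 2.6.32-style frame format shown in this log.

```python
import re

# Assumed frame format (taken from the log above), for example:
#   <4> [<ffffffffa05b47d6>] ? linkea_links_find+0x186/0x190 [obdclass]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack-frame line; non-frame lines are skipped."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        frames.append({
            "func": m.group("func"),
            # Frames with no [module] suffix belong to the core kernel.
            "module": m.group("module") or "kernel",
            # A "?" before the symbol means the unwinder could not confirm it.
            "reliable": m.group("unreliable") is None,
        })
    return frames


def reliable_path(frames):
    """Keep only confirmed frames, formatted as module:function."""
    return [f"{f['module']}:{f['func']}" for f in frames if f["reliable"]]


# A few lines copied from the first trace in this report.
SAMPLE = """\
<4> [<ffffffffa044be97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa05b47d6>] linkea_links_find+0x186/0x190 [obdclass]
<4> [<ffffffffa0b87206>] ? mdo_xattr_get+0x26/0x30 [mdd]
<4> [<ffffffffa0b8a645>] mdd_linkea_prepare+0x95/0x430 [mdd]
<4> [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
"""

if __name__ == "__main__":
    for entry in reliable_path(parse_frames(SAMPLE)):
        print(entry)
```

      Note that in the second (panic) trace every frame carries a "?", so this filter is only useful on the first trace.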

            Assignee: Bruno Faccini (Inactive)
            Reporter: Daire Byrne (Inactive)
            Votes: 0
            Watchers: 16
