  1. Lustre
  2. LU-3430

SWL failure: (hash.c:546:cfs_hash_bd_del_locked()) ASSERTION( bd->bd_bucket->hsb_count > 0 ) failed:

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.5.0, Lustre 2.4.2
    • Affects Version/s: Lustre 2.4.0
    • Environment: LLNL/Hyperion
    • Severity: 3
    • Rank (Obsolete): 8505

    Description

      While running the SWL test with the NRS policy 'orr', the OSS hit an LBUG after 25 hours. Multiple assertion messages appeared during the initial stack dump:

      2013-05-30 13:12:46 LustreError: 5770:0:(hash.c:546:cfs_hash_bd_del_locked()) ASSERTION( bd->bd_bucket->hsb_count > 0 ) failed:
      2013-05-30 13:12:46 LustreError: 5770:0:(hash.c:546:cfs_hash_bd_del_locked()) LBUG
      2013-05-30 13:12:46 Pid: 5770, comm: ll_ost_io00_077
      2013-05-30 13:12:46
      2013-05-30 13:12:46 Call Trace:
      2013-05-30 13:12:46  [<ffffffffa04d1895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2013-05-30 13:12:46 May 30 13:12:46  [<ffffffffa04d1e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      2013-05-30 13:12:46 hyperion-dit33 k [<ffffffffa04e785a>] cfs_hash_bd_del_locked+0xda/0x140 [libcfs]
      2013-05-30 13:12:46 ernel: LustreErr [<ffffffffa0a467e8>] nrs_orr_hop_put_free+0x218/0x290 [ptlrpc]
      2013-05-30 13:12:46 or: 5770:0:(hash [<ffffffffa0a456d8>] nrs_orr_res_put+0x28/0x60 [ptlrpc]
      2013-05-30 13:12:46 .c:546:cfs_hash_ [<ffffffffa0a3eb80>] nrs_resource_put_safe+0x60/0xf0 [ptlrpc]
      2013-05-30 13:12:46 bd_del_locked()) [<ffffffffa0a3ec30>] ptlrpc_nrs_req_finalize+0x20/0x30 [ptlrpc]
      2013-05-30 13:12:46  ASSERTION( bd->bd_bucket->hsb_c [<ffffffffa0a05a32>] ptlrpc_server_finish_active_request+0x62/0x150 [ptlrpc]
      2013-05-30 13:12:46 ount > 0 ) faile [<ffffffffa0a0c1a2>] ptlrpc_server_handle_request+0x1b2/0xc60 [ptlrpc]
      2013-05-30 13:12:46 d: 
      2013-05-30 13:12:46 May 30 13:12 [<ffffffffa04d25de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      2013-05-30 13:12:46 :46 hyperion-dit [<ffffffffa04e3d8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      2013-05-30 13:12:46 33 kernel: Lustr [<ffffffffa0a036e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2013-05-30 13:12:46 eError: 5770:0:( [<ffffffffa0a0d71e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
      2013-05-30 13:12:46 hash.c:546:cfs_h [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:46 ash_bd_del_locke [<ffffffff8100c0ca>] child_rip+0xa/0x20
      2013-05-30 13:12:46 d()) LBUG
      2013-05-30 13:12:46  [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      2013-05-30 13:12:46
      2013-05-30 13:12:46 Kernel panic - not syncing: LBUG
      2013-05-30 13:12:46 Pid: 5770, comm: ll_ost_io00_077 Tainted: P           ---------------    2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
      2013-05-30 13:12:46 Call Trace:
      2013-05-30 13:12:46  [<ffffffff8150d878>] ? panic+0xa7/0x16f
      2013-05-30 13:12:46 May 30 13:12:46  [<ffffffffa04d1eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      2013-05-30 13:12:46 hyperion-dit33 k [<ffffffffa04e785a>] ? cfs_hash_bd_del_locked+0xda/0x140 [libcfs]
      2013-05-30 13:12:46 ernel: Kernel pa [<ffffffffa0a467e8>] ? nrs_orr_hop_put_free+0x218/0x290 [ptlrpc]
      2013-05-30 13:12:46 nic - not syncin [<ffffffffa0a456d8>] ? nrs_orr_res_put+0x28/0x60 [ptlrpc]
      2013-05-30 13:12:46 g: LBUG
      2013-05-30 13:12:46  [<ffffffffa0a3eb80>] ? nrs_resource_put_safe+0x60/0xf0 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa0a3ec30>] ? ptlrpc_nrs_req_finalize+0x20/0x30 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa0a05a32>] ? ptlrpc_server_finish_active_request+0x62/0x150 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa0a0c1a2>] ? ptlrpc_server_handle_request+0x1b2/0xc60 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa04d25de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      2013-05-30 13:12:46  [<ffffffffa04e3d8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      2013-05-30 13:12:46  [<ffffffffa0a036e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2013-05-30 13:12:46  [<ffffffffa0a0d71e>] ? ptlrpc_main+0xace/0x1700 [ptlrpc]
      2013-05-30 13:12:47  [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:47  [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      2013-05-30 13:12:47  [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:47  [<ffffffffa0a0cc50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2013-05-30 13:12:47  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      2013-05-30 13:12:47 Initializing cgroup subsys cpuset
      

          People

            Assignee: Emoly Liu (emoly.liu)
            Reporter: Cliff White (cliffw) (Inactive)