Lustre / LU-6292

replay-single test_101: osd_trans_exec_op()) ASSERTION( oh->ot_handle != ((void *)0) ) failed:


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: Lustre 2.8.0
    • 3
    • 17625

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/b55511da-bd38-11e4-8d85-5254006e85c2.

      The sub-test test_101 failed with the following error:

      test failed to respond and timed out
      11:19:32:Lustre: DEBUG MARKER: == replay-single test 101: Shouldn't reassign precreated objs to other files after recovery == 11:19:09 (1424863149)
      11:19:32:Lustre: DEBUG MARKER: sync; sync; sync
      11:19:32:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
      11:19:32:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
      11:19:32:LustreError: 24268:0:(osd_handler.c:1462:osd_ro()) *** setting lustre-MDT0000 read-only ***
      11:19:32:Turning device dm-0 (0xfd00000) read-only
      11:19:32:Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
      11:19:32:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
      11:19:32:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
      11:19:32:Lustre: DEBUG MARKER: umount -d /mnt/mds1
      11:19:32:Removing read-only on unknown block (0xfd00000)
      11:19:32:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      11:19:32:Lustre: DEBUG MARKER: hostname
      11:19:32:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
      11:19:32:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre  -o abort_recovery 		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
      11:19:32:LDISKFS-fs (dm-0): recovery complete
      11:19:32:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
      11:19:32:LustreError: 24605:0:(mdt_handler.c:5797:mdt_iocontrol()) lustre-MDT0000: Aborting recovery for device
      11:19:32:LustreError: 24605:0:(ldlm_lib.c:2261:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery
      11:19:32:Lustre: 24684:0:(ldlm_lib.c:1822:target_recovery_overseer()) recovery is aborted, evict exports in recovery
      11:19:32:Lustre: 24684:0:(ldlm_lib.c:1822:target_recovery_overseer()) Skipped 2 previous similar messages
      11:19:32:Lustre: lustre-MDT0000: disconnecting 5 stale clients
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) header@ffff88006060fd08[0x0, 4, [0x200028c71:0x19:0x0] hash]{
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) ....mdt@ffff88006060fd58mdt-object@ffff88006060fd08(ioepoch=0 flags=0x0, epochcount=0, writecount=0)
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) ....mdd@ffff88006d9a9f00mdd-object@ffff88006d9a9f00(open_count=0, valid=0, cltime=0, flags=0)
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) ....lod@ffff88006060ee48lod-object@ffff88006060ee48
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) ....osd-ldiskfs@ffff880061e98740osd-ldiskfs-object@ffff880061e98740(i:(null):0/0)[plain]
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) } header@ffff88006060fd08
      11:19:32:
      11:19:32:LustreError: 24684:0:(osd_handler.c:4519:osd_object_find()) lu_object does not exists [0x200028c71:0x19:0x0]
      11:19:32:LustreError: 24684:0:(osd_handler.c:4672:osd_index_ea_insert()) lustre-MDT0000-osd: Can not find object [0x200028c71:0x19:0x0]32776:3382146134: rc = -2
      11:19:32:LustreError: 24684:0:(osd_internal.h:979:osd_trans_exec_op()) ASSERTION( oh->ot_handle != ((void *)0) ) failed: 
      11:19:32:LustreError: 24684:0:(osd_internal.h:979:osd_trans_exec_op()) LBUG
      11:19:32:Pid: 24684, comm: tgt_recov
      11:19:32:
      11:19:32:Call Trace:
      11:19:32: [<ffffffffa0491895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      11:19:32: [<ffffffffa0491e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      11:19:32: [<ffffffffa0d0df7f>] osd_trans_exec_op+0x14f/0x2e0 [osd_ldiskfs]
      11:19:32: [<ffffffffa0d18c45>] osd_index_ea_delete+0x1d5/0xd00 [osd_ldiskfs]
      11:19:32: [<ffffffff8106c85a>] ? __cond_resched+0x2a/0x40
      11:19:32: [<ffffffffa087fd33>] out_obj_index_delete+0x153/0x370 [ptlrpc]
      11:19:32: [<ffffffffa08800fc>] out_tx_index_insert_undo+0x1c/0x20 [ptlrpc]
      11:19:32: [<ffffffffa088c83c>] distribute_txn_replay_handle+0x7ec/0x940 [ptlrpc]
      11:19:32: [<ffffffffa07d86a1>] target_recovery_thread+0x9e1/0x1ad0 [ptlrpc]
      11:19:32: [<ffffffffa07d7cc0>] ? target_recovery_thread+0x0/0x1ad0 [ptlrpc]
      11:19:32: [<ffffffff8109e66e>] kthread+0x9e/0xc0
      11:19:32: [<ffffffff8100c20a>] child_rip+0xa/0x20
      11:19:32: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      11:19:32: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      11:19:32:
      11:19:32:Kernel panic - not syncing: LBUG
      11:19:32:Pid: 24684, comm: tgt_recov Not tainted 2.6.32-504.8.1.el6_lustre.g0ef66b1.x86_64 #1
      11:19:32:Call Trace:
      11:19:32: [<ffffffff81529b76>] ? panic+0xa7/0x16f
      11:19:32: [<ffffffffa0491eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      11:19:32: [<ffffffffa0d0df7f>] ? osd_trans_exec_op+0x14f/0x2e0 [osd_ldiskfs]
      11:19:32: [<ffffffffa0d18c45>] ? osd_index_ea_delete+0x1d5/0xd00 [osd_ldiskfs]
      11:19:32: [<ffffffff8106c85a>] ? __cond_resched+0x2a/0x40
      11:19:32: [<ffffffffa087fd33>] ? out_obj_index_delete+0x153/0x370 [ptlrpc]
      11:19:32: [<ffffffffa08800fc>] ? out_tx_index_insert_undo+0x1c/0x20 [ptlrpc]
      11:19:32: [<ffffffffa088c83c>] ? distribute_txn_replay_handle+0x7ec/0x940 [ptlrpc]
      11:19:32: [<ffffffffa07d86a1>] ? target_recovery_thread+0x9e1/0x1ad0 [ptlrpc]
      11:19:32: [<ffffffffa07d7cc0>] ? target_recovery_thread+0x0/0x1ad0 [ptlrpc]
      11:19:32: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
      11:19:32: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      11:19:32: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      11:19:32: [<ffffffff8100c200>] ? child_rip+0x0/0x20
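
      The two stack traces show the failure path: during the abort_recovery mount, the recovery thread (tgt_recov) runs distribute_txn_replay_handle(), whose undo callback out_tx_index_insert_undo() calls osd_index_ea_delete() after osd_object_find() has already failed with -2; osd_trans_exec_op() then trips its sanity check because oh->ot_handle appears to be NULL, i.e. no journal transaction was started for the undo operation. A minimal, self-contained sketch of that invariant is below; the structs and the helper function are simplified stand-ins for illustration, not the real Lustre definitions:

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdio.h>

      /* Simplified stand-in for a started jbd2 journal handle. */
      struct journal_handle { int started; };

      /* Simplified stand-in for struct osd_thandle: ot_handle stays
       * NULL until the transaction is actually started. */
      struct osd_thandle {
          struct journal_handle *ot_handle;
      };

      /* The invariant behind the ASSERTION in the log: every declared
       * operation must execute inside a started transaction, so
       * ot_handle must be non-NULL. Returns 1 if the invariant holds. */
      static int osd_trans_exec_op_ok(const struct osd_thandle *oh)
      {
          return oh->ot_handle != NULL;
      }

      int main(void)
      {
          struct journal_handle jh = { .started = 1 };
          struct osd_thandle started = { .ot_handle = &jh };
          struct osd_thandle not_started = { .ot_handle = NULL };

          /* Normal path: the transaction was started before the index op. */
          assert(osd_trans_exec_op_ok(&started));

          /* Replay-undo path from this ticket: the undo callback reached
           * the index delete with no started transaction, so the check
           * fails and the real code would LBUG (kernel panic). */
          assert(!osd_trans_exec_op_ok(&not_started));

          printf("invariant demo ok\n");
          return 0;
      }
      ```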
      

      Please provide additional information about the failure here.

      Info required for matching: replay-single 101


    People

      Assignee: di.wang Di Wang
      Reporter: maloo Maloo
