  1. Lustre
  2. LU-11801

replay-vbr test 0b crashes with an LBUG/ASSERTION( ctxt )


Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.12.0
    • Ubuntu 18.04
    • 3

    Description

      replay-vbr test_0b crashes for Ubuntu 18.04 clients with RHEL 7.6 servers. This test started crashing on 27 November 2018.

      Looking at the kernel crash from https://testing.whamcloud.com/test_sets/9f692e08-fdc9-11e8-93ea-52540065bddc we see:

      [ 5308.450564] Lustre: DEBUG MARKER: == replay-vbr test 0b: getversion for non existent fid shouldn't cause kernel panic ================== 21:08:17 (1544562497)
      [ 5308.527820] LustreError: 12286:0:(osp_sync.c:346:osp_sync_declare_add()) ASSERTION( ctxt ) failed: 
      [ 5308.528714] LustreError: 12286:0:(osp_sync.c:346:osp_sync_declare_add()) LBUG
      [ 5308.529382] Pid: 12286, comm: mdt00_000 3.10.0-957.el7_lustre.x86_64 #1 SMP Sat Dec 8 05:53:16 UTC 2018
      [ 5308.530265] Call Trace:
      [ 5308.530534]  [<ffffffffc079d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 5308.531258]  [<ffffffffc079d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 5308.531885]  [<ffffffffc11e0b89>] osp_sync_declare_add+0x3b9/0x3f0 [osp]
      [ 5308.532569]  [<ffffffffc11d0ce3>] osp_declare_destroy+0x1a3/0x1f0 [osp]
      [ 5308.533334]  [<ffffffffc111a85e>] lod_sub_declare_destroy+0xce/0x2d0 [lod]
      [ 5308.534219]  [<ffffffffc10f7a3d>] lod_obj_stripe_destroy_cb+0x8d/0xa0 [lod]
      [ 5308.534955]  [<ffffffffc110423e>] lod_obj_for_each_stripe+0x11e/0x2d0 [lod]
      [ 5308.535718]  [<ffffffffc110504f>] lod_declare_destroy+0x45f/0x5e0 [lod]
      [ 5308.536452]  [<ffffffffc116b081>] mdd_declare_finish_unlink+0x91/0x210 [mdd]
      [ 5308.537193]  [<ffffffffc117a9af>] mdd_unlink+0x4bf/0xad0 [mdd]
      [ 5308.537829]  [<ffffffffc1043089>] mdo_unlink+0x46/0x48 [mdt]
      [ 5308.538539]  [<ffffffffc1005e69>] mdt_reint_unlink+0xb49/0x14a0 [mdt]
      [ 5308.539308]  [<ffffffffc100c5e3>] mdt_reint_rec+0x83/0x210 [mdt]
      [ 5308.539937]  [<ffffffffc0fe9133>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
      [ 5308.540621]  [<ffffffffc0ff4497>] mdt_reint+0x67/0x140 [mdt]
      [ 5308.541262]  [<ffffffffc0c8535a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [ 5308.542296]  [<ffffffffc0c2992b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [ 5308.543087]  [<ffffffffc0c2d25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [ 5308.543835]  [<ffffffff9bcc1c31>] kthread+0xd1/0xe0
      [ 5308.544389]  [<ffffffff9c374c37>] ret_from_fork_nospec_end+0x0/0x39
      [ 5308.545029]  [<ffffffffffffffff>] 0xffffffffffffffff
      [ 5308.545592] Kernel panic - not syncing: LBUG
      [ 5308.546008] CPU: 0 PID: 12286 Comm: mdt00_000 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.el7_lustre.x86_64 #1
      [ 5308.547086] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 5308.547649] Call Trace:
      [ 5308.547916]  [<ffffffff9c361dc1>] dump_stack+0x19/0x1b
      [ 5308.548409]  [<ffffffff9c35b4d0>] panic+0xe8/0x21f
      [ 5308.548865]  [<ffffffffc079d8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [ 5308.549448]  [<ffffffffc11e0b89>] osp_sync_declare_add+0x3b9/0x3f0 [osp]
      [ 5308.550080]  [<ffffffffc11d0ce3>] osp_declare_destroy+0x1a3/0x1f0 [osp]
      [ 5308.550705]  [<ffffffffc111a85e>] lod_sub_declare_destroy+0xce/0x2d0 [lod]
      [ 5308.551377]  [<ffffffffc10f7a3d>] lod_obj_stripe_destroy_cb+0x8d/0xa0 [lod]
      [ 5308.552040]  [<ffffffffc110423e>] lod_obj_for_each_stripe+0x11e/0x2d0 [lod]
      [ 5308.552697]  [<ffffffffc110504f>] lod_declare_destroy+0x45f/0x5e0 [lod]
      [ 5308.553459]  [<ffffffffc09e4ca4>] ? lu_env_refill+0x24/0x30 [obdclass]
      [ 5308.554081]  [<ffffffffc10f79b0>] ? lod_xattr_list+0x150/0x150 [lod]
      [ 5308.554674]  [<ffffffffc116b081>] mdd_declare_finish_unlink+0x91/0x210 [mdd]
      [ 5308.555363]  [<ffffffffc117a9af>] mdd_unlink+0x4bf/0xad0 [mdd]
      [ 5308.555929]  [<ffffffffc1043089>] mdo_unlink+0x46/0x48 [mdt]
      [ 5308.556469]  [<ffffffffc1005e69>] mdt_reint_unlink+0xb49/0x14a0 [mdt]
      [ 5308.557088]  [<ffffffffc100c5e3>] mdt_reint_rec+0x83/0x210 [mdt]
      [ 5308.557663]  [<ffffffffc0fe9133>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
      [ 5308.558293]  [<ffffffffc0ff13f4>] ? mdt_thread_info_init+0xa4/0x1e0 [mdt]
      [ 5308.558933]  [<ffffffffc0ff4497>] mdt_reint+0x67/0x140 [mdt]
      [ 5308.559512]  [<ffffffffc0c8535a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
      [ 5308.560184]  [<ffffffffc07a3f07>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [ 5308.560834]  [<ffffffffc0c2992b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [ 5308.561558]  [<ffffffff9bccba9b>] ? __wake_up_common+0x5b/0x90
      [ 5308.562151]  [<ffffffffc0c2d25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
      [ 5308.562750]  [<ffffffff9bcd0880>] ? finish_task_switch+0x50/0x1c0
      [ 5308.563382]  [<ffffffffc0c2c760>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
      [ 5308.564084]  [<ffffffff9bcc1c31>] kthread+0xd1/0xe0
      [ 5308.564541]  [<ffffffff9bcc1b60>] ? insert_kthread_work+0x40/0x40
      [ 5308.565108]  [<ffffffff9c374c37>] ret_from_fork_nospec_begin+0x21/0x21
      [ 5308.565710]  [<ffffffff9bcc1b60>] ? insert_kthread_work+0x40/0x40
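
The call path in the traces above can be summarized per kernel module. What follows is only a minimal sketch for reading such a trace, not part of the Lustre test suite: the names `TRACE`, `FRAME_RE`, `parse_frames` and `module_chain` are hypothetical, and `TRACE` holds a subset of frames copied from the two traces in this report. Frames marked with `?` are skipped because the kernel flags them as unreliable.

```python
import re

# Subset of frames copied from the traces in this report (innermost frame first).
TRACE = """\
[ 5308.531258]  [<ffffffffc079d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5308.531885]  [<ffffffffc11e0b89>] osp_sync_declare_add+0x3b9/0x3f0 [osp]
[ 5308.532569]  [<ffffffffc11d0ce3>] osp_declare_destroy+0x1a3/0x1f0 [osp]
[ 5308.533334]  [<ffffffffc111a85e>] lod_sub_declare_destroy+0xce/0x2d0 [lod]
[ 5308.535718]  [<ffffffffc110504f>] lod_declare_destroy+0x45f/0x5e0 [lod]
[ 5308.553459]  [<ffffffffc09e4ca4>] ? lu_env_refill+0x24/0x30 [obdclass]
[ 5308.537193]  [<ffffffffc117a9af>] mdd_unlink+0x4bf/0xad0 [mdd]
[ 5308.538539]  [<ffffffffc1005e69>] mdt_reint_unlink+0xb49/0x14a0 [mdt]
[ 5308.541262]  [<ffffffffc0c8535a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[ 5308.543835]  [<ffffffff9bcc1c31>] kthread+0xd1/0xe0
"""

# [<address>] [? ]symbol+0xoffset/0xsize [module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(text):
    """Return (symbol, module) pairs for the reliable frames in a trace."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        unreliable, symbol, module = match.groups()
        if unreliable:
            # "?" frames are stack leftovers, not part of the real call chain.
            continue
        # Frames without a [module] suffix belong to the core kernel.
        frames.append((symbol, module or "kernel"))
    return frames


def module_chain(frames):
    """Collapse frames into the outermost-to-innermost sequence of modules."""
    chain = []
    for _symbol, module in reversed(frames):
        if not chain or chain[-1] != module:
            chain.append(module)
    return chain


if __name__ == "__main__":
    print(" -> ".join(module_chain(parse_frames(TRACE))))
    # prints: kernel -> ptlrpc -> mdt -> mdd -> lod -> osp -> libcfs
```

Read this way, the unlink request enters through ptlrpc, passes down the MDT, MDD and LOD layers, and hits the assertion in osp while declaring the destroy of a stripe object.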
      

      There are several examples of this crash:
      https://testing.whamcloud.com/test_sets/375d7040-fdc8-11e8-b837-52540065bddc
      https://testing.whamcloud.com/test_sets/54b126fe-f955-11e8-b67f-52540065bddc
      https://testing.whamcloud.com/test_sets/cec17f86-f6e7-11e8-815b-52540065bddc

            People

              bzzz Alex Zhuravlev
              jamesanunez James Nunez (Inactive)
