Details
Bug
Resolution: Duplicate
Minor
None
Lustre 2.12.0
Ubuntu 18.04
3
Description
replay-vbr test_0b crashes for Ubuntu 18.04 clients with RHEL 7.6 servers. This test started crashing on 27 November 2018.
Looking at the kernel crash from https://testing.whamcloud.com/test_sets/9f692e08-fdc9-11e8-93ea-52540065bddc, we see:
[ 5308.450564] Lustre: DEBUG MARKER: == replay-vbr test 0b: getversion for non existent fid shouldn't cause kernel panic ================== 21:08:17 (1544562497)
[ 5308.527820] LustreError: 12286:0:(osp_sync.c:346:osp_sync_declare_add()) ASSERTION( ctxt ) failed:
[ 5308.528714] LustreError: 12286:0:(osp_sync.c:346:osp_sync_declare_add()) LBUG
[ 5308.529382] Pid: 12286, comm: mdt00_000 3.10.0-957.el7_lustre.x86_64 #1 SMP Sat Dec 8 05:53:16 UTC 2018
[ 5308.530265] Call Trace:
[ 5308.530534] [<ffffffffc079d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 5308.531258] [<ffffffffc079d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5308.531885] [<ffffffffc11e0b89>] osp_sync_declare_add+0x3b9/0x3f0 [osp]
[ 5308.532569] [<ffffffffc11d0ce3>] osp_declare_destroy+0x1a3/0x1f0 [osp]
[ 5308.533334] [<ffffffffc111a85e>] lod_sub_declare_destroy+0xce/0x2d0 [lod]
[ 5308.534219] [<ffffffffc10f7a3d>] lod_obj_stripe_destroy_cb+0x8d/0xa0 [lod]
[ 5308.534955] [<ffffffffc110423e>] lod_obj_for_each_stripe+0x11e/0x2d0 [lod]
[ 5308.535718] [<ffffffffc110504f>] lod_declare_destroy+0x45f/0x5e0 [lod]
[ 5308.536452] [<ffffffffc116b081>] mdd_declare_finish_unlink+0x91/0x210 [mdd]
[ 5308.537193] [<ffffffffc117a9af>] mdd_unlink+0x4bf/0xad0 [mdd]
[ 5308.537829] [<ffffffffc1043089>] mdo_unlink+0x46/0x48 [mdt]
[ 5308.538539] [<ffffffffc1005e69>] mdt_reint_unlink+0xb49/0x14a0 [mdt]
[ 5308.539308] [<ffffffffc100c5e3>] mdt_reint_rec+0x83/0x210 [mdt]
[ 5308.539937] [<ffffffffc0fe9133>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[ 5308.540621] [<ffffffffc0ff4497>] mdt_reint+0x67/0x140 [mdt]
[ 5308.541262] [<ffffffffc0c8535a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[ 5308.542296] [<ffffffffc0c2992b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[ 5308.543087] [<ffffffffc0c2d25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
[ 5308.543835] [<ffffffff9bcc1c31>] kthread+0xd1/0xe0
[ 5308.544389] [<ffffffff9c374c37>] ret_from_fork_nospec_end+0x0/0x39
[ 5308.545029] [<ffffffffffffffff>] 0xffffffffffffffff
[ 5308.545592] Kernel panic - not syncing: LBUG
[ 5308.546008] CPU: 0 PID: 12286 Comm: mdt00_000 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7_lustre.x86_64 #1
[ 5308.547086] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 5308.547649] Call Trace:
[ 5308.547916] [<ffffffff9c361dc1>] dump_stack+0x19/0x1b
[ 5308.548409] [<ffffffff9c35b4d0>] panic+0xe8/0x21f
[ 5308.548865] [<ffffffffc079d8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[ 5308.549448] [<ffffffffc11e0b89>] osp_sync_declare_add+0x3b9/0x3f0 [osp]
[ 5308.550080] [<ffffffffc11d0ce3>] osp_declare_destroy+0x1a3/0x1f0 [osp]
[ 5308.550705] [<ffffffffc111a85e>] lod_sub_declare_destroy+0xce/0x2d0 [lod]
[ 5308.551377] [<ffffffffc10f7a3d>] lod_obj_stripe_destroy_cb+0x8d/0xa0 [lod]
[ 5308.552040] [<ffffffffc110423e>] lod_obj_for_each_stripe+0x11e/0x2d0 [lod]
[ 5308.552697] [<ffffffffc110504f>] lod_declare_destroy+0x45f/0x5e0 [lod]
[ 5308.553459] [<ffffffffc09e4ca4>] ? lu_env_refill+0x24/0x30 [obdclass]
[ 5308.554081] [<ffffffffc10f79b0>] ? lod_xattr_list+0x150/0x150 [lod]
[ 5308.554674] [<ffffffffc116b081>] mdd_declare_finish_unlink+0x91/0x210 [mdd]
[ 5308.555363] [<ffffffffc117a9af>] mdd_unlink+0x4bf/0xad0 [mdd]
[ 5308.555929] [<ffffffffc1043089>] mdo_unlink+0x46/0x48 [mdt]
[ 5308.556469] [<ffffffffc1005e69>] mdt_reint_unlink+0xb49/0x14a0 [mdt]
[ 5308.557088] [<ffffffffc100c5e3>] mdt_reint_rec+0x83/0x210 [mdt]
[ 5308.557663] [<ffffffffc0fe9133>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[ 5308.558293] [<ffffffffc0ff13f4>] ? mdt_thread_info_init+0xa4/0x1e0 [mdt]
[ 5308.558933] [<ffffffffc0ff4497>] mdt_reint+0x67/0x140 [mdt]
[ 5308.559512] [<ffffffffc0c8535a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[ 5308.560184] [<ffffffffc07a3f07>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 5308.560834] [<ffffffffc0c2992b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[ 5308.561558] [<ffffffff9bccba9b>] ? __wake_up_common+0x5b/0x90
[ 5308.562151] [<ffffffffc0c2d25c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
[ 5308.562750] [<ffffffff9bcd0880>] ? finish_task_switch+0x50/0x1c0
[ 5308.563382] [<ffffffffc0c2c760>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[ 5308.564084] [<ffffffff9bcc1c31>] kthread+0xd1/0xe0
[ 5308.564541] [<ffffffff9bcc1b60>] ? insert_kthread_work+0x40/0x40
[ 5308.565108] [<ffffffff9c374c37>] ret_from_fork_nospec_begin+0x21/0x21
[ 5308.565710] [<ffffffff9bcc1b60>] ? insert_kthread_work+0x40/0x40
There are several examples of this crash:
https://testing.whamcloud.com/test_sets/375d7040-fdc8-11e8-b837-52540065bddc
https://testing.whamcloud.com/test_sets/54b126fe-f955-11e8-b67f-52540065bddc
https://testing.whamcloud.com/test_sets/cec17f86-f6e7-11e8-815b-52540065bddc