Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.13.0
- None
- 3
Description
racer test_1 crashes with an LBUG. Looking at the kernel crash log for https://testing.whamcloud.com/test_sets/0ea04a28-fd8c-11e9-8e77-52540065bddc, we see:
[65366.958340] LustreError: 11867:0:(ldlm_resource.c:1147:ldlm_resource_complain()) lustre-MDT0000-mdc-ffff8ced36962800: namespace resource [0x200000401:0x25e:0x0].0x0 (00000000a3ed343f) refcount nonzero (1) after lock cleanup; forcing cleanup.
[65366.964792] LustreError: 11343:0:(llite_lib.c:1677:ll_md_setattr()) md_setattr fails: rc = -108
[65366.967916] LustreError: 11895:0:(file.c:4634:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -108
[65366.991686] LustreError: 12895:0:(mdc_request.c:1466:mdc_read_page()) lustre-MDT0000-mdc-ffff8ced36962800: [0x200000401:0x1:0x0] lock enqueue fails: rc = -108
[65367.087953] LustreError: 3825:0:(vvp_io.c:1616:vvp_io_init()) lustre: refresh file layout [0x200000401:0x25e:0x0] error -108.
[65367.088311] LustreError: 11867:0:(osc_object.c:213:osc_object_ast_clear()) ASSERTION( lvb != ((void *)0) ) failed:
[65367.091759] LustreError: 11867:0:(osc_object.c:213:osc_object_ast_clear()) LBUG
[65367.093035] Pid: 11867, comm: ll_imp_inval 4.18.0-80.7.1.el8_0.x86_64 #1 SMP Sat Aug 3 15:14:00 UTC 2019
[65367.094664] Call Trace:
[65367.095404] libcfs_call_trace+0x86/0xc0 [libcfs]
[65367.096261] lbug_with_loc+0x43/0x80 [libcfs]
[65367.097133] osc_object_ast_clear+0x304/0x340 [osc]
[65367.098282] ldlm_resource_foreach+0xd4/0x250 [ptlrpc]
[65367.099228] ldlm_resource_iterate+0x120/0x170 [ptlrpc]
[65367.100171] osc_object_prune+0x5b/0x90 [osc]
[65367.100965] osc_object_invalidate+0x84/0x270 [osc]
[65367.101846] osc_ldlm_resource_invalidate+0xbd/0x160 [osc]
[65367.102835] cfs_hash_for_each_relax+0x253/0x450 [libcfs]
[65367.103811] cfs_hash_for_each_nolock+0x11d/0x1a0 [libcfs]
[65367.104845] mdc_import_event+0x2ad/0xa60 [mdc]
[65367.105692] ptlrpc_invalidate_import+0x447/0xa70 [ptlrpc]
[65367.106697] ptlrpc_invalidate_import_thread+0x3e/0x1c0 [ptlrpc]
[65367.107802] kthread+0x112/0x130
[65367.108436] ret_from_fork+0x35/0x40
[65367.109115] 0xffffffffffffffff
[65367.109716] Kernel panic - not syncing: LBUG
[65367.110493] CPU: 0 PID: 11867 Comm: ll_imp_inval Kdump: loaded Tainted: G OE --------- - - 4.18.0-80.7.1.el8_0.x86_64 #1
[65367.112516] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[65367.113519] Call Trace:
[65367.114025] dump_stack+0x5c/0x80
[65367.114660] panic+0xe7/0x247
[65367.115225] lbug_with_loc.cold.3+0x18/0x18 [libcfs]
[65367.116123] osc_object_ast_clear+0x304/0x340 [osc]
[65367.116999] ? osc_obj_build_res_name+0x100/0x100 [osc]
[65367.117953] ldlm_resource_foreach+0xd4/0x250 [ptlrpc]
[65367.118877] ? osc_obj_build_res_name+0x100/0x100 [osc]
[65367.119819] ldlm_resource_iterate+0x120/0x170 [ptlrpc]
[65367.120743] osc_object_prune+0x5b/0x90 [osc]
[65367.121528] osc_object_invalidate+0x84/0x270 [osc]
[65367.122398] ? wake_up_q+0x70/0x70
[65367.123033] osc_ldlm_resource_invalidate+0xbd/0x160 [osc]
[65367.124007] cfs_hash_for_each_relax+0x253/0x450 [libcfs]
[65367.124961] ? osc_resource_get_unused.constprop.44+0x310/0x310 [osc]
[65367.126080] ? osc_resource_get_unused.constprop.44+0x310/0x310 [osc]
[65367.127240] cfs_hash_for_each_nolock+0x11d/0x1a0 [libcfs]
[65367.128225] mdc_import_event+0x2ad/0xa60 [mdc]
[65367.129061] ptlrpc_invalidate_import+0x447/0xa70 [ptlrpc]
[65367.130020] ? wake_up_q+0x70/0x70
[65367.130671] ? ptlrpc_import_recovery_state_machine+0x900/0x900 [ptlrpc]
[65367.131855] ptlrpc_invalidate_import_thread+0x3e/0x1c0 [ptlrpc]
[65367.132935] kthread+0x112/0x130
[65367.133541] ? kthread_bind+0x30/0x30
[65367.134210] ret_from_fork+0x35/0x40
Looking at racer test_1 crashes over the past month (since 01 OCT 2019), we have seen this LBUG only twice:
https://testing.whamcloud.com/test_sets/7d1b7852-f273-11e9-a0ba-52540065bddc