[LU-12934] racer test 1 crashes in osc_object_ast_clear with LBUG: ASSERTION( lvb != ((void *)0) )
Created: 04/Nov/19
Updated: 10/Oct/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Nunez (Inactive)
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

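racer test_1 crashes with an LBUG in osc_object_ast_clear(). The kernel crash log for https://testing.whamcloud.com/test_sets/0ea04a28-fd8c-11e9-8e77-52540065bddc shows the ll_imp_inval thread hitting the assertion while the MDC import is being invalidated: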

[65366.958340] LustreError: 11867:0:(ldlm_resource.c:1147:ldlm_resource_complain()) lustre-MDT0000-mdc-ffff8ced36962800: namespace resource [0x200000401:0x25e:0x0].0x0 (00000000a3ed343f) refcount nonzero (1) after lock cleanup; forcing cleanup.
[65366.964792] LustreError: 11343:0:(llite_lib.c:1677:ll_md_setattr()) md_setattr fails: rc = -108
[65366.967916] LustreError: 11895:0:(file.c:4634:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -108
[65366.991686] LustreError: 12895:0:(mdc_request.c:1466:mdc_read_page()) lustre-MDT0000-mdc-ffff8ced36962800: [0x200000401:0x1:0x0] lock enqueue fails: rc = -108
[65367.087953] LustreError: 3825:0:(vvp_io.c:1616:vvp_io_init()) lustre: refresh file layout [0x200000401:0x25e:0x0] error -108.
[65367.088311] LustreError: 11867:0:(osc_object.c:213:osc_object_ast_clear()) ASSERTION( lvb != ((void *)0) ) failed: 
[65367.091759] LustreError: 11867:0:(osc_object.c:213:osc_object_ast_clear()) LBUG
[65367.093035] Pid: 11867, comm: ll_imp_inval 4.18.0-80.7.1.el8_0.x86_64 #1 SMP Sat Aug 3 15:14:00 UTC 2019
[65367.094664] Call Trace:
[65367.095404]  libcfs_call_trace+0x86/0xc0 [libcfs]
[65367.096261]  lbug_with_loc+0x43/0x80 [libcfs]
[65367.097133]  osc_object_ast_clear+0x304/0x340 [osc]
[65367.098282]  ldlm_resource_foreach+0xd4/0x250 [ptlrpc]
[65367.099228]  ldlm_resource_iterate+0x120/0x170 [ptlrpc]
[65367.100171]  osc_object_prune+0x5b/0x90 [osc]
[65367.100965]  osc_object_invalidate+0x84/0x270 [osc]
[65367.101846]  osc_ldlm_resource_invalidate+0xbd/0x160 [osc]
[65367.102835]  cfs_hash_for_each_relax+0x253/0x450 [libcfs]
[65367.103811]  cfs_hash_for_each_nolock+0x11d/0x1a0 [libcfs]
[65367.104845]  mdc_import_event+0x2ad/0xa60 [mdc]
[65367.105692]  ptlrpc_invalidate_import+0x447/0xa70 [ptlrpc]
[65367.106697]  ptlrpc_invalidate_import_thread+0x3e/0x1c0 [ptlrpc]
[65367.107802]  kthread+0x112/0x130
[65367.108436]  ret_from_fork+0x35/0x40
[65367.109115]  0xffffffffffffffff
[65367.109716] Kernel panic - not syncing: LBUG
[65367.110493] CPU: 0 PID: 11867 Comm: ll_imp_inval Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-80.7.1.el8_0.x86_64 #1
[65367.112516] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[65367.113519] Call Trace:
[65367.114025]  dump_stack+0x5c/0x80
[65367.114660]  panic+0xe7/0x247
[65367.115225]  lbug_with_loc.cold.3+0x18/0x18 [libcfs]
[65367.116123]  osc_object_ast_clear+0x304/0x340 [osc]
[65367.116999]  ? osc_obj_build_res_name+0x100/0x100 [osc]
[65367.117953]  ldlm_resource_foreach+0xd4/0x250 [ptlrpc]
[65367.118877]  ? osc_obj_build_res_name+0x100/0x100 [osc]
[65367.119819]  ldlm_resource_iterate+0x120/0x170 [ptlrpc]
[65367.120743]  osc_object_prune+0x5b/0x90 [osc]
[65367.121528]  osc_object_invalidate+0x84/0x270 [osc]
[65367.122398]  ? wake_up_q+0x70/0x70
[65367.123033]  osc_ldlm_resource_invalidate+0xbd/0x160 [osc]
[65367.124007]  cfs_hash_for_each_relax+0x253/0x450 [libcfs]
[65367.124961]  ? osc_resource_get_unused.constprop.44+0x310/0x310 [osc]
[65367.126080]  ? osc_resource_get_unused.constprop.44+0x310/0x310 [osc]
[65367.127240]  cfs_hash_for_each_nolock+0x11d/0x1a0 [libcfs]
[65367.128225]  mdc_import_event+0x2ad/0xa60 [mdc]
[65367.129061]  ptlrpc_invalidate_import+0x447/0xa70 [ptlrpc]
[65367.130020]  ? wake_up_q+0x70/0x70
[65367.130671]  ? ptlrpc_import_recovery_state_machine+0x900/0x900 [ptlrpc]
[65367.131855]  ptlrpc_invalidate_import_thread+0x3e/0x1c0 [ptlrpc]
[65367.132935]  kthread+0x112/0x130
[65367.133541]  ? kthread_bind+0x30/0x30
[65367.134210]  ret_from_fork+0x35/0x40
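
When comparing this crash against other racer failures, it can help to reduce a trace like the one above to its reliable frames. The following is a minimal sketch, assuming the console-log layout shown in this ticket; the function name `extract_frames` and the `SAMPLE` excerpt are illustrative and not part of any Lustre or test-framework tooling.

```python
import re

# Matches frames such as "[65367.097133]  osc_object_ast_clear+0x304/0x340 [osc]".
# Frames the kernel prefixes with "?" (unreliable) do not match and are skipped.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<sym>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def extract_frames(log_text):
    """Return (symbol, module) pairs for reliable frames, in log order."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            # Frames without a "[module]" suffix belong to the core kernel.
            frames.append((m.group("sym"), m.group("mod") or "kernel"))
    return frames


# A few lines copied from the panic trace in this ticket.
SAMPLE = """\
[65367.116123]  osc_object_ast_clear+0x304/0x340 [osc]
[65367.116999]  ? osc_obj_build_res_name+0x100/0x100 [osc]
[65367.117953]  ldlm_resource_foreach+0xd4/0x250 [ptlrpc]
[65367.132935]  kthread+0x112/0x130
"""

for sym, mod in extract_frames(SAMPLE):
    print(f"{mod}:{sym}")
```

Dropping the "?" frames leaves the call chain that actually led to the assertion, which makes it easier to match the signature across console logs from different test sessions.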

In racer test_1 crashes over the past month (since 01 Oct 2019), this LBUG has been seen only twice:
https://testing.whamcloud.com/test_sets/7d1b7852-f273-11e9-a0ba-52540065bddc

