[LU-12007] lu_site_purge_objects()) ASSERTION( atomic_read(&h->loh_ref) == 0 ) failed
Created: 25/Feb/19  Updated: 02/May/22  Resolved: 02/May/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Oleg Drokin
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

I am seeing this assertion in recovery-small test 111, though rarely: roughly once every two months or so.

[255143.062705] LustreError: 26477:0:(mdd_device.c:583:mdd_changelog_init()) lustre-MDD0000: changelog setup during init failed: rc = -5
[255143.065953] LustreError: 26477:0:(mdd_device.c:1249:mdd_prepare()) lustre-MDD0000: failed to initialize changelog: rc = -5
[255143.069446] LustreError: 26477:0:(obd_mount_server.c:1939:server_fill_super()) Unable to start targets: -5
[255143.079679] BUG: sleeping function called from invalid context at kernel/rwsem.c:51
[255143.079823] LustreError: 26744:0:(llog.c:694:llog_process_thread()) lustre-MDT0001-osp-MDT0000 retry remote llog process
[255143.079865] LustreError: 26744:0:(lod_dev.c:434:lod_sub_recovery_thread()) lustre-MDT0001-osp-MDT0000 get update log failed: rc = -11
[255143.099518] in_atomic(): 1, irqs_disabled(): 0, pid: 26477, name: mount.lustre
[255143.102104] CPU: 12 PID: 26477 Comm: mount.lustre Kdump: loaded Tainted: P           OE  ------------   3.10.0-7.6-debug #1
[255143.125076] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[255143.132190] Call Trace:
[255143.133464]  [<ffffffff817afbf2>] dump_stack+0x19/0x1b
[255143.134858]  [<ffffffff810c3bc9>] __might_sleep+0xd9/0x100
[255143.138244]  [<ffffffff817b6460>] down_write+0x20/0x50
[255143.139542]  [<ffffffffa11d5037>] osp_invalidate+0x177/0x210 [osp]
[255143.141445]  [<ffffffffa11ea8a3>] osp_trans_stop_cb+0x133/0x180 [osp]
[255143.143918]  [<ffffffffa11ed647>] osp_trans_callback+0xa7/0xc0 [osp]
[255143.146167]  [<ffffffffa11cb2c8>] osp_update_fini+0xc8/0x280 [osp]
[255143.150768]  [<ffffffff810b6050>] ? wake_up_atomic_t+0x30/0x30
[255143.153787]  [<ffffffffa11cb6e2>] osp_process_config+0x262/0x560 [osp]
[255143.156740]  [<ffffffffa112b248>] lod_sub_process_config+0xe8/0x1e0 [lod]
[255143.160117]  [<ffffffffa1132650>] lod_process_config+0x4c0/0x1420 [lod]
[255143.172658]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.175946]  [<ffffffffa0ff4f68>] mdd_process_config+0x88/0x5d0 [mdd]
[255143.179300]  [<ffffffffa105f29f>] mdt_device_fini+0x2df/0xfc0 [mdt]
[255143.180936]  [<ffffffffa030c93c>] class_cleanup+0x55c/0xbb0 [obdclass]
[255143.182587]  [<ffffffffa030dc0c>] class_process_config+0x65c/0x2800 [obdclass]
[255143.185860]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.187361]  [<ffffffffa030ff76>] class_manual_cleanup+0x1c6/0x6d0 [obdclass]
[255143.190003]  [<ffffffffa033f64e>] server_put_super+0x8ae/0xca0 [obdclass]
[255143.201249]  [<ffffffffa03435f3>] server_fill_super+0xdf3/0x1890 [obdclass]
[255143.202612]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.204181]  [<ffffffffa031a618>] lustre_fill_super+0x3d8/0x8c0 [obdclass]
[255143.217626]  [<ffffffffa031a240>] ? lustre_common_put_super+0xb00/0xb00 [obdclass]
[255143.220239]  [<ffffffff8123a47d>] mount_nodev+0x4d/0xb0
[255143.221690]  [<ffffffffa0312968>] lustre_mount+0x38/0x60 [obdclass]
[255143.223106]  [<ffffffff8123aff9>] mount_fs+0x39/0x1b0
[255143.224274]  [<ffffffff81258b27>] vfs_kern_mount+0x67/0x110
[255143.225491]  [<ffffffff8125b89f>] do_mount+0x1ef/0xce0
[255143.226631]  [<ffffffff8123329e>] ? __check_object_size+0x1ce/0x230
[255143.227841]  [<ffffffff8125c6d3>] SyS_mount+0x83/0xd0
[255143.229098]  [<ffffffff817c4e15>] system_call_fastpath+0x1c/0x21
[255143.235390]  [<ffffffff817c4d61>] ? system_call_after_swapgs+0xae/0x146
[255143.237658] BUG: scheduling while atomic: mount.lustre/26477/0x10000002
[255143.239738] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) crc_t10dif crct10dif_generic crct10dif_common virtio_balloon i2c_piix4 virtio_console pcspkr ip_tables rpcsec_gss_krb5 ata_generic pata_acpi drm_kms_helper ttm drm ata_piix drm_panel_orientation_quirks libata serio_raw virtio_blk i2c_core floppy [last unloaded: libcfs]
[255143.251492] CPU: 12 PID: 26477 Comm: mount.lustre Kdump: loaded Tainted: P           OE  ------------   3.10.0-7.6-debug #1
[255143.285785] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[255143.287551] Call Trace:
[255143.288588]  [<ffffffff817afbf2>] dump_stack+0x19/0x1b
[255143.289785]  [<ffffffff817aa97c>] __schedule_bug+0x64/0x72
[255143.290938]  [<ffffffff817b71ef>] __schedule+0x92f/0xa00
[255143.292521]  [<ffffffff817c4e15>] ? system_call_fastpath+0x1c/0x21
[255143.293586]  [<ffffffff810c72d6>] __cond_resched+0x26/0x30
[255143.294607]  [<ffffffff817b759a>] _cond_resched+0x3a/0x50
[255143.295730]  [<ffffffff817b6465>] down_write+0x25/0x50
[255143.296964]  [<ffffffffa11d5037>] osp_invalidate+0x177/0x210 [osp]
[255143.304211]  [<ffffffffa11ea8a3>] osp_trans_stop_cb+0x133/0x180 [osp]
[255143.307398]  [<ffffffffa11ed647>] osp_trans_callback+0xa7/0xc0 [osp]
[255143.309963]  [<ffffffffa11cb2c8>] osp_update_fini+0xc8/0x280 [osp]
[255143.312548]  [<ffffffff810b6050>] ? wake_up_atomic_t+0x30/0x30
[255143.313819]  [<ffffffffa11cb6e2>] osp_process_config+0x262/0x560 [osp]
[255143.315161]  [<ffffffffa112b248>] lod_sub_process_config+0xe8/0x1e0 [lod]
[255143.316068]  [<ffffffffa1132650>] lod_process_config+0x4c0/0x1420 [lod]
[255143.316891]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.317854]  [<ffffffffa0ff4f68>] mdd_process_config+0x88/0x5d0 [mdd]
[255143.319268]  [<ffffffffa105f29f>] mdt_device_fini+0x2df/0xfc0 [mdt]
[255143.320614]  [<ffffffffa030c93c>] class_cleanup+0x55c/0xbb0 [obdclass]
[255143.321904]  [<ffffffffa030dc0c>] class_process_config+0x65c/0x2800 [obdclass]
[255143.324469]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.325593]  [<ffffffffa030ff76>] class_manual_cleanup+0x1c6/0x6d0 [obdclass]
[255143.329255]  [<ffffffffa033f64e>] server_put_super+0x8ae/0xca0 [obdclass]
[255143.330410]  [<ffffffffa03435f3>] server_fill_super+0xdf3/0x1890 [obdclass]
[255143.331512]  [<ffffffffa0167fa7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[255143.336940]  [<ffffffffa031a618>] lustre_fill_super+0x3d8/0x8c0 [obdclass]
[255143.338093]  [<ffffffffa031a240>] ? lustre_common_put_super+0xb00/0xb00 [obdclass]
[255143.340163]  [<ffffffff8123a47d>] mount_nodev+0x4d/0xb0
[255143.341184]  [<ffffffffa0312968>] lustre_mount+0x38/0x60 [obdclass]
[255143.342194]  [<ffffffff8123aff9>] mount_fs+0x39/0x1b0
[255143.343295]  [<ffffffff81258b27>] vfs_kern_mount+0x67/0x110
[255143.344256]  [<ffffffff8125b89f>] do_mount+0x1ef/0xce0
[255143.345222]  [<ffffffff8123329e>] ? __check_object_size+0x1ce/0x230
[255143.346646]  [<ffffffff8125c6d3>] SyS_mount+0x83/0xd0
[255143.349227]  [<ffffffff817c4e15>] system_call_fastpath+0x1c/0x21
[255143.350255]  [<ffffffff817c4d61>] ? system_call_after_swapgs+0xae/0x146
[255143.630623] LustreError: 26477:0:(lu_object.c:425:lu_site_purge_objects()) ASSERTION( atomic_read(&h->loh_ref) == 0 ) failed: 
[255143.646185] LustreError: 26477:0:(lu_object.c:425:lu_site_purge_objects()) LBUG
[255143.648524] Pid: 26477, comm: mount.lustre 3.10.0-7.6-debug #1 SMP Wed Nov 7 21:55:08 EST 2018
[255143.650717] Call Trace:
[255143.651878]  [<ffffffffa01617dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[255143.653226]  [<ffffffffa016188c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[255143.654447]  [<ffffffffa031dc01>] lu_site_purge_objects+0x511/0x530 [obdclass]
[255143.657119]  [<ffffffffa105e414>] mdt_stack_fini+0x94/0xc40 [mdt]
[255143.658583]  [<ffffffffa105f5df>] mdt_device_fini+0x61f/0xfc0 [mdt]
[255143.659820]  [<ffffffffa030c93c>] class_cleanup+0x55c/0xbb0 [obdclass]
[255143.660985]  [<ffffffffa030dc0c>] class_process_config+0x65c/0x2800 [obdclass]
[255143.663325]  [<ffffffffa030ff76>] class_manual_cleanup+0x1c6/0x6d0 [obdclass]
[255143.665501]  [<ffffffffa033f64e>] server_put_super+0x8ae/0xca0 [obdclass]
[255143.667083]  [<ffffffffa03435f3>] server_fill_super+0xdf3/0x1890 [obdclass]
[255143.668455]  [<ffffffffa031a618>] lustre_fill_super+0x3d8/0x8c0 [obdclass]
[255143.670448]  [<ffffffff8123a47d>] mount_nodev+0x4d/0xb0
[255143.671672]  [<ffffffffa0312968>] lustre_mount+0x38/0x60 [obdclass]
[255143.672941]  [<ffffffff8123aff9>] mount_fs+0x39/0x1b0
[255143.678252]  [<ffffffff81258b27>] vfs_kern_mount+0x67/0x110
[255143.679459]  [<ffffffff8125b89f>] do_mount+0x1ef/0xce0
[255143.680549]  [<ffffffff8125c6d3>] SyS_mount+0x83/0xd0
[255143.681737]  [<ffffffff817c4e15>] system_call_fastpath+0x1c/0x21
[255143.683097]  [<ffffffffffffffff>] 0xffffffffffffffff
[255143.684381] Kernel panic - not syncing: LBUG


Comments
Comment by Andreas Dilger [ 02/May/22 ]

No new reports in a few years.
