[LU-11204] mdt_reint_unlink->lu_object_put() crash Created: 03/Aug/18  Updated: 21/Nov/19  Resolved: 07/Jun/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Lustre 2.13.0, Lustre 2.12.4

Type: Bug Priority: Minor
Reporter: Oleg Drokin Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-12741 crash in osd_object_delete at end of ... Resolved
Related
is related to LU-9942 Use after free in mdt_mfd_close->lu_o... Open
Rank (Obsolete): 9223372036854775807

 Description   

Seeing these for some time in my testing now, in racer:

[48792.659356] BUG: unable to handle kernel paging request at ffff88008278be60
[48792.659356] IP: [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.659356] PGD 23e3067 PUD 33fa01067 PMD 33f9ed067 PTE 800000008278b060
[48792.659356] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[48792.659356] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common ata_generic ttm pata_acpi drm_kms_helper i2c_piix4 ata_piix drm virtio_balloon pcspkr serio_raw virtio_console virtio_blk i2c_core libata floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[48792.686829] CPU: 1 PID: 21888 Comm: mdt00_002 Kdump: loaded Tainted: P           OE  ------------   3.10.0-7.5-debug #1
[48792.686829] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[48792.686829] task: ffff88009d644c80 ti: ffff8800b93ac000 task.ti: ffff8800b93ac000
[48792.686829] RIP: 0010:[<ffffffffa034f110>]  [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.686829] RSP: 0018:ffff8800b93afb38  EFLAGS: 00010246
[48792.686829] RAX: 0000000000000000 RBX: ffff88030ef74160 RCX: 0000000000000002
[48792.686829] RDX: 0000000000000002 RSI: ffffc90007768000 RDI: ffff88008278be68
[48792.686829] RBP: ffff8800b93afb88 R08: 00000000000000cc R09: 000000000000004f
[48792.686829] R10: 0000000000000b01 R11: 00000000003fffff R12: ffff880291d79600
[48792.686829] R13: ffff88008278bea0 R14: ffff88008278be50 R15: ffffc900077a8028
[48792.686829] FS:  0000000000000000(0000) GS:ffff88033da40000(0000) knlGS:0000000000000000
[48792.686829] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[48792.686829] CR2: ffff88008278be60 CR3: 000000024c172000 CR4: 00000000000006e0
[48792.686829] Call Trace:
[48792.686829]  [<ffffffffa0cbbb13>] mdt_reint_unlink+0x7c3/0x1410 [mdt]
[48792.686829]  [<ffffffffa0cbfc10>] mdt_reint_rec+0x80/0x210 [mdt]
[48792.686829]  [<ffffffffa0c9f6ab>] mdt_reint_internal+0x5fb/0x990 [mdt]
[48792.686829]  [<ffffffffa0caa4a7>] mdt_reint+0x67/0x140 [mdt]
[48792.686829]  [<ffffffffa05eca55>] tgt_request_handle+0xaf5/0x1590 [ptlrpc]
[48792.686829]  [<ffffffffa01eaf97>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[48792.686829]  [<ffffffffa0590eb6>] ptlrpc_server_handle_request+0x256/0xad0 [ptlrpc]
[48792.686829]  [<ffffffff810b9398>] ? __wake_up_common+0x58/0x90
[48792.686829]  [<ffffffff813ccd2b>] ? do_raw_spin_unlock+0x4b/0x90
[48792.686829]  [<ffffffffa0594cae>] ptlrpc_main+0xabe/0x1f80 [ptlrpc]
[48792.686829]  [<ffffffffa05941f0>] ? ptlrpc_register_service+0xeb0/0xeb0 [ptlrpc]
[48792.686829]  [<ffffffff810ae864>] kthread+0xe4/0xf0
[48792.686829]  [<ffffffff810ae780>] ? kthread_create_on_node+0x140/0x140
[48792.686829]  [<ffffffff81783777>] ret_from_fork_nospec_begin+0x21/0x21
[48792.686829]  [<ffffffff810ae780>] ? kthread_create_on_node+0x140/0x140
[48792.686829] Code: a0 31 c0 e8 53 be e9 ff 0f 1f 00 48 8b 03 be 01 00 00 00 48 8b 7d c0 48 8b 40 20 ff 50 18 e9 5a fe ff ff 0f 1f 84 00 00 00 00 00 <49> 8b 46 10 a8 01 0f 84 46 fe ff ff 48 8b 7d b0 31 c9 31 d2 be 
[48792.686829] RIP  [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.686829]  RSP <ffff8800b93afb38>
[48792.686829] CR2: ffff88008278be60


 Comments   
Comment by Andreas Dilger [ 03/Aug/18 ]

Can you use GDB to decode the line number and structure pointer to see exactly where it is crashing?

Comment by Andreas Dilger [ 03/Aug/18 ]

The last change to that part of the code is:

commit 478be95b8d938498ccf03920f934a0d49fe5dc6b
Author:     NeilBrown <neilb@suse.com>
AuthorDate: Tue May 8 22:46:29 2018 -0400

    LU-4423 obd: backport of lu_object changes upstream
    
    fold lu_object_new() into lu_object_find_at()
    
    lu_object_new() duplicates a lot of code that is in
    lu_object_find_at().
    There is no real need for a separate function, it is simpler just
    to skip the bits of lu_object_find_at() that we don't
    want in the LOC_F_NEW case.
    
    Linux-commit: 775c4dc274343e5e2959fa1171baf2fc01028840
    
    discard extra lru count.
    
    lu_object maintains 2 lru counts.
    One is a per-bucket lsb_lru_len.
    The other is the per-cpu ls_lru_len_counter.
    
    The only times the per-bucket counters are use are:
     - a debug message when an object is added
     - in lu_site_stats_get when all the counters are combined.
    
    The debug message is not essential, and the per-cpu counter
    can be used to get the combined total.
    
    So discard the per-bucket lsb_lru_len.
    Change-Id: I26203f331a0c73ae4e23878eb10b15d9fcf546c5
    Signed-off-by: NeilBrown <neilb@suse.com>
    Signed-off-by: James Simmons <uja.ornl@yahoo.com>
    Reviewed-on: https://review.whamcloud.com/32325

Comment by Oleg Drokin [ 04/Aug/18 ]

(gdb) l *(lu_object_put+0x275)
0x4f145 is in lu_object_put (/home/green/git/lustre-release/lustre/obdclass/lu_object.c:164).
159	
160		cfs_hash_bd_get(site->ls_obj_hash, &top->loh_fid, &bd);
161		bkt = cfs_hash_bd_extra_get(site->ls_obj_hash, &bd);
162	
163		if (!cfs_hash_bd_dec_and_lock(site->ls_obj_hash, &bd, &top->loh_ref)) {
164			if (lu_object_is_dying(top)) {
165				/*

(0x270 is some sort of a test_bit)

so it's a bit hard to know exactly where it crashed, I guess. Note that we only recently started doing real multi-mountpoint racer testing, after a test script fix from John, so this is not necessarily a brand-new regression; it may just have been exposed only recently.
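
As a cross-check on the test_bit guess, the faulting instruction bytes in the Code: line can be decoded by hand. The marked bytes `<49> 8b 46 10` encode `mov 0x10(%r14),%rax`, and the following `a8 01` is `test $0x1,%al`, i.e. a load of one word followed by a test of its bit 0. A minimal sketch of the address arithmetic, using the R14 and CR2 values from the oops (the helper names are illustrative and are not Lustre functions):

```c
#include <assert.h>
#include <stdint.h>

/* Register and fault values copied from the oops above. */
#define OOPS_R14 0xffff88008278be50ULL
#define OOPS_CR2 0xffff88008278be60ULL

/* disp8 byte of the faulting instruction "49 8b 46 10". */
#define FAULT_DISP 0x10ULL

/* Effective address of a base + displacement memory operand. */
uint64_t effective_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}

/* Returns 1 when the decoded operand address equals the faulting address. */
int fault_matches_operand(void)
{
	return effective_address(OOPS_R14, FAULT_DISP) == OOPS_CR2;
}
```

The match suggests that R14 pointed into memory that was no longer mapped and that the fault came from reading a flags-like word at offset 0x10, which would be consistent with the test_bit() inside lu_object_is_dying(). The structure layout is not shown in this ticket, so that last step is an assumption.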

Comment by Oleg Drokin [ 04/Aug/18 ]

ok, so it is line 163:

    /home/green/bk/linux-3.10.0-862.3.2.el7-debug/./arch/x86/include/asm/bitops.h: 319
#11 [ffff8800b93afb30] lu_object_put at ffffffffa034efe8 [obdclass]
    /home/green/git/lustre-release/lustre/obdclass/lu_object.c: 163
#12 [ffff8800b93afb90] mdt_reint_unlink at ffffffffa0cbbb13 [mdt]
    /home/green/git/lustre-release/libcfs/include/libcfs/libcfs_debug.h: 146
#13 [ffff8800b93afc10] mdt_reint_rec at ffffffffa0cbfc10 [mdt]
    /home/green/git/lustre-release/lustre/mdt/mdt_reint.c: 2375
#14 [ffff8800b93afc38] mdt_reint_internal at ffffffffa0c9f6ab [mdt]
    /home/green/git/lustre-release/libcfs/include/libcfs/libcfs_debug.h: 146

but because it's a macro, it's a bit harder to know exactly where it hit.

Comment by Mikhail Pershin [ 04/May/19 ]

it looks similar to LU-9942

Comment by Mikhail Pershin [ 04/May/19 ]

and the even older LU-9419

Comment by Mikhail Pershin [ 04/May/19 ]

Each ticket has a trace ending at this line in lu_object_put():

if (!cfs_hash_bd_dec_and_lock(site->ls_obj_hash, &bd, &top->loh_ref)) {
--->		if (lu_object_is_dying(top)) {

This path is the early exit taken when the loh_ref being dropped is not the last reference; at the same time, it looks like top has already been destroyed by the time of the check.

Comment by Mikhail Pershin [ 26/May/19 ]

The cause is that top is accessed after the atomic_dec_and_lock() call. At that point the caller has dropped its own reference, so top is no longer protected and can be freed by another thread. The issue is seen mostly on onyx-68, with many virtual machines running on the same node.
The solution can be as simple as reading the lu_object_is_dying() value before the loh_ref decrement. Moreover, I am not sure we need this whole block of code:

		if (lu_object_is_dying(top)) {
			/*
			 * somebody may be waiting for this, currently only
			 * used for cl_object, see cl_object_put_last().
			 */
			wake_up_all(&bkt->lsb_marche_funebre);
		}

It comes from bz22520 (https://bugzilla.lustre.org/show_bug.cgi?id=22520), and it is worth reviewing how things work now and whether that wake_up() in lu_object_put() is really needed on every put.
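
The fix direction described in this comment (sample the dying state before dropping the reference) can be sketched in a simplified model. This is not the Lustre code or the merged patch: struct toy_header, toy_alloc(), toy_put() and toy_wakeups are hypothetical stand-ins for lu_object_header, object creation, lu_object_put() and wake_up_all(), and the hash-bucket locking is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for lu_object_header; not the Lustre structure. */
struct toy_header {
	atomic_int refs;	/* plays the role of loh_ref */
	bool dying;		/* plays the role of the "dying" flag */
};

/* Counts simulated wakeups; stands in for wake_up_all(). */
int toy_wakeups;

/* Allocate a header with the given reference count and dying state. */
struct toy_header *toy_alloc(int refs, bool dying)
{
	struct toy_header *h = malloc(sizeof(*h));

	assert(h != NULL);
	atomic_init(&h->refs, refs);
	h->dying = dying;
	return h;
}

/*
 * Drop one reference. Returns true if this call released the last
 * reference and freed the object, false otherwise.
 */
bool toy_put(struct toy_header *h)
{
	/* Read the flag while our own reference still pins the object. */
	bool dying = h->dying;

	if (atomic_fetch_sub(&h->refs, 1) > 1) {
		/*
		 * Not the last reference: another holder may free h at any
		 * moment from here on, so only the local copy is used.
		 */
		if (dying)
			toy_wakeups++;
		return false;
	}

	/* Last reference: this caller owns the object and may free it. */
	free(h);
	return true;
}
```

The point of the sketch is only the ordering: after the decrement on the non-final path, h is never dereferenced again. Whether the wakeup is needed at all is the separate question raised above.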

Comment by Gerrit Updater [ 26/May/19 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34960
Subject: LU-11204 obdclass: remove unprotected access to lu_object
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 98c1b95c49b79509c7d31f2cdebdc46eda54a8b4

Comment by Gerrit Updater [ 26/May/19 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34961
Subject: LU-11204 obdclass: remove unprotected access to lu_object
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7992d3d38e148f0f9c60a750ba5355413e8b1407

Comment by Mikhail Pershin [ 26/May/19 ]

I've pushed two patches. The first is a simple one that prevents the use-after-free access by using a local variable; the second is a fortestonly patch to check whether cl_object_put_last() is still needed. At a quick glance, the conditions described in bz22520 don't exist in the current code, so the whole bz22520 fix might not be needed.

Comment by Gerrit Updater [ 07/Jun/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34960/
Subject: LU-11204 obdclass: remove unprotected access to lu_object
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 336cf0f2f3a9ce5b11a34aeaeec062a5d5144213

Comment by Peter Jones [ 07/Jun/19 ]

So what's the verdict from https://review.whamcloud.com/#/c/34961/ ? Is further work needed or can this ticket be marked as RESOLVED?

Comment by Mikhail Pershin [ 07/Jun/19 ]

That was an alternative approach; I've abandoned it.

Comment by Peter Jones [ 07/Jun/19 ]

ok. Should we consider this fix for b2_12?

Comment by Gerrit Updater [ 17/Sep/19 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36217
Subject: LU-11204 obdclass: remove unprotected access to lu_object
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 7db34b95768d7f2df3aa110275ea26d345431852

Comment by Gerrit Updater [ 21/Nov/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36217/
Subject: LU-11204 obdclass: remove unprotected access to lu_object
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: e548e31f3feac2831868fe01cc75bf111cf8f501

Generated at Sat Feb 10 02:41:51 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.