[LU-11204] mdt_reint_unlink->lu_object_put() crash Created: 03/Aug/18 Updated: 21/Nov/19 Resolved: 07/Jun/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Oleg Drokin | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||
| Description |
|
Seeing these for some time in my testing now, in racer:
[48792.659356] BUG: unable to handle kernel paging request at ffff88008278be60
[48792.659356] IP: [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.659356] PGD 23e3067 PUD 33fa01067 PMD 33f9ed067 PTE 800000008278b060
[48792.659356] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[48792.659356] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common ata_generic ttm pata_acpi drm_kms_helper i2c_piix4 ata_piix drm virtio_balloon pcspkr serio_raw virtio_console virtio_blk i2c_core libata floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[48792.686829] CPU: 1 PID: 21888 Comm: mdt00_002 Kdump: loaded Tainted: P OE ------------ 3.10.0-7.5-debug #1
[48792.686829] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[48792.686829] task: ffff88009d644c80 ti: ffff8800b93ac000 task.ti: ffff8800b93ac000
[48792.686829] RIP: 0010:[<ffffffffa034f110>] [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.686829] RSP: 0018:ffff8800b93afb38 EFLAGS: 00010246
[48792.686829] RAX: 0000000000000000 RBX: ffff88030ef74160 RCX: 0000000000000002
[48792.686829] RDX: 0000000000000002 RSI: ffffc90007768000 RDI: ffff88008278be68
[48792.686829] RBP: ffff8800b93afb88 R08: 00000000000000cc R09: 000000000000004f
[48792.686829] R10: 0000000000000b01 R11: 00000000003fffff R12: ffff880291d79600
[48792.686829] R13: ffff88008278bea0 R14: ffff88008278be50 R15: ffffc900077a8028
[48792.686829] FS: 0000000000000000(0000) GS:ffff88033da40000(0000) knlGS:0000000000000000
[48792.686829] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[48792.686829] CR2: ffff88008278be60 CR3: 000000024c172000 CR4: 00000000000006e0
[48792.686829] Call Trace:
[48792.686829] [<ffffffffa0cbbb13>] mdt_reint_unlink+0x7c3/0x1410 [mdt]
[48792.686829] [<ffffffffa0cbfc10>] mdt_reint_rec+0x80/0x210 [mdt]
[48792.686829] [<ffffffffa0c9f6ab>] mdt_reint_internal+0x5fb/0x990 [mdt]
[48792.686829] [<ffffffffa0caa4a7>] mdt_reint+0x67/0x140 [mdt]
[48792.686829] [<ffffffffa05eca55>] tgt_request_handle+0xaf5/0x1590 [ptlrpc]
[48792.686829] [<ffffffffa01eaf97>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[48792.686829] [<ffffffffa0590eb6>] ptlrpc_server_handle_request+0x256/0xad0 [ptlrpc]
[48792.686829] [<ffffffff810b9398>] ? __wake_up_common+0x58/0x90
[48792.686829] [<ffffffff813ccd2b>] ? do_raw_spin_unlock+0x4b/0x90
[48792.686829] [<ffffffffa0594cae>] ptlrpc_main+0xabe/0x1f80 [ptlrpc]
[48792.686829] [<ffffffffa05941f0>] ? ptlrpc_register_service+0xeb0/0xeb0 [ptlrpc]
[48792.686829] [<ffffffff810ae864>] kthread+0xe4/0xf0
[48792.686829] [<ffffffff810ae780>] ? kthread_create_on_node+0x140/0x140
[48792.686829] [<ffffffff81783777>] ret_from_fork_nospec_begin+0x21/0x21
[48792.686829] [<ffffffff810ae780>] ? kthread_create_on_node+0x140/0x140
[48792.686829] Code: a0 31 c0 e8 53 be e9 ff 0f 1f 00 48 8b 03 be 01 00 00 00 48 8b 7d c0 48 8b 40 20 ff 50 18 e9 5a fe ff ff 0f 1f 84 00 00 00 00 00 <49> 8b 46 10 a8 01 0f 84 46 fe ff ff 48 8b 7d b0 31 c9 31 d2 be
[48792.686829] RIP [<ffffffffa034f110>] lu_object_put+0x270/0x3c0 [obdclass]
[48792.686829] RSP <ffff8800b93afb38>
[48792.686829] CR2: ffff88008278be60 |
| Comments |
| Comment by Andreas Dilger [ 03/Aug/18 ] |
|
Can you use GDB to decode the line number and structure pointer to see exactly where it is crashing? |
| Comment by Andreas Dilger [ 03/Aug/18 ] |
|
The last change to that part of the code is:
commit 478be95b8d938498ccf03920f934a0d49fe5dc6b
Author: NeilBrown <neilb@suse.com>
AuthorDate: Tue May 8 22:46:29 2018 -0400
LU-4423 obd: backport of lu_object changes upstream
fold lu_object_new() into lu_object_find_at()
lu_object_new() duplicates a lot of code that is in
lu_object_find_at().
There is no real need for a separate function, it is simpler just
to skip the bits of lu_object_find_at() that we don't
want in the LOC_F_NEW case.
Linux-commit: 775c4dc274343e5e2959fa1171baf2fc01028840
discard extra lru count.
lu_object maintains 2 lru counts.
One is a per-bucket lsb_lru_len.
The other is the per-cpu ls_lru_len_counter.
The only times the per-bucket counters are use are:
- a debug message when an object is added
- in lu_site_stats_get when all the counters are combined.
The debug message is not essential, and the per-cpu counter
can be used to get the combined total.
So discard the per-bucket lsb_lru_len.
Change-Id: I26203f331a0c73ae4e23878eb10b15d9fcf546c5
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32325
|
| Comment by Oleg Drokin [ 04/Aug/18 ] |
(gdb) l *(lu_object_put+0x275)
0x4f145 is in lu_object_put (/home/green/git/lustre-release/lustre/obdclass/lu_object.c:164).
159
160 cfs_hash_bd_get(site->ls_obj_hash, &top->loh_fid, &bd);
161 bkt = cfs_hash_bd_extra_get(site->ls_obj_hash, &bd);
162
163 if (!cfs_hash_bd_dec_and_lock(site->ls_obj_hash, &bd, &top->loh_ref)) {
164 if (lu_object_is_dying(top)) {
165 /*
(0x270 is some sort of test_bit), so it's a bit hard to know exactly where it crashed, I guess. Note that we only recently started doing real multi-mountpoint racer testing after a test script fix from John, so this is not necessarily a brand-new regression, just possibly one that was only recently exposed. |
| Comment by Oleg Drokin [ 04/Aug/18 ] |
|
OK, so it is line 163:
/home/green/bk/linux-3.10.0-862.3.2.el7-debug/./arch/x86/include/asm/bitops.h: 319
#11 [ffff8800b93afb30] lu_object_put at ffffffffa034efe8 [obdclass]
/home/green/git/lustre-release/lustre/obdclass/lu_object.c: 163
#12 [ffff8800b93afb90] mdt_reint_unlink at ffffffffa0cbbb13 [mdt]
/home/green/git/lustre-release/libcfs/include/libcfs/libcfs_debug.h: 146
#13 [ffff8800b93afc10] mdt_reint_rec at ffffffffa0cbfc10 [mdt]
/home/green/git/lustre-release/lustre/mdt/mdt_reint.c: 2375
#14 [ffff8800b93afc38] mdt_reint_internal at ffffffffa0c9f6ab [mdt]
/home/green/git/lustre-release/libcfs/include/libcfs/libcfs_debug.h: 146
but because it's a macro, it's a bit harder to know exactly where it hit. |
| Comment by Mikhail Pershin [ 04/May/19 ] |
|
it looks similar to LU-9942 |
| Comment by Mikhail Pershin [ 04/May/19 ] |
|
and the even older LU-9419 |
| Comment by Mikhail Pershin [ 04/May/19 ] |
|
Each ticket has a trace ending at this line in lu_object_put():
if (!cfs_hash_bd_dec_and_lock(site->ls_obj_hash, &bd, &top->loh_ref)) {
---> if (lu_object_is_dying(top)) {
This code path is the early exit taken when the dropped loh_ref is not the last reference, yet it looks like top has already been destroyed by the time of the check. |
| Comment by Mikhail Pershin [ 26/May/19 ] |
|
The cause is accessing top after the atomic_dec_and_lock() call. At that point the thread has dropped its own reference, so top is no longer protected and can be freed by another thread. The issue is seen mostly on onyx-68, with many virtual machines running on the same node.
if (lu_object_is_dying(top)) {
/*
 * somebody may be waiting for this, currently only
 * used for cl_object, see cl_object_put_last().
 */
wake_up_all(&bkt->lsb_marche_funebre);
}
This code comes from bz22520 (https://bugzilla.lustre.org/show_bug.cgi?id=22520). It is worth reviewing how things work now and whether that wake_up() in lu_object_put() is really needed on every put. |
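The interleaving described in this comment can be replayed deterministically in a small userspace model. This is only an illustrative sketch, not Lustre code: `struct model_obj`, `model_put()` and `replay_race()` are hypothetical names, a plain `int` stands in for loh_ref, and a `freed` flag stands in for the real free so the replay itself stays safe to run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for lu_object_header; not the Lustre structure. */
struct model_obj {
	int  ref;    /* models loh_ref */
	bool dying;  /* models the flag read by lu_object_is_dying() */
	bool freed;  /* set where the real code would free the object */
};

/* Drop one reference; the final put "frees" the object. */
static bool model_put(struct model_obj *o)
{
	assert(o->ref > 0);
	if (--o->ref == 0) {
		o->freed = true;
		return true;   /* this was the last reference */
	}
	return false;
}

/*
 * Replay the suspected interleaving on a single thread:
 *   A drops its reference (not the last one),
 *   B drops the last reference and frees the object,
 *   A then goes on to test the dying flag.
 * Returns true if A's late read would touch freed memory.
 */
static bool replay_race(void)
{
	struct model_obj o = { .ref = 2, .dying = false, .freed = false };
	bool a_was_last = model_put(&o);
	bool b_was_last = model_put(&o);

	return !a_was_last && b_was_last && o.freed;
}
```

In this model `replay_race()` reports that the object is already gone by the time thread A would evaluate the flag, which is consistent with the paging fault on a freed address under DEBUG_PAGEALLOC.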
| Comment by Gerrit Updater [ 26/May/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34960 |
| Comment by Gerrit Updater [ 26/May/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34961 |
| Comment by Mikhail Pershin [ 26/May/19 ] |
|
I've pushed two patches. The first is a simple fix that prevents the use-after-free access by using a local variable. The second is a fortestonly patch to check whether cl_object_put_last() is still needed. At a quick look, the conditions described in bz22520 don't exist in the current code, so the whole bz22520 fix might no longer be needed. |
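A minimal userspace sketch of the local-variable approach follows, assuming the local caches the dying flag while the caller still holds its reference. The names (`struct model_obj`, `model_dec()`, `model_put_fixed()`, `wakeups`) are hypothetical and the counter merely stands in for wake_up_all(); the merged patch at https://review.whamcloud.com/34960 remains the authoritative change and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for lu_object_header; not the Lustre structure. */
struct model_obj {
	int  ref;    /* models loh_ref */
	bool dying;  /* models the flag read by lu_object_is_dying() */
	bool freed;  /* set where the real code would free the object */
};

static int wakeups;  /* counts calls that would have been wake_up_all() */

/* Models the decrement step: true only when the last reference is dropped. */
static bool model_dec(struct model_obj *o)
{
	assert(o->ref > 0);
	return --o->ref == 0;
}

static void model_put_fixed(struct model_obj *o)
{
	/* Read the flag while our own reference still pins the object. */
	bool is_dying = o->dying;

	if (!model_dec(o)) {
		/*
		 * Not the last reference: another holder may free the
		 * object at any time, so only the local copy is used here.
		 */
		if (is_dying)
			wakeups++;
		return;
	}

	/* Last reference: this caller owns teardown. */
	o->freed = true;
}
```

The design point is ordering: everything that needs the object is read before the reference is given up, and the non-final path afterwards touches only locals.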
| Comment by Gerrit Updater [ 07/Jun/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34960/ |
| Comment by Peter Jones [ 07/Jun/19 ] |
|
So what's the verdict from https://review.whamcloud.com/#/c/34961/ ? Is further work needed or can this ticket be marked as RESOLVED? |
| Comment by Mikhail Pershin [ 07/Jun/19 ] |
|
That was an alternative approach; I've abandoned it. |
| Comment by Peter Jones [ 07/Jun/19 ] |
|
ok. Should we consider this fix for b2_12? |
| Comment by Gerrit Updater [ 17/Sep/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36217 |
| Comment by Gerrit Updater [ 21/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36217/ |