[LU-11031] use after free in lmv_revalidate_slaves Created: 18/May/18 Updated: 18/May/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Oleg Drokin | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Hit this for the very first time in master-next today while running racer. master-next has nothing changing in lmv, so I think this comes from some earlier landing:
[ 1195.412560] BUG: unable to handle kernel paging request at ffff88007c5c5f84
[ 1195.413545] IP: [<ffffffffa0798a4c>] lmv_revalidate_slaves+0x41c/0xbc0 [lmv]
[ 1195.414528] PGD 2e75067 PUD 33fc02067 PMD 33fa1f067 PTE 800000007c5c5060
[ 1195.415055] LustreError: 24078:0:(llite_nfs.c:336:ll_dir_get_parent_fid()) lustre: failure inode [0x240000403:0x2da:0x0] get parent: rc = -116
[ 1195.415550] LustreError: 24082:0:(llite_nfs.c:336:ll_dir_get_parent_fid()) lustre: failure inode [0x240000403:0x2da:0x0] get parent: rc = -116
[ 1195.424874] LustreError: 24077:0:(llite_nfs.c:336:ll_dir_get_parent_fid()) lustre: failure inode [0x240000403:0x2da:0x0] get parent: rc = -116
[ 1195.431432] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 1195.437139] LustreError: 24074:0:(llite_nfs.c:336:ll_dir_get_parent_fid()) lustre: failure inode [0x240000403:0x2da:0x0] get parent: rc = -116
[ 1195.432131] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate lquota(OE) lfsck(OE) jbd2 obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) syscopyarea sysfillrect sysimgblt ttm drm_kms_helper ata_generic pata_acpi drm ata_piix i2c_piix4 virtio_blk libata i2c_core virtio_balloon pcspkr virtio_console floppy serio_raw nfsd ip_tables rpcsec_gss_krb5
[ 1195.439887] CPU: 7 PID: 24080 Comm: ls Tainted: P OE ------------ 3.10.0-debug #2
[ 1195.444117] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 1195.444922] task: ffff8802a0a166c0 ti: ffff8802fd98c000 task.ti: ffff8802fd98c000
[ 1195.446732] RIP: 0010:[<ffffffffa0798a4c>] [<ffffffffa0798a4c>] lmv_revalidate_slaves+0x41c/0xbc0 [lmv]
[ 1195.448405] RSP: 0018:ffff8802fd98f810 EFLAGS: 00010202
[ 1195.449215] RAX: 0000000000000001 RBX: ffff8802cfff6828 RCX: ffff8802a0a16f90
[ 1195.450105] RDX: 000000000000001a RSI: 0000000000000000 RDI: ffff880260574d00
[ 1195.450812] RBP: ffff8802fd98f8d8 R08: 0000000000000001 R09: 0000000000000000
[ 1195.451505] R10: 0000000000000000 R11: ffff8802a0a16fe8 R12: ffff8802560d5e00
[ 1195.452186] R13: ffff88028c01b800 R14: ffff88007c5c5f80 R15: 0000000000000000
[ 1195.452945] FS: 00007f479c0ed800(0000) GS:ffff88033e4e0000(0000) knlGS:0000000000000000
[ 1195.454182] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1195.458436] CR2: ffff88007c5c5f84 CR3: 0000000266ae5000 CR4: 00000000000006e0
[ 1195.459184] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1195.459886] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1195.460819] Stack:
[ 1195.461413] 0000000000000000 ffffffffa14a1e00 ffff8802fd98f878 0000000000000000
[ 1195.462678] ffff88008c290dc0 ffff880298d9ba00 0000000131800ec0 ffff880088885c00
[ 1195.463954] 0000000240000403 00000000000002da 0000000000000008 0000000000000000
[ 1195.465289] Call Trace:
[ 1195.466308] [<ffffffffa14a1e00>] ? ll_md_need_convert+0x160/0x160 [lustre]
[ 1195.467468] [<ffffffffa0785104>] lmv_merge_attr+0x24/0x190 [lmv]
[ 1195.468866] [<ffffffffa035a3b9>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[ 1195.469613] [<ffffffffa14924d0>] ll_update_lsm_md+0xd10/0x1210 [lustre]
[ 1195.470353] [<ffffffff811cd4f9>] ? __kmalloc+0x649/0x660
[ 1195.471015] [<ffffffffa079ace0>] ? lmv_fld_lookup+0x180/0x400 [lmv]
[ 1195.471768] [<ffffffffa1492d1b>] ll_update_inode+0x34b/0x630 [lustre]
[ 1195.472536] [<ffffffffa078683d>] ? lmv_get_lustre_md+0x7d/0x280 [lmv]
[ 1195.473215] [<ffffffffa1494e11>] ll_prep_inode+0x121/0xb70 [lustre]
[ 1195.473918] [<ffffffffa148d2a5>] ? ll_finish_md_op_data+0x55/0xd0 [lustre]
[ 1195.474633] [<ffffffffa147961b>] ll_intent_file_open+0x71b/0x800 [lustre]
[ 1195.475340] [<ffffffffa1479935>] ll_file_open+0x235/0xb30 [lustre]
[ 1195.476017] [<ffffffffa146207f>] ll_dir_open+0x2f/0xd0 [lustre]
[ 1195.476755] [<ffffffff811eadcf>] do_dentry_open+0x1af/0x330
[ 1195.477431] [<ffffffffa1462050>] ? ll_dir_release+0xd0/0xd0 [lustre]
[ 1195.478103] [<ffffffff811eb049>] vfs_open+0x39/0x70
[ 1195.478762] [<ffffffff811fcd1d>] do_last+0x1ed/0x12b0
[ 1195.479478] [<ffffffff811fdea2>] path_openat+0xc2/0x4a0
[ 1195.480186] [<ffffffff811ff69b>] do_filp_open+0x4b/0xb0
[ 1195.480930] [<ffffffff817063d7>] ? _raw_spin_unlock+0x27/0x40
[ 1195.481626] [<ffffffff8120d137>] ? __alloc_fd+0xa7/0x130
[ 1195.482287] [<ffffffff811ec553>] do_sys_open+0xf3/0x1f0
[ 1195.482961] [<ffffffff811ec684>] SyS_openat+0x14/0x20
[ 1195.483827] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
[ 1195.500671] Code: 8b 41 18 48 89 da 31 c9 48 8b 74 24 10 4c 89 ef ff 90 c8 00 00 00 8b 74 24 78 85 f6 0f 85 dd 00 00 00 83 44 24 34 01 8b 44 24 34 <41> 3b 46 04 0f 82 ca fc ff ff 48 8b 7c 24 38 48 85 ff 74 05 e8
(gdb) l *(lmv_revalidate_slaves+0x41c)
0x13a4c is in lmv_revalidate_slaves (/home/green/git/lustre-release/lustre/lmv/lmv_intent.c:177).
172
173 /**
174 * Loop over the stripe information, check validity and update them
175 * from MDS if needed.
176 */
177 for (i = 0; i < lsm->lsm_md_stripe_count; i++) {
178 struct lu_fid fid;
179 struct lookup_intent it = { .it_op = IT_GETATTR };
180 struct lustre_handle *lockh = NULL;
181 struct lmv_tgt_desc *tgt = NULL;
It looks like lsm is freed or otherwise invalid at this point. |