[LU-10450] NULL pointer deref in mdd_changelog_data_store_by_fid+0xfa Created: 03/Jan/18  Updated: 04/Jan/18  Resolved: 04/Jan/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: Lustre 2.11.0

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: Sebastien Buisson (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-10454 mdd: NULL pointer dereference in mdd_... Resolved
Related
is related to LU-9727 Lustre Audit with Changelogs Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This seems to have been introduced by the LU-9727 patch, https://review.whamcloud.com/28114

[75525.109249] Lustre: DEBUG MARKER: == sanity test 232a: failed lock should not block umount ============================================= 21:15:03 (1514081703)
[75525.200932] Lustre: *** cfs_fail_loc=31c, val=0***
[75525.201581] LustreError: 11-0: lustre-OST0000-osc-ffff88029158c800: operation ldlm_enqueue to node 0@lo failed: rc = -12
[75526.044646] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[75526.046450] IP: [<ffffffffa124d9ba>] mdd_changelog_data_store_by_fid+0xfa/0x1c0 [mdd]
[75526.047346] PGD 0 
[75526.047757] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[75526.048211] Modules linked in: brd lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea ata_generic sysfillrect pata_acpi sysimgblt ttm drm_kms_helper ata_piix i2c_piix4 drm virtio_balloon virtio_console pcspkr i2c_core libata serio_raw virtio_blk floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[75526.053084] CPU: 0 PID: 19445 Comm: mdt00_001 Tainted: P           OE  ------------   3.10.0-debug #2
[75526.053959] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[75526.054557] task: ffff8800a0182600 ti: ffff8802d4fc8000 task.ti: ffff8802d4fc8000
[75526.055830] RIP: 0010:[<ffffffffa124d9ba>]  [<ffffffffa124d9ba>] mdd_changelog_data_store_by_fid+0xfa/0x1c0 [mdd]
[75526.057220] RSP: 0018:ffff8802d4fcbaa0  EFLAGS: 00010246
[75526.057688] RAX: 0000000000000040 RBX: ffff8802d4fcbbf0 RCX: 0000000000000060
[75526.058159] RDX: 0000000000000042 RSI: 0000000000000001 RDI: ffff8802ac9b9f90
[75526.058727] RBP: ffff8802d4fcbae8 R08: ffff88024fe15dc8 R09: ffff8800bb225480
[75526.059197] R10: 0000000000009042 R11: 0000000000000000 R12: ffff8802ac9b9f80
[75526.061825] R13: 0000000000000000 R14: ffff8802ac9b9f90 R15: 0000000000000000
[75526.062445] FS:  0000000000000000(0000) GS:ffff88033e400000(0000) knlGS:0000000000000000
[75526.063454] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[75526.063987] CR2: 0000000000000018 CR3: 00000002a55f6000 CR4: 00000000000006f0
[75526.064553] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[75526.065395] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[75526.066085] Stack:
[75526.066502]  ffff8800bb225480 0000000b00000000 ffff88025241fc00 0000007000009042
[75526.067576]  ffff880304158fa0 ffff8802d4fcbbf0 ffff8800bb225480 0000000000000000
[75526.068786]  0000000000000000 ffff8802d4fcbb08 ffffffffa124ea80 ffff880304158fa0
[75526.069698] Call Trace:
[75526.070168]  [<ffffffffa124ea80>] mdd_changelog_data_store+0xf0/0x220 [mdd]
[75526.070687]  [<ffffffffa124f95b>] mdd_close+0x25b/0xcf0 [mdd]
[75526.071207]  [<ffffffffa12c1b58>] mdt_mfd_close+0x478/0x730 [mdt]
[75526.071709]  [<ffffffffa12904a1>] mdt_obd_disconnect+0x371/0x680 [mdt]
[75526.072335]  [<ffffffffa05c024f>] target_handle_disconnect+0x13f/0x4c0 [ptlrpc]
[75526.073287]  [<ffffffffa065c817>] tgt_disconnect+0x37/0x140 [ptlrpc]
[75526.073869]  [<ffffffffa06651ab>] tgt_request_handle+0x93b/0x13e0 [ptlrpc]
[75526.074395]  [<ffffffffa060a141>] ptlrpc_server_handle_request+0x261/0xaf0 [ptlrpc]
[75526.075333]  [<ffffffffa060def8>] ptlrpc_main+0xa58/0x1df0 [ptlrpc]
[75526.075895]  [<ffffffffa060d4a0>] ? ptlrpc_register_service+0xeb0/0xeb0 [ptlrpc]
[75526.076806]  [<ffffffff810a2eba>] kthread+0xea/0xf0
[75526.077250]  [<ffffffff810a2dd0>] ? kthread_create_on_node+0x140/0x140
[75526.077772]  [<ffffffff8170fb98>] ret_from_fork+0x58/0x90
[75526.078239]  [<ffffffff810a2dd0>] ? kthread_create_on_node+0x140/0x140
[75526.078744] Code: 56 08 4d 8d 74 24 10 49 89 44 24 30 31 c0 45 85 ff 49 89 54 24 38 66 41 89 44 24 10 75 53 be 01 00 00 00 4c 89 f7 e8 76 23 ff ff <41> 8b 55 18 41 8b 75 14 4c 89 f7 e8 a6 23 ff ff 48 8b 0c 24 48 
[75526.080697] RIP  [<ffffffffa124d9ba>] mdd_changelog_data_store_by_fid+0xfa/0x1c0 [mdd]
[75526.081632]  RSP <ffff8802d4fcbaa0>
[75526.082083] CR2: 0000000000000018
(gdb) l *(mdd_changelog_data_store_by_fid+0xfa)
0x219ea is in mdd_changelog_data_store_by_fid (/home/green/git/lustre-release/lustre/mdd/mdd_object.c:675).
670			mdd_changelog_rec_ext_jobid(&rec->cr, uc->uc_jobid);
671
672		if (flags & CLF_EXTRA_FLAGS) {
673			mdd_changelog_rec_ext_extra_flags(&rec->cr, xflags);
674			if (xflags & CLFE_UIDGID)
675				mdd_changelog_rec_extra_uidgid(&rec->cr,
676							       uc->uc_uid, uc->uc_gid);
677		}
678
679		rc = mdd_changelog_store(env, mdd, rec, handle);
(gdb) quit
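
A note on reading the oops against this listing: the faulting bytes in the Code: line, `<41> 8b 55 18`, decode to a 32-bit load from 0x18(%r13), and the following `41 8b 75 14` is a load from 0x14(%r13). R13 is 0 in the register dump and CR2 is 0x18, which is what a NULL uc would produce if uc_uid and uc_gid sit at offsets 0x14 and 0x18. The sketch below only illustrates that arithmetic. The struct is a hypothetical stand-in (five 32-bit fields assumed to precede uc_uid), not the real Lustre definition, and `sk_fault_address` is an invented helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the start of the credential structure.
 * ASSUMPTION: five 32-bit fields precede uc_uid, chosen so that the
 * member offsets match the displacements seen in the Code: bytes.
 */
struct sk_ucred_prefix {
	uint32_t sk_leading[5];	/* assumed leading fields, 0x00..0x13 */
	uint32_t uc_uid;	/* offset 0x14 under this assumption */
	uint32_t uc_gid;	/* offset 0x18 under this assumption */
};

/* Address the CPU touches when loading a member through a base pointer. */
static uintptr_t sk_fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}
```

With a base of 0, `sk_fault_address(0, offsetof(struct sk_ucred_prefix, uc_gid))` gives 0x18, matching CR2 in the trace.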

If we look at this function, we can see this bit of code at the start:

        int xflags = CLFE_INVALID;
...
        flags = (flags & CLF_FLAGMASK) | CLF_VERSION | CLF_EXTRA_FLAGS;
        if (uc != NULL && uc->uc_jobid[0] != '\0')
                flags |= CLF_JOBID;

        xflags |= CLFE_UIDGID;

It looks like we really need to move that xflags assignment under a uc != NULL check: with CLFE_UIDGID set unconditionally, line 675 dereferences uc even when it is NULL.

Also, should we really be ORing the new flags onto CLFE_INVALID, or should that just become a plain assignment?
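
A minimal sketch of the guard being suggested, written as standalone C so it can be compiled outside the kernel tree. Everything prefixed `sk_`/`SK_` is a hypothetical stand-in: the flag values, the record layout and the credential struct are invented for illustration and are not the Lustre definitions, and this is not necessarily the change that was eventually landed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
enum { SK_CLF_EXTRA_FLAGS = 0x1000, SK_CLF_JOBID = 0x4000 };
enum { SK_CLFE_UIDGID = 0x0001 };

/* Hypothetical stand-in for the user credentials (may be absent). */
struct sk_ucred {
	uint32_t uc_uid;
	uint32_t uc_gid;
	char	 uc_jobid[32];
};

/* Hypothetical stand-in for the changelog record being filled in. */
struct sk_rec {
	uint32_t flags;
	uint32_t xflags;
	uint32_t uid;
	uint32_t gid;
};

static void sk_fill_rec(struct sk_rec *rec, const struct sk_ucred *uc,
			uint32_t flags)
{
	uint32_t xflags = 0;	/* start from "no extra flags" */

	assert(rec != NULL);

	flags |= SK_CLF_EXTRA_FLAGS;
	if (uc != NULL) {
		if (uc->uc_jobid[0] != '\0')
			flags |= SK_CLF_JOBID;
		/* Plain assignment, and only when credentials exist. */
		xflags = SK_CLFE_UIDGID;
	}

	rec->flags = flags;
	rec->xflags = xflags;
	rec->uid = 0;
	rec->gid = 0;

	/* uc is only dereferenced when the flag proves it is non-NULL. */
	if (xflags & SK_CLFE_UIDGID) {
		rec->uid = uc->uc_uid;
		rec->gid = uc->uc_gid;
	}
}
```

Called with a NULL `uc`, `sk_fill_rec` leaves the uid/gid flag clear and never touches the pointer. With credentials present, it records uid and gid as before.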



 Comments   
Comment by James A Simmons [ 03/Jan/18 ]

Yep. I just saw it in my testing as well.

Comment by Peter Jones [ 03/Jan/18 ]

Sebastien

Could you please investigate?

Peter

Generated at Sat Feb 10 02:35:12 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.