Details
-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
Lustre 2.13.0
-
None
-
3
Description
This seems to be another failure mode of LU-11998, since it only happens in sanity test 411.
[277274.440468] Lustre: DEBUG MARKER: == sanity test 411: Slab allocation error with cgroup does not LBUG ================================== 06:47:26 (1550922446)
[277279.569543] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277279.593403] cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
[277279.595819] node 0: slabs: 37/37, objs: 37/37, free: 0
[277315.514158] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277315.515355] cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
[277315.551203] node 0: slabs: 67/67, objs: 67/67, free: 0
[277315.581812] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277315.582961] cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
[277315.585015] node 0: slabs: 67/67, objs: 67/67, free: 0
[277315.607385] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277315.608453] cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
[277315.610066] node 0: slabs: 67/67, objs: 67/67, free: 0
[277315.641346] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277315.642642] cache: kmalloc-512(0:osc_slab_alloc), object size: 4096, order: 0
[277315.645272] node 0: slabs: 67/67, objs: 67/67, free: 0
[277321.431039] SLAB: Unable to allocate memory on node 0 (gfp=0x100050)
[277321.432413] cache: osc_session_kmem(0:osc_slab_alloc), object size: 4096, order: 0
[277321.434844] node 0: slabs: 1/1, objs: 1/1, free: 0
[277321.436257] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
[277321.438634] IP: [<ffffffffa036a566>] osc_io_init+0x16/0x140 [osc]
[277321.439944] PGD 1b3ea067 PUD 1b3e9067 PMD 0
[277321.441187] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[277321.442441] Modules linked in: dm_flakey dm_mod lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) brd ext4 loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common virtio_balloon virtio_console pcspkr i2c_piix4 binfmt_misc ip_tables rpcsec_gss_krb5 ata_generic pata_acpi drm_kms_helper ttm drm drm_panel_orientation_quirks ata_piix i2c_core virtio_blk serio_raw libata floppy [last unloaded: obdecho]
[277321.456256] CPU: 0 PID: 10704 Comm: dd Kdump: loaded Tainted: P W OE ------------ 3.10.0-7.6-debug #2
[277321.458692] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[277321.459888] task: ffff880081692900 ti: ffff88001732c000 task.ti: ffff88001732c000
[277321.461944] RIP: 0010:[<ffffffffa036a566>] [<ffffffffa036a566>] osc_io_init+0x16/0x140 [osc]
[277321.463943] RSP: 0018:ffff88001732fa30 EFLAGS: 00010286
[277321.465068] RAX: ffffffffa036a550 RBX: ffff88005e98ce60 RCX: ffff8800692e9348
[277321.466983] RDX: ffff8800723ece80 RSI: ffffffffa038eec0 RDI: fffffffffffffff4
[277321.468993] RBP: ffff88001732fa40 R08: 0000000000000001 R09: ffff8800254a69c8
[277321.470986] R10: ffff8800254a6000 R11: ffff8800254a69c0 R12: ffff88005e98ce60
[277321.472890] R13: fffffffffffffff4 R14: ffff8800723ece80 R15: ffff88009eea8e38
[277321.474888] FS: 00007fa780b3a740(0000) GS:ffff8800bc800000(0000) knlGS:0000000000000000
[277321.476915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[277321.478684] CR2: 0000000000000024 CR3: 00000000a6b3c000 CR4: 00000000000006f0
[277321.480711] Call Trace:
[277321.481707] [<ffffffffa0544468>] cl_io_init0.isra.15+0x88/0x160 [obdclass]
[277321.483127] [<ffffffffa054457a>] cl_io_sub_init+0x3a/0x80 [obdclass]
[277321.484425] [<ffffffffa048cc52>] lov_sub_get+0x2b2/0x7e0 [lov]
[277321.485664] [<ffffffffa04a3321>] ? lov_stripe_intersects+0xa1/0x170 [lov]
[277321.487016] [<ffffffffa048eb1b>] lov_io_iter_init+0x26b/0x950 [lov]
[277321.488314] [<ffffffffa048f578>] lov_io_rw_iter_init+0x1a8/0x520 [lov]
[277321.489448] [<ffffffffa054406c>] cl_io_iter_init+0x5c/0x120 [obdclass]
[277321.490537] [<ffffffffa0546192>] cl_io_loop+0x42/0x1c0 [obdclass]
[277321.491626] [<ffffffffa14adab0>] ll_file_io_generic+0x590/0xcb0 [lustre]
[277321.492880] [<ffffffffa14af028>] ll_file_aio_read+0x2c8/0x3e0 [lustre]
[277321.494200] [<ffffffffa14af1e4>] ll_file_read+0xa4/0x170 [lustre]
[277321.495481] [<ffffffff8123612c>] vfs_read+0x9c/0x170
[277321.496696] [<ffffffff81236fcf>] SyS_read+0x7f/0xf0
[277321.497935] [<ffffffff817c4d61>] ? system_call_after_swapgs+0xae/0x146
[277321.499244] [<ffffffff817c4e15>] system_call_fastpath+0x1c/0x21
[277321.500512] [<ffffffff817c4d61>] ? system_call_after_swapgs+0xae/0x146
0x1b596 is in osc_io_init (/home/green/git/lustre-release/lustre/include/lustre_osc.h:747).
742
743	static inline struct osc_session *osc_env_session(const struct lu_env *env)
744	{
745		struct osc_session *ses;
746
747		ses = lu_context_key_get(env->le_ses, &osc_session_key);
748		LASSERT(ses != NULL);
749		return ses;
750	}
So it sounds like env is NULL, but I don't readily see how it could be NULL there.
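Also, le_ses looks to be at offset 0x30 in my tree, not 0x24. However, 0x30 - 0x24 = 12, so if we assume env is actually -12 (-ENOMEM) rather than NULL, then env + 0x30 comes out to exactly the faulting address 0x24. That would also match RDI and R13 in the dump, which both hold fffffffffffffff4, i.e. -12.
The checks in ll_file_read_iter all seem to be correct, so unless env was substituted somewhere along the path, it's a bit of a mystery.
Hit this today, so it's definitely an ongoing issue.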
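The offset arithmetic above can be checked in isolation. The following is only a sketch: `struct fake_env`, `err_ptr()`, `le_ses_addr()` and `check_enomem_env()` are hypothetical names made up for illustration, the 0x30 offset is the value quoted in this description rather than one read from the Lustre headers, and a 64-bit Linux build (ENOMEM == 12) is assumed. It computes the address that reading env->le_ses would touch if env were an ENOMEM error pointer, without dereferencing anything.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct lu_env: only the offset of le_ses matters.
 * The 0x30 offset is the one quoted in the description, not a verified layout. */
struct fake_env {
	char  pad[0x30];
	void *le_ses;
};

/* Same encoding idea as the kernel's ERR_PTR(): a negative errno stored in a pointer. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Address that a read of env->le_ses would touch, computed without dereferencing env. */
static inline uintptr_t le_ses_addr(const void *env)
{
	return (uintptr_t)env + offsetof(struct fake_env, le_ses);
}

/* Assumes 64-bit pointers: -12 is 0xfffffffffffffff4, and adding 0x30 wraps to 0x24. */
static inline void check_enomem_env(void)
{
	assert((uintptr_t)err_ptr(-ENOMEM) == 0xfffffffffffffff4ULL);
	assert(le_ses_addr(err_ptr(-ENOMEM)) == 0x24);
}
```

Calling `check_enomem_env()` from a test program passes under those assumptions, which is consistent with CR2 = 0x24 and with RDI/R13 = fffffffffffffff4 in the oops. It does not show where the error value entered the call path.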
Issue Links
- is duplicated by
-
LU-12436 Memory allocation failure error dropped
- Resolved