Details
Bug
Resolution: Fixed
Critical
Lustre 2.12.0
None
3
Description
Seeing this on master-next since Aug 9th or so.
[22965.482820] Lustre: DEBUG MARKER: == sanity test 241a: bio vs dio ====================================================================== 21:23:23 (1534901003)
[22975.702174] Lustre: 30392:0:(client.c:2126:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1534901006/real 1534901006] req@ffff880051062c80 x1609459777477808/t0(0) o101->lustre-MDT0000-mdc-ffff8802e756c800@0@lo:12/10 lens 976/44648 e 0 to 1 dl 1534901013 ref 2 fl Rpc:XP/2/ffffffff rc 0/-1
[22975.714859] Lustre: lustre-MDT0000-mdc-ffff8802e756c800: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
[22975.718534] Lustre: lustre-MDT0000: Client 883b4996-f586-bc4a-a668-d643202d495f (at 0@lo) reconnecting
[23034.421939] BUG: unable to handle kernel NULL pointer dereference at (null)
[23034.422839] IP: [<ffffffffa1520714>] vvp_page_delete+0x14/0x140 [lustre]
[23034.422839] PGD 80000002aff7c067 PUD 284474067 PMD 0
[23034.422839] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[23034.422839] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod brd ext4 loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common ata_generic pata_acpi ttm drm_kms_helper ata_piix drm virtio_balloon libata i2c_piix4 serio_raw pcspkr virtio_console virtio_blk i2c_core floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[23034.422839] CPU: 9 PID: 31804 Comm: dd Kdump: loaded Tainted: P W OE ------------ 3.10.0-7.5-debug #1
[23034.422839] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[23034.422839] task: ffff880270406240 ti: ffff8802adbd4000 task.ti: ffff8802adbd4000
[23034.422839] RIP: 0010:[<ffffffffa1520714>] [<ffffffffa1520714>] vvp_page_delete+0x14/0x140 [lustre]
[23034.422839] RSP: 0018:ffff8802adbd78e0 EFLAGS: 00010286
[23034.422839] RAX: ffffea00084c67b8 RBX: ffff8802c6ad3e50 RCX: ffff8802c6ad3e00
[23034.422839] RDX: 0000000000000000 RSI: ffff8802c6ad3e50 RDI: ffff880082fe51f0
[23034.422839] RBP: ffff8802adbd78e0 R08: ffffffffa081dfd8 R09: 0000000000000000
[23034.422839] R10: 0000000000000000 R11: ffff8801e7418e00 R12: ffff8802c6ad3e28
[23034.422839] R13: ffff880082fe51f0 R14: ffff880082fe51f0 R15: 0000000000000000
[23034.422839] FS: 00007f6bda93a740(0000) GS:ffff88033dc40000(0000) knlGS:0000000000000000
[23034.422839] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[23034.422839] CR2: 0000000000000000 CR3: 00000002c3358000 CR4: 00000000000006e0
[23034.422839] Call Trace:
[23034.422839] [<ffffffffa039a01d>] cl_page_delete0+0x7d/0x210 [obdclass]
[23034.422839] [<ffffffffa039bf6e>] cl_page_alloc+0x15e/0x270 [obdclass]
[23034.422839] [<ffffffffa039c107>] cl_page_find+0x87/0x290 [obdclass]
[23034.422839] [<ffffffffa14d911d>] ll_dom_finish_open+0x59d/0x830 [lustre]
[23034.422839] [<ffffffffa14f7b13>] ? ll_prep_inode+0x223/0xb80 [lustre]
[23034.422839] [<ffffffffa05c62e0>] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
[23034.422839] [<ffffffffa150541a>] ll_lookup_it_finish+0x51a/0xe70 [lustre]
[23034.422839] [<ffffffffa0536c75>] ? lmv_intent_lock+0xd05/0x1970 [lmv]
[23034.422839] [<ffffffff81117ca2>] ? from_kgid+0x12/0x20
[23034.422839] [<ffffffffa1504b64>] ? ll_i2gids+0x24/0xb0 [lustre]
[23034.422839] [<ffffffffa1504850>] ? ll_md_need_convert+0x1b0/0x1b0 [lustre]
[23034.422839] [<ffffffffa150604f>] ll_lookup_it+0x2df/0xe00 [lustre]
[23034.422839] [<ffffffffa1506ca7>] ll_atomic_open+0x137/0x11f0 [lustre]
[23034.422839] [<ffffffff813ccd2b>] ? do_raw_spin_unlock+0x4b/0x90
[23034.422839] [<ffffffff8177943e>] ? _raw_spin_unlock+0xe/0x20
[23034.422839] [<ffffffff8121953b>] ? lookup_dcache+0x8b/0xb0
[23034.422839] [<ffffffff8121e651>] do_last+0xa31/0x12c0
[23034.422839] [<ffffffff81210100>] ? proc_nr_files+0x30/0x30
[23034.422839] [<ffffffff8121efad>] path_openat+0xcd/0x6a0
[23034.422839] [<ffffffff8106dc55>] ? __kernel_map_pages+0xc5/0xd0
[23034.422839] [<ffffffff812209ad>] do_filp_open+0x4d/0xb0
[23034.422839] [<ffffffff813ccd2b>] ? do_raw_spin_unlock+0x4b/0x90
[23034.422839] [<ffffffff8177943e>] ? _raw_spin_unlock+0xe/0x20
[23034.422839] [<ffffffff8122e303>] ? __alloc_fd+0xc3/0x170
[23034.422839] [<ffffffff8120c917>] do_sys_open+0x137/0x240
[23034.422839] [<ffffffff8178386f>] ? system_call_after_swapgs+0xbc/0x160
[23034.422839] [<ffffffff8120ca3e>] SyS_open+0x1e/0x20
[23034.422839] [<ffffffff81783929>] system_call_fastpath+0x16/0x1b
[23034.422839] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160
Likely due to the recent landing of read during open?
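For anyone triaging this, a rough way to see which path the crash took is to drop the "?" entries from the call trace, since the kernel marks frames it could not verify that way. The sketch below is only a helper for reading the log, not part of Lustre; the names `TRACE`, `FRAME`, and `reliable_frames` are made up for illustration, and the sample is the Lustre portion of the trace above with timestamps stripped.

```python
import re

# Lustre portion of the call trace above, timestamps removed.
# A "?" after the address marks a frame the unwinder could not verify.
TRACE = """\
[<ffffffffa039a01d>] cl_page_delete0+0x7d/0x210 [obdclass]
[<ffffffffa039bf6e>] cl_page_alloc+0x15e/0x270 [obdclass]
[<ffffffffa039c107>] cl_page_find+0x87/0x290 [obdclass]
[<ffffffffa14d911d>] ll_dom_finish_open+0x59d/0x830 [lustre]
[<ffffffffa14f7b13>] ? ll_prep_inode+0x223/0xb80 [lustre]
[<ffffffffa05c62e0>] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
[<ffffffffa150541a>] ll_lookup_it_finish+0x51a/0xe70 [lustre]
[<ffffffffa0536c75>] ? lmv_intent_lock+0xd05/0x1970 [lmv]
[<ffffffff81117ca2>] ? from_kgid+0x12/0x20
[<ffffffffa1504b64>] ? ll_i2gids+0x24/0xb0 [lustre]
[<ffffffffa1504850>] ? ll_md_need_convert+0x1b0/0x1b0 [lustre]
[<ffffffffa150604f>] ll_lookup_it+0x2df/0xe00 [lustre]
[<ffffffffa1506ca7>] ll_atomic_open+0x137/0x11f0 [lustre]
"""

# address, optional "?" marker, symbol+offset/size, optional [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def reliable_frames(text):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    frames = []
    for match in FRAME.finditer(text):
        if match.group(1) is None:  # no "?" marker on this frame
            frames.append((match.group(2), match.group(3)))
    return frames


if __name__ == "__main__":
    for symbol, module in reliable_frames(TRACE):
        print(f"{symbol} [{module}]")
```

With the unverified frames dropped, the remaining chain runs from ll_atomic_open through ll_lookup_it, ll_lookup_it_finish and ll_dom_finish_open into cl_page_find, cl_page_alloc and cl_page_delete0, with the fault itself in vvp_page_delete per the RIP line. That puts the crash on the open path, which fits the suspicion above but does not by itself confirm the cause.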
Landed for 2.12