Details
-
Bug
-
Resolution: Unresolved
-
Minor
Description
After LU-10224 landed, it seemed to have fixed all the crashes in that test.
Well, I just had a very similar crash in a different place:
[128715.132365] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 10:05:20 (1514387120)
[128717.357108] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[128717.358256] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm floppy i2c_piix4 virtio_console pcspkr virtio_balloon serio_raw virtio_blk ata_piix i2c_core libata nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[128717.371008] CPU: 3 PID: 20280 Comm: lctl Tainted: P OE ------------ 3.10.0-debug #2
[128717.372286] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[128717.372932] task: ffff8802cd528a80 ti: ffff8800a6838000 task.ti: ffff8800a6838000
[128717.382186] RIP: 0010:[<ffffffffa05b8a47>] [<ffffffffa05b8a47>] sptlrpc_ctxs_lprocfs_seq_show+0x27/0x100 [ptlrpc]
[128717.383566] RSP: 0018:ffff8800a683be78 EFLAGS: 00010203
[128717.384213] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8802a1586700 RCX: 0000000000000004
[128717.385411] RDX: fffffffffffffff4 RSI: 0000000000000001 RDI: ffffffffa0636425
[128717.389241] RBP: ffff8800a683be90 R08: 0000000000000001 R09: ffff8802f092f000
[128717.390434] R10: 0000000000000000 R11: 0000000000000246 R12: ffff8800a284ef00
[128717.391627] R13: 0000000000000001 R14: ffff8800a683bf48 R15: ffff8800a284ef00
[128717.394109] FS: 00007f93a5430740(0000) GS:ffff88033e460000(0000) knlGS:0000000000000000
[128717.395374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[128717.396187] CR2: 00007f93a4aa7000 CR3: 0000000095e38000 CR4: 00000000000006e0
[128717.403348] Lustre: Unmounted lustre-client
[128717.413399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[128717.414405] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[128717.415645] Stack:
[128717.416340] 0000000000000000 ffff880096912e00 0000000000000001 ffff8800a683bf00
[128717.417877] ffffffff81212c85 0000000000001000 0000000001e19af0 ffff8800a284ef38
[128717.419405] 0000000000001000 0000000000000000 ffff880096912e00 0000000001ff0c13
[128717.420901] Call Trace:
[128717.421503] [<ffffffff81212c85>] seq_read+0x105/0x3e0
[128717.422165] [<ffffffff811ed1dc>] vfs_read+0x9c/0x170
[128717.422623] [<ffffffff811edd44>] SyS_read+0x84/0xf0
[128717.423149] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
[128717.423877] Code: 1f 44 00 00 0f 1f 44 00 00 55 b9 04 00 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 8b 9f d8 00 00 00 48 c7 c7 25 64 63 a0 48 8b 03 <4c> 8b 68 40 4c 89 ee f3 a6 75 42 48 8b bb 58 08 00 00 48 85 ff
[128717.425876] RIP [<ffffffffa05b8a47>] sptlrpc_ctxs_lprocfs_seq_show+0x27/0x100 [ptlrpc]
(gdb) l *(sptlrpc_ctxs_lprocfs_seq_show+0x27)
0x8ca47 is in sptlrpc_ctxs_lprocfs_seq_show (/home/green/git/lustre-release/lustre/ptlrpc/sec_lproc.c:122).
117 {
118 struct obd_device *dev = seq->private;
119 struct client_obd *cli = &dev->u.cli;
120 struct ptlrpc_sec *sec = NULL;
121
122 LASSERT(strcmp(dev->obd_type->typ_name, LUSTRE_OSC_NAME) == 0 ||
123 strcmp(dev->obd_type->typ_name, LUSTRE_MDC_NAME) == 0 ||
124 strcmp(dev->obd_type->typ_name, LUSTRE_MGC_NAME) == 0 ||
125 strcmp(dev->obd_type->typ_name, LUSTRE_LWP_NAME) == 0 ||
126 strcmp(dev->obd_type->typ_name, LUSTRE_OSP_NAME) == 0);
It's not as frequent as all those other failures, but it still needs to be looked at, I guess.
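One detail in the register dump may help whoever picks this up: RAX is 6b6b6b6b6b6b6b6b, and 0x6b is the byte the Linux kernel's slab debugging writes into freed objects (POISON_FREE in include/linux/poison.h). A possible reading, which the report itself does not confirm, is that the obd_device behind seq->private had already been freed when dev->obd_type was loaded at line 122. The small C helper below is only a sketch for checking register values from an oops against that pattern; the name is_poison_free is made up for this illustration and is not part of Lustre or the kernel.

```c
#include <stdint.h>

/* Value of POISON_FREE in the Linux kernel's include/linux/poison.h. */
#define POISON_FREE_BYTE 0x6b

/*
 * Hypothetical helper (not a kernel or Lustre function): returns 1 if
 * every byte of a 64-bit register value equals the slab free-poison
 * byte, and 0 otherwise.
 */
static int is_poison_free(uint64_t value)
{
	for (int i = 0; i < 8; i++) {
		if (((value >> (8 * i)) & 0xff) != POISON_FREE_BYTE)
			return 0;
	}
	return 1;
}
```

With the values from the trace above, is_poison_free(0x6b6b6b6b6b6b6b6bULL) (RAX) returns 1, while is_poison_free(0xffff8802a1586700ULL) (RBX) returns 0, since RBX still looks like an ordinary kernel address.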