[LU-11175] Null pointer dereference in idle_timeout_show recovery-small test 57 Created: 25/Jul/18 Updated: 18/Aug/18 Resolved: 18/Aug/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
In code added by https://review.whamcloud.com/32719 for LU-8066 but likely actually due to Some auditing campaign is needed? [ 1930.731848] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 23:58:28 (1532491108) [ 1931.971139] BUG: unable to handle kernel NULL pointer dereference at 0000000000000338 [ 1931.972794] IP: [<ffffffffa075326f>] idle_timeout_show+0x1f/0x30 [osc] [ 1931.973668] PGD 6fb39067 PUD 78ab6067 PMD 0 [ 1931.974182] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1931.974726] Modules linked in: loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) jbd2 mbcache lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) dm_flakey dm_mod libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common rpcsec_gss_krb5 ata_generic pata_acpi ttm drm_kms_helper drm i2c_piix4 ata_piix pcspkr i2c_core virtio_balloon serio_raw virtio_console virtio_blk libata floppy ip_tables [ 1931.979448] CPU: 3 PID: 18210 Comm: lctl Kdump: loaded Tainted: P OE ------------ 3.10.0-7.5-debug #2 [ 1931.980483] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 1931.981001] task: ffff880046fc8800 ti: ffff8800acb88000 task.ti: ffff8800acb88000 [ 1931.981968] RIP: 0010:[<ffffffffa075326f>] [<ffffffffa075326f>] idle_timeout_show+0x1f/0x30 [osc] [ 1931.982980] RSP: 0018:ffff8800acb8bde0 EFLAGS: 00010246 [ 1931.983507] RAX: 0000000000000000 RBX: ffff880070915800 RCX: ffffffffa0753250 [ 1931.984152] RDX: 0000000000000000 RSI: ffffffffa0779433 RDI: ffff88008d32c000 [ 1931.984769] RBP: ffff8800acb8bde0 R08: ffff880079a49738 R09: 0000000000000000 [ 1931.985332] R10: 0000000000001000 R11: 0000000000000000 R12: ffffffffa0372d60 [ 1931.985867] R13: ffff8800acb8bf18 R14: 0000000000000001 R15: ffff880070915800 [ 
1931.986431] FS: 00007f1bcb331740(0000) GS:ffff8800bc980000(0000) knlGS:0000000000000000 [ 1931.987404] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1931.987919] CR2: 0000000000000338 CR3: 0000000027d70000 CR4: 00000000000006e0 [ 1931.988488] Call Trace: [ 1931.988980] [<ffffffffa0318ff6>] lustre_attr_show+0x16/0x20 [obdclass] [ 1931.989521] [<ffffffff8129a24c>] sysfs_kf_seq_show+0xcc/0x1e0 [ 1931.990037] [<ffffffff81298953>] kernfs_seq_show+0x23/0x30 [ 1931.990571] [<ffffffff81234bd5>] seq_read+0x115/0x3f0 [ 1931.991072] [<ffffffff8129951d>] kernfs_fop_read+0xfd/0x170 [ 1931.991610] [<ffffffff8120d91c>] vfs_read+0x9c/0x170 [ 1931.992115] [<ffffffff8120e7df>] SyS_read+0x7f/0xf0 [ 1931.992636] [<ffffffff8178383b>] ? system_call_after_swapgs+0xc8/0x160 [ 1931.993166] [<ffffffff817838e9>] system_call_fastpath+0x16/0x1b [ 1931.993742] [<ffffffff8178383b>] ? system_call_after_swapgs+0xc8/0x160 [ 1931.994299] Code: e8 47 07 93 e0 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 d0 48 8b 97 30 f4 ff ff 48 c7 c6 33 94 77 a0 48 89 c7 31 c0 48 89 e5 <8b> 92 38 03 00 00 e8 86 c4 c6 e0 5d 48 98 c3 66 90 0f 1f 44 00 [ 1931.996406] RIP [<ffffffffa075326f>] idle_timeout_show+0x1f/0x30 [osc] (gdb) l *(idle_timeout_show+0x1f)
0xe29f is in idle_timeout_show (/home/green/git/lustre-release/lustre/osc/lproc_osc.c:618).
613 {
614 struct obd_device *obd = container_of(kobj, struct obd_device,
615 obd_kset.kobj);
616 struct client_obd *cli = &obd->u.cli;
617
618 return sprintf(buf, "%u\n", cli->cl_import->imp_idle_timeout);
619 }
(gdb) p/x &((struct obd_import *)0)->imp_idle_timeout
$3 = 0x338
So it looks like cli->cl_import is NULL a |
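The gdb arithmetic above can be reproduced in a small userspace sketch: when the base pointer is NULL, the faulting address equals the field's offset within the structure, which is why CR2 reads 0x338. The structures below are simplified stand-ins (the padding array is an assumption used only to place the field at the offset seen in the oops), and `fault_address_for_null_import()` and `idle_timeout_show_guarded()` are hypothetical names for illustration, not the change in the uploaded patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in: the real struct obd_import has many more fields.
 * The padding is an assumption that places the field at offset 0x338. */
struct obd_import {
	char pad[0x338];
	unsigned int imp_idle_timeout;
};

/* Simplified stand-in for the client_obd fields used here. */
struct client_obd {
	struct obd_import *cl_import;	/* NULL when no import is attached */
};

/* Address a load of imp->imp_idle_timeout touches when imp == NULL. */
static size_t fault_address_for_null_import(void)
{
	return offsetof(struct obd_import, imp_idle_timeout);
}

/* Hypothetical guarded variant of the show routine: report -ENODEV
 * instead of dereferencing a NULL import. */
static int idle_timeout_show_guarded(const struct client_obd *cli,
				     char *buf, size_t len)
{
	const struct obd_import *imp = cli->cl_import;

	assert(buf != NULL);
	if (imp == NULL)
		return -ENODEV;
	return snprintf(buf, len, "%u\n", imp->imp_idle_timeout);
}
```

A bare NULL check like this only narrows the window; if the import can be cleared concurrently, the pointer read and the dereference also need to happen under whatever lock the teardown path takes.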
| Comments |
| Comment by Gerrit Updater [ 26/Jul/18 ] |
|
Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/32883 |
| Comment by Oleg Drokin [ 03/Aug/18 ] |
|
another similar one: [18362.344427] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 20:22:40 (1533255760) [18364.972638] LustreError: 5963:0:(obd_class.h:1075:obd_statfs()) Device 65 not setup [18364.976977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000340 [18364.977482] IP: [<ffffffffa0d160cd>] grant_shrink_show+0x1d/0x40 [osc] [18364.977482] PGD 80000002a2499067 PUD 2d6839067 PMD 0 [18364.977482] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [18364.977482] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) crc_t10dif crct10dif_generic crct10dif_common ata_generic pata_acpi ttm drm_kms_helper ata_piix serio_raw drm virtio_blk i2c_piix4 virtio_balloon virtio_console pcspkr libata i2c_core floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [18364.977482] CPU: 11 PID: 5963 Comm: lctl Kdump: loaded Tainted: P OE ------------ 3.10.0-7.5-debug #1 [18364.977482] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [18364.977482] task: ffff88024775e580 ti: ffff8802f5ab0000 task.ti: ffff8802f5ab0000 [18364.977482] RIP: 0010:[<ffffffffa0d160cd>] [<ffffffffa0d160cd>] grant_shrink_show+0x1d/0x40 [osc] [18364.977482] RSP: 0018:ffff8802f5ab3de0 EFLAGS: 00010246 [18364.977482] RAX: 0000000000000000 RBX: ffff88028b07dd40 RCX: ffffffffa0d160b0 [18364.977482] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff88030271d000 [18364.977482] RBP: ffff8802f5ab3de0 R08: ffff8802d13e17b8 R09: 0000000000000000 [18364.977482] R10: 0000000000001000 R11: 0000000000000000 R12: ffffffffa0935d60 [18364.977482] R13: ffff8802f5ab3f18 R14: 0000000000000001 R15: ffff88028b07dd40 [18364.977482] FS: 00007f65a2104740(0000) 
GS:ffff88033dcc0000(0000) knlGS:0000000000000000 [18364.977482] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [18364.977482] CR2: 0000000000000340 CR3: 00000002f7e48000 CR4: 00000000000006e0 [18364.977482] Call Trace: [18364.977482] [<ffffffffa08dbff6>] lustre_attr_show+0x16/0x20 [obdclass] [18364.977482] [<ffffffff8129a1ec>] sysfs_kf_seq_show+0xcc/0x1e0 [18364.977482] [<ffffffff812988f3>] kernfs_seq_show+0x23/0x30 [18364.977482] [<ffffffff81234b75>] seq_read+0x115/0x3f0 [18364.977482] [<ffffffff812994bd>] kernfs_fop_read+0xfd/0x170 [18364.977482] [<ffffffff8120d8bc>] vfs_read+0x9c/0x170 [18364.977482] [<ffffffff8120e77f>] SyS_read+0x7f/0xf0 [18364.977482] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [18364.977482] [<ffffffff81783929>] system_call_fastpath+0x16/0x1b [18364.977482] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [18364.977482] Code: eb dd e8 e7 d8 36 e0 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 d0 48 8b 97 30 f4 ff ff be 00 10 00 00 48 89 c7 31 c0 48 89 e5 <48> 8b 8a 40 03 00 00 48 c7 c2 33 d4 d3 a0 48 c1 e9 21 83 e1 01 [18364.977482] RIP [<ffffffffa0d160cd>] grant_shrink_show+0x1d/0x40 [osc] |
| Comment by Oleg Drokin [ 03/Aug/18 ] |
|
Not sure if 100% related but also [ 9344.618480] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 17:52:51 (1533246771) [ 9347.040738] BUG: unable to handle kernel paging request at ffffffff81d8fc50 [ 9347.041687] IP: [<ffffffff810fb0f2>] __pv_queued_spin_lock_slowpath+0x1f2/0x3d0 [ 9347.041687] PGD 1c12067 PUD 1c13063 PMD 3263c1063 PTE 8000000001d8f062 [ 9347.041687] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [ 9347.041687] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) crc_t10dif crct10dif_generic crct10dif_common ata_generic pata_acpi ttm drm_kms_helper drm i2c_piix4 ata_piix virtio_balloon pcspkr virtio_blk virtio_console serio_raw i2c_core libata floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [ 9347.041687] CPU: 7 PID: 18520 Comm: lctl Kdump: loaded Tainted: P OE ------------ 3.10.0-7.5-debug #1 [ 9347.041687] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 9347.041687] task: ffff88025ce60f40 ti: ffff8802ef004000 task.ti: ffff8802ef004000 [ 9347.041687] RIP: 0010:[<ffffffff810fb0f2>] [<ffffffff810fb0f2>] __pv_queued_spin_lock_slowpath+0x1f2/0x3d0 [ 9347.041687] RSP: 0018:ffff8802ef007d20 EFLAGS: 00010086 [ 9347.041687] RAX: 0000000000008000 RBX: ffff8802a85c5488 RCX: 0000000000000001 [ 9347.041687] RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffffffff81d8fc50 [ 9347.041687] RBP: ffff8802ef007d60 R08: 0000000000000000 R09: 0000000000000000 [ 9347.041687] R10: 0000000000000000 R11: 0000000000000246 R12: ffff88033dbd9c40 [ 9347.041687] R13: ffffffff81d8fc50 R14: ffff88033dbd9c84 R15: 0000000000390000 [ 9347.041687] FS: 00007f820a6bc740(0000) GS:ffff88033dbc0000(0000) knlGS:0000000000000000 [ 
9347.041687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9347.041687] CR2: ffffffff81d8fc50 CR3: 00000002d7d64000 CR4: 00000000000006e0 [ 9347.041687] Call Trace: [ 9347.041687] [<ffffffff813ccc5d>] do_raw_spin_lock+0x6d/0xa0 [ 9347.041687] [<ffffffff817798c0>] _raw_spin_lock_irqsave+0x30/0x40 [ 9347.041687] [<ffffffffa090508d>] lprocfs_stats_lock+0x8d/0xf0 [obdclass] [ 9347.041687] [<ffffffffa090516e>] lprocfs_stats_collect+0x7e/0x140 [obdclass] [ 9347.041687] [<ffffffffa0905aca>] lprocfs_stats_seq_show+0x4a/0x140 [obdclass] [ 9347.041687] [<ffffffff81234b75>] seq_read+0x115/0x3f0 [ 9347.041687] [<ffffffff8120d8bc>] vfs_read+0x9c/0x170 [ 9347.041687] [<ffffffff8120e77f>] SyS_read+0x7f/0xf0 [ 9347.041687] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [ 9347.041687] [<ffffffff81783929>] system_call_fastpath+0x16/0x1b [ 9347.041687] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 and [ 2246.251618] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 15:55:37 (1533239737) [ 2248.581006] BUG: unable to handle kernel paging request at ffff8802aa9cc188 [ 2248.581006] IP: [<ffffffffa08c21b7>] lprocfs_stats_collect+0xc7/0x140 [obdclass] [ 2248.581006] PGD 23e3067 PUD 33ebfa067 PMD 33eaa5067 PTE 80000002aa9cc060 [ 2248.581006] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 2248.581006] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) libcfs(OE) crc_t10dif crct10dif_generic crct10dif_common ata_generic pata_acpi ttm drm_kms_helper drm i2c_piix4 ata_piix virtio_console pcspkr virtio_balloon serio_raw floppy virtio_blk i2c_core libata ip_tables rpcsec_gss_krb5 [ 2248.581006] CPU: 11 PID: 13358 Comm: lctl Kdump: 
loaded Tainted: P OE ------------ 3.10.0-7.5-debug #1 [ 2248.581006] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 2248.609087] task: ffff8802b8576ec0 ti: ffff8800a44f0000 task.ti: ffff8800a44f0000 [ 2248.609087] RIP: 0010:[<ffffffffa08c21b7>] [<ffffffffa08c21b7>] lprocfs_stats_collect+0xc7/0x140 [obdclass] [ 2248.609087] RSP: 0018:ffff8800a44f3dc8 EFLAGS: 00010246 [ 2248.609087] RAX: 0000000000000010 RBX: ffff8800a44f3e10 RCX: 0000000000000000 [ 2248.609087] RDX: ffff8802aa9cc168 RSI: 0000000000000000 RDI: 0000000000000000 [ 2248.609087] RBP: ffff8800a44f3df0 R08: 0000000000000168 R09: 0000000000000048 [ 2248.609087] R10: 0000000000000000 R11: ffff8800a44f3c96 R12: ffff88009dc82240 [ 2248.609087] R13: 0000000000000009 R14: ffff8802a84ad000 R15: ffff8802d996f980 [ 2248.609087] FS: 00007f48ff90a740(0000) GS:ffff88033dcc0000(0000) knlGS:0000000000000000 [ 2248.609087] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2248.609087] CR2: ffff8802aa9cc188 CR3: 00000000a444a000 CR4: 00000000000006e0 [ 2248.609087] Call Trace: [ 2248.609087] [<ffffffffa08c2aca>] lprocfs_stats_seq_show+0x4a/0x140 [obdclass] [ 2248.609087] [<ffffffff81234cce>] seq_read+0x26e/0x3f0 [ 2248.609087] [<ffffffff8120d8bc>] vfs_read+0x9c/0x170 [ 2248.609087] [<ffffffff8120e77f>] SyS_read+0x7f/0xf0 [ 2248.609087] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [ 2248.609087] [<ffffffff81783929>] system_call_fastpath+0x16/0x1b [ 2248.609087] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 |
| Comment by James A Simmons [ 03/Aug/18 ] |
|
Does Alex's patch fix the issue? |
| Comment by Oleg Drokin [ 03/Aug/18 ] |
|
Only the first one, supposedly. |
| Comment by Oleg Drokin [ 07/Aug/18 ] |
|
Here's another one I hit today: [57676.636894] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 03:55:27 (1533628527) [57681.833491] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [57682.115171] BUG: unable to handle kernel NULL pointer dereference at 00000000000005c8 [57682.116054] IP: [<ffffffff81775ea8>] down_read+0x28/0x50 [57682.116054] PGD 80000000ae72b067 PUD 9c1b0067 PMD 0 [57682.116054] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [57682.116054] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) dm_flakey dm_mod libcfs(OE) loop zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) jbd2 mbcache crc_t10dif crct10dif_generic crct10dif_common ata_generic pata_acpi ttm drm_kms_helper ata_piix i2c_piix4 drm virtio_balloon pcspkr serio_raw libata virtio_console virtio_blk i2c_core floppy ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [57682.116054] CPU: 5 PID: 8283 Comm: lctl Kdump: loaded Tainted: P OE ------------ 3.10.0-7.5-debug #1 [57682.116054] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [57682.116054] task: ffff8802eca1e640 ti: ffff88009f9a0000 task.ti: ffff88009f9a0000 [57682.116054] RIP: 0010:[<ffffffff81775ea8>] [<ffffffff81775ea8>] down_read+0x28/0x50 [57682.116054] RSP: 0018:ffff88009f9a3db0 EFLAGS: 00010246 [57682.116054] RAX: 00000000000005c8 RBX: 00000000000005c8 RCX: ffff88009f9a3fd8 [57682.116054] RDX: 0000000000000000 RSI: 0000000000000015 RDI: ffffffff81aa4a9b [57682.116054] RBP: ffff88009f9a3db8 R08: ffff8803033b9870 R09: 0000000000000000 [57682.116054] R10: 0000000000001000 R11: 0000000000000000 R12: 00000000000005c8 [57682.116054] R13: ffff880298026000 R14: 
0000000000000001 R15: ffff8800af34fd80 [57682.116054] FS: 00007fd1df062740(0000) GS:ffff88033db40000(0000) knlGS:0000000000000000 [57682.116054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [57682.116054] CR2: 00000000000005c8 CR3: 000000008ccf0000 CR4: 00000000000006e0 [57682.116054] Call Trace: [57682.116054] [<ffffffffa07b0fc4>] active_show+0x24/0x80 [osp] [57682.116054] [<ffffffffa0585ff6>] lustre_attr_show+0x16/0x20 [obdclass] [57682.116054] [<ffffffff8129a1ec>] sysfs_kf_seq_show+0xcc/0x1e0 [57682.116054] [<ffffffff812988f3>] kernfs_seq_show+0x23/0x30 [57682.116054] [<ffffffff81234b75>] seq_read+0x115/0x3f0 [57682.116054] [<ffffffff812994bd>] kernfs_fop_read+0xfd/0x170 [57682.116054] [<ffffffff8120d8bc>] vfs_read+0x9c/0x170 [57682.116054] [<ffffffff8120e77f>] SyS_read+0x7f/0xf0 [57682.116054] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [57682.116054] [<ffffffff81783929>] system_call_fastpath+0x16/0x1b [57682.116054] [<ffffffff8178387b>] ? system_call_after_swapgs+0xc8/0x160 [57682.116054] Code: 00 00 00 0f 1f 44 00 00 55 31 d2 be 15 00 00 00 48 89 e5 53 48 89 fb 48 c7 c7 9b 4a aa 81 e8 90 70 94 ff e8 6b 11 00 00 48 89 d8 <f0> 48 ff 00 79 05 e8 9d c2 c4 ff 48 83 7b 30 01 74 08 48 c7 43
(gdb) l *(active_show+0x24)
0x19ff4 is in active_show (/home/green/git/lustre-release/lustre/osp/lproc_osp.c:60).
55 dd_kobj);
56 struct lu_device *lu = dt2lu_dev(dt);
57 struct obd_device *obd = lu->ld_obd;
58 int rc;
59
60 LPROCFS_CLIMP_CHECK(obd);
61 rc = sprintf(buf, "%d\n", !obd->u.cli.cl_import->imp_deactive);
62 LPROCFS_CLIMP_EXIT(obd);
63 return rc;
64 }
This code is part of https://review.whamcloud.com/32377 from James. I am not sure why it was not hitting before, but it has already hit twice today. |
| Comment by James A Simmons [ 07/Aug/18 ] |
|
That is strange. The point of the LPROCFS_CLIMP_* macros is to prevent this kind of thing. |
| Comment by Alex Zhuravlev [ 08/Aug/18 ] |
|
I'm not able to reproduce the issue with the patch anymore. |
| Comment by Peter Jones [ 08/Aug/18 ] |
|
Oleg, would the combination of Alex's patch and reverting https://review.whamcloud.com/32377 restore a steady state for you? Peter
|
| Comment by James A Simmons [ 08/Aug/18 ] |
|
Reverting will not address other potential issues. The LPROCFS_CLIMP_* macros are used for many proc/sysfs files. |
| Comment by James A Simmons [ 09/Aug/18 ] |
|
I started inspecting the code and found that in several places cl_import is not protected by the semaphore. I suspect that the idle connection patch that landed exposed this problem. |
| Comment by Gerrit Updater [ 18/Aug/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32883/ |
| Comment by Peter Jones [ 18/Aug/18 ] |
|
Landed for 2.12 |