[LU-12491] large-scale: crash in lu_env_remove() RIP: memcmp+0x9/0x50 Created: 29/Jun/19 Updated: 26/Jul/19 Resolved: 12/Jul/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Severity: | 3 | ||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||
| Description |
|
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/0f90891c-9a35-11e9-b26a-52540065bddc
[20823.002620] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
[20823.428725] general protection fault: 0000 [#1] SMP
[20823.429775] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc dm_mod ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr virtio_balloon parport_pc parport i2c_piix4 ip_tables ext4 mbcache jbd2 virtio_blk ata_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel serio_raw floppy ata_piix 8139too
[20823.443544] libata virtio_pci virtio_ring virtio 8139cp mii [last unloaded: dm_flakey]
[20823.444878] CPU: 1 PID: 19396 Comm: mdt_out00_003 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.21.3.el7_lustre.x86_64 #1
[20823.446885] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[20823.447816] task: ffff966f9f1d6180 ti: ffff966f89258000 task.ti: ffff966f89258000
[20823.449023] RIP: 0010:[<ffffffffad380299>] [<ffffffffad380299>] memcmp+0x9/0x50
[20823.450254] RSP: 0018:ffff966f8925bd98 EFLAGS: 00010202
[20823.451115] RAX: 00000000ffffffe0 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000008
[20823.452265] RDX: 0000000000000008 RSI: ffff966f8925bdb0 RDI: 5a5a5a5a5a5a5a52
[20823.453414] RBP: ffff966f8925bd98 R08: ffffffffadd589e0 R09: 0000000000000000
[20823.454568] R10: ffff966fbfb80f20 R11: ffffffffffffffec R12: ffff966fb7845000
[20823.455711] R13: fffffffffffffff8 R14: 5a5a5a5a5a5a5a52 R15: 0000000000000000
[20823.456859] FS: 0000000000000000(0000) GS:ffff966fbfd00000(0000) knlGS:0000000000000000
[20823.458155] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20823.459081] CR2: 00007fd5f82ab630 CR3: 000000005c520000 CR4: 00000000000606e0
[20823.460224] Call Trace:
[20823.460790] [<ffffffffc0a4eee7>] lu_env_remove+0x147/0x380 [obdclass]
[20823.462141] [<ffffffffc0d3cb10>] ptlrpc_main+0x5f0/0x1560 [ptlrpc]
[20823.463174] [<ffffffffad0d09f0>] ? finish_task_switch+0x50/0x1c0
[20823.464199] [<ffffffffc0d3c520>] ? ptlrpc_register_service+0xfa0/0xfa0 [ptlrpc]
[20823.465426] [<ffffffffad0c1da1>] kthread+0xd1/0xe0
[20823.466226] [<ffffffffad0c1cd0>] ? insert_kthread_work+0x40/0x40
[20823.467216] [<ffffffffad775c37>] ret_from_fork_nospec_begin+0x21/0x21
[20823.468271] [<ffffffffad0c1cd0>] ? insert_kthread_work+0x40/0x40
[20823.469259] Code: 66 90 55 31 c9 48 85 d2 48 89 f8 48 89 e5 74 0f 66 90 48 89 34 c8 48 83 c1 01 48 39 d1 75 f3 5d c3 90 55 48 85 d2 48 89 e5 74 3c <0f> b6 07 0f b6 0e 29 c8 75 27 48 83 ea 01 31 c9 eb 1a 0f 1f 44
[20823.474428] RIP [<ffffffffad380299>] memcmp+0x9/0x50
[20823.475292] RSP <ffff966f8925bd98> |
| Comments |
| Comment by Alexey Lyashkov [ 04/Jul/19 ] |
|
Regression introduced by ... I think we should not make a release with regressions, especially ones that cause a panic. |
| Comment by Peter Jones [ 04/Jul/19 ] |
|
Alex, does this seem related to your recent change? Peter |
| Comment by Alex Zhuravlev [ 04/Jul/19 ] |
|
Peter, yes, it is, my bad, I've submitted a patch already. |
| Comment by Peter Jones [ 04/Jul/19 ] |
|
ah I see - https://review.whamcloud.com/#/c/35038/ . I wonder why JIRA did not have a comment about that... |
| Comment by Gerrit Updater [ 08/Jul/19 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/35447 |
| Comment by Alex Zhuravlev [ 11/Jul/19 ] |
|
Shaun, please clarify: did you hit this with the latest or the previous version of the patch? |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35038/ |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35447/ |
| Comment by Peter Jones [ 12/Jul/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35487 |
| Comment by Gerrit Updater [ 12/Jul/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35488 |
| Comment by Gerrit Updater [ 26/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35487/ |
| Comment by Gerrit Updater [ 26/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35488/ |