[LU-8269] Kernel panic with shared secret key Created: 14/Jun/16 Updated: 13/Sep/16 Resolved: 13/Sep/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Kit Westneat | Assignee: | Jeremy Filizetti |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
I got a kernel panic while running the latest series of SSK patches:
[64306.291569] Lustre: 15188:0:(sec_gss.c:2086:gss_svc_handle_init()) create svc ctx ffff8800007b7840: user from 192.168.122.35@tcp authenticated as root
[64306.294194] BUG: unable to handle kernel NULL pointer dereference at (null)
[64306.294332] IP: [<ffffffff812b0cad>] hash_walk_new_entry+0xd/0x50
[64306.295049] PGD 0
[64306.295049] Oops: 0000 [#1] SMP
[64306.295049] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) sunrpc mbcache jbd2 sha512_generic crypto_null snd_hda_codec_generic snd_hda_intel crc32_pclmul crc32c_intel snd_hda_codec snd_hda_core ghash_clmulni_intel snd_hwdep snd_seq snd_seq_device snd_pcm ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd snd_timer virtio_balloon pcspkr serio_raw parport_pc snd parport soundcore i2c_piix4 9pnet_virtio(OE) 9p(OE) 9pnet(OE) xfs libcrc32c sd_mod crc_t10dif crct10dif_generic sr_mod cdrom virtio_scsi ata_generic virtio_net virtio_console pata_acpi qxl syscopyarea sysfillrect
[64306.295049] sysimgblt drm_kms_helper ttm crct10dif_pclmul crct10dif_common ata_piix drm i2c_core libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod loop [last unloaded: libcfs]
[64306.295049] CPU: 0 PID: 15188 Comm: mdt00_002 Tainted: G OE ------------ 3.10.0-327.13.1.el7_lustre.x86_64 #1
[64306.295049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[64306.295049] task: ffff88003dbbdc00 ti: ffff880005d6c000 task.ti: ffff880005d6c000
[64306.295049] RIP: 0010:[<ffffffff812b0cad>] [<ffffffff812b0cad>] hash_walk_new_entry+0xd/0x50
[64306.295049] RSP: 0018:ffff880005d6f9f0 EFLAGS: 00010202
[64306.295049] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001dc
[64306.295049] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880005d6fa28
[64306.295049] RBP: ffff880005d6f9f0 R08: 0000000000000000 R09: 00000000e0168ddd
[64306.295049] R10: 00000000789f2d9a R11: 00000000ea5dacd6 R12: 0000000000000000
[64306.295049] R13: ffff880005d6fc40 R14: 0000000000000000 R15: 0000000000000000
[64306.295049] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[64306.295049] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[64306.295049] CR2: 0000000000000000 CR3: 000000003c5a8000 CR4: 00000000000406f0
[64306.295049] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[64306.295049] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[64306.295049] Stack:
[64306.295049] ffff880005d6fa18 ffffffff812b0d64 ffff880005d6fc40 0000000000000000
[64306.295049] ffff88000caf46c0 ffff880005d6fa68 ffffffff812b1e99 ffff88001bfeb000
[64306.295049] 0000000000000f38 ffffea00006ffac0 000001dc00000000 0000000000000000
[64306.295049] Call Trace:
[64306.295049] [<ffffffff812b0d64>] crypto_hash_walk_done+0x74/0x110
[64306.295049] [<ffffffff812b1e99>] shash_compat_update+0x59/0x80
[64306.295049] [<ffffffffa0a62ab1>] gss_digest_hmac+0xe1/0x200 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a5fb9f>] sk_make_checksum+0x6f/0xe0 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a61127>] sk_verify_checksum+0xf7/0x6b0 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a6171c>] gss_verify_mic_sk+0x3c/0x40 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a59cfe>] lgss_verify_mic+0x2e/0x100 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a4590a>] gss_verify_msg+0xda/0x1c0 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a49f24>] gss_svc_verify_request+0x124/0x710 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a4e1b2>] gss_svc_handle_data+0x3a2/0xa30 [ptlrpc_gss]
[64306.295049] [<ffffffff811c11ee>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
[64306.295049] [<ffffffffa0a4eb94>] gss_svc_accept+0x354/0xb00 [ptlrpc_gss]
[64306.295049] [<ffffffffa0a635e8>] gss_svc_accept_kr+0x18/0x20 [ptlrpc_gss]
[64306.295049] [<ffffffffa08452ae>] sptlrpc_svc_unwrap_request+0xee/0x610 [ptlrpc]
[64306.295049] [<ffffffffa08262c4>] ptlrpc_main+0x954/0x1db0 [ptlrpc]
[64306.295049] [<ffffffffa0825970>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[64306.295049] [<ffffffff810a5acf>] kthread+0xcf/0xe0
[64306.295049] [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[64306.295049] [<ffffffff81646018>] ret_from_fork+0x58/0x90
[64306.295049] [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[64306.295049] Code: 8b 7d d8 4c 01 f0 48 c1 f8 06 48 c1 e0 0c 48 01 d0 eb 93 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 8b 47 20 48 89 e5 <48> 8b 10 48 83 e2 fc 48 89 57 10 8b 50 08 89 57 08 8b 57 1c 8b
[64306.295049] RIP [<ffffffff812b0cad>] hash_walk_new_entry+0xd/0x50
Running the faulting address through gdb gives:
(gdb) list *(hash_walk_new_entry+0xd)
0xffffffff812b0cad is in hash_walk_new_entry (include/linux/scatterlist.h:101).
96 {
97 #ifdef CONFIG_DEBUG_SG
98 BUG_ON(sg->sg_magic != SG_MAGIC);
99 BUG_ON(sg_is_chain(sg));
100 #endif
101 return (struct page *)((sg)->page_link & ~0x3);
102 }
103
104 /**
105 * sg_set_buf - Set sg entry to point at given data
I haven't dug into why sg might be NULL, but I thought I'd post the trace here to document it. |
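The gdb listing places the fault on the page_link read inside sg_page(), and the register dump shows RAX and CR2 both zero, which is consistent with the scatterlist pointer itself being NULL rather than the entry being corrupt. As a minimal userspace sketch of that failure mode (not the Lustre or kernel fix), the following models the same mask and shows where a NULL check would sit. The names struct sg_model, sg_model_page and sg_model_page_checked are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct scatterlist.
 * Only the field read by sg_page() matters here. */
struct sg_model {
    uintptr_t page_link;   /* page pointer with flag bits in the low 2 bits */
    unsigned int offset;
    unsigned int length;
};

/* Same mask as the line in the gdb listing (scatterlist.h:101).
 * Dereferences sg unconditionally, so a NULL sg faults at address 0. */
static void *sg_model_page(const struct sg_model *sg)
{
    return (void *)(sg->page_link & ~(uintptr_t)0x3);
}

/* Hypothetical guard: report a missing entry instead of dereferencing it. */
static void *sg_model_page_checked(const struct sg_model *sg)
{
    if (sg == NULL)
        return NULL;
    return sg_model_page(sg);
}
```

A caller that builds the scatterlist before hashing could use a check like sg_model_page_checked() to fail the request cleanly. Whether the real cause is an uninitialized or zero-length scatterlist passed down from gss_digest_hmac() is an assumption that this ticket does not confirm.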
| Comments |
| Comment by Andreas Dilger [ 12/Aug/16 ] |
|
Kit, is this still an issue, or has it been fixed with Jeremy's latest patches? |
| Comment by Peter Jones [ 01/Sep/16 ] |
|
Kit? |
| Comment by Kit Westneat [ 01/Sep/16 ] |
|
I am not sure. I'll ask on the IU dev list to see whether Jeremy has addressed it. |
| Comment by Peter Jones [ 13/Sep/16 ] |
|
As per the discussion on the PAC call today, I am closing this ticket because the issue has not been seen for a long time and the code has changed since it was hit.