Details
Bug - Resolution: Fixed - Minor - None - None - 3 - 9223372036854775807
Description
This issue was created by maloo for S Buisson <sbuisson@ddn.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/5a6e3921-7a1f-4a28-a682-3d130ac4010b
test_280 failed with the following error:
onyx-34vm4 crashed during sanity test_280
Test session details:
clients: https://build.whamcloud.com/job/lustre-b_es-reviews/22522 - 4.18.0-553.34.1.el8_10.x86_64
servers: https://build.whamcloud.com/job/lustre-b_es-reviews/22522 - 4.18.0-553.34.1.el8_lustre.ddn17.x86_64
Stack trace is:
[10659.920762] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[10659.922219] PGD 80000000304c9067 P4D 80000000304c9067 PUD 5cba067 PMD 0
[10659.923424] Oops: 0000 [#1] SMP PTI
[10659.924096] CPU: 1 PID: 1025413 Comm: lgss_keyring Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.34.1.el8_10.x86_64 #1
[10659.926285] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[10659.927335] RIP: 0010:gss_do_ctx_init_rpc+0xa05/0x1220 [ptlrpc_gss]
[10659.928575] Code: 89 44 24 08 8b 44 24 64 89 44 24 18 8b 44 24 60 89 44 24 24 e8 5c 9e ec ff 48 89 c1 c6 00 01 48 8b 04 24 48 8b 80 e0 00 00 00 <8b> 40 20 48 c7 41 0c 00 00 00 00 c7 41 20 00 00 00 00 88 41 01 b8
[10659.931808] RSP: 0018:ffffa6fe03fabd70 EFLAGS: 00010282
[10659.932762] RAX: 0000000000000000 RBX: ffff98bcef4ca3c0 RCX: ffff98bd47ec0030
[10659.934040] RDX: 0000000000000024 RSI: 0000000000000000 RDI: 0000000000000030
[10659.935320] RBP: 0000000000000002 R08: 000000000000002c R09: 0000000000000000
[10659.936587] R10: 0000000000000002 R11: 0000000000000000 R12: 00007fff2d5fa120
[10659.937867] R13: ffff98bd47ec0000 R14: 0000000000000000 R15: ffff98bcec8b2550
[10659.939135] FS: 00007fe1c15e7840(0000) GS:ffff98bd7fd00000(0000) knlGS:0000000000000000
[10659.940559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10659.941607] CR2: 0000000000000020 CR3: 000000000555e003 CR4: 00000000000606e0
[10659.942889] Call Trace:
[10659.943407] ? __die_body+0x1a/0x60
[10659.944111] ? no_context+0x1ba/0x3f0
[10659.944813] ? __bad_area_nosemaphore+0x157/0x180
[10659.945687] ? do_page_fault+0x37/0x12d
[10659.946424] ? page_fault+0x1e/0x30
[10659.947126] ? gss_do_ctx_init_rpc+0xa05/0x1220 [ptlrpc_gss]
[10659.948190] ? __inode_security_revalidate+0x63/0x80
[10659.949126] gss_proc_write_secinit+0x14/0x60 [ptlrpc_gss]
[10659.950181] full_proxy_write+0x53/0x80
[10659.950945] vfs_write+0xa5/0x1b0
[10659.951602] ksys_write+0x4f/0xb0
[10659.952248] do_syscall_64+0x5b/0x1a0
[10659.952969] entry_SYSCALL_64_after_hwframe+0x66/0xcb
[10659.953912] RIP: 0033:0x7fe1c06d8a15
[10659.954601] Code: 00 00 75 05 48 83 c4 58 c3 e8 f7 49 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 36 da 20 00 85 c0 75 12 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89
[10659.957813] RSP: 002b:00007fff2d5fa088 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[10659.959165] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe1c06d8a15
[10659.960437] RDX: 0000000000000050 RSI: 00007fff2d5fa120 RDI: 0000000000000004
[10659.961711] RBP: 00007fff2d5fa120 R08: 00000000012bfbd0 R09: 00000000012b8018
[10659.962991] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff2d5fa090
[10659.964291] R13: 0000000000000004 R14: 0000000000613660 R15: 00007fff2d5fc330
[10659.965560] Modules linked in: lzstd(OE) llz4hc(OE) llz4(OE) lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) tcp_diag inet_diag loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache intel_rapl_msr intel_rapl_common crct10dif_pclmul sunrpc crc32_pclmul ghash_clmulni_intel pcspkr joydev virtio_balloon i2c_piix4 ext4 mbcache jbd2 ata_generic ata_piix libata crc32c_intel virtio_net serio_raw virtio_blk net_failover failover [last unloaded: lzstd]
[10659.974122] CR2: 0000000000000020
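However, crash dump analysis shows that the NULL pointer dereference actually occurs in ctx_init_pack_request() (presumably inlined into gss_do_ctx_init_rpc(), which is why that function is reported in RIP), on this line:
ghdr->gh_sp = (__u8) imp->imp_sec->ps_part;
The fault address (CR2 = 0x20) together with RAX = 0 at the faulting load is consistent with imp->imp_sec being NULL when this line runs.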
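To illustrate the failure mode, here is a minimal userspace sketch of a guarded version of that assignment. It is not the actual fix for this ticket. The struct types, the helper name pack_sec_part(), and the -ENODEV return value are all hypothetical stand-ins; only the field names imp_sec, ps_part and gh_sp come from the line quoted in the description.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the Lustre structures. */
struct ptlrpc_sec_sketch { int ps_part; };
struct obd_import_sketch { struct ptlrpc_sec_sketch *imp_sec; };
struct gss_header_sketch { uint8_t gh_sp; };

/*
 * Guarded variant of the assignment quoted in the description.
 * Returns 0 on success, or -ENODEV (an assumed error code) when the
 * import has no security descriptor attached, instead of dereferencing
 * a NULL imp_sec pointer.
 */
static int pack_sec_part(struct gss_header_sketch *ghdr,
                         const struct obd_import_sketch *imp)
{
    assert(ghdr != NULL);

    if (imp == NULL || imp->imp_sec == NULL)
        return -ENODEV;

    ghdr->gh_sp = (uint8_t)imp->imp_sec->ps_part;
    return 0;
}
```

Whether a check like this or proper reference handling on imp_sec is the appropriate remedy depends on why imp_sec can be NULL at this point, which the crash dump alone does not establish.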
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_280 - onyx-34vm4 crashed during sanity test_280