[LU-11242] parallel-scale-nfsv4 test racer_on_nfs crashes with “BUG: unable to handle kernel NULL pointer dereference” Created: 13/Aug/18 Updated: 13/Aug/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
parallel-scale-nfsv4 test_racer_on_nfs started crashing with this particular stack trace on August 10, 2018, with Lustre version 2.11.53.66, build 3776. parallel-scale-nfsv4 racer_on_nfs has crashed the kernel many times recently; the logs below cover two instances of this crash with similar stack traces.

For https://testing.whamcloud.com/test_sets/c27435ee-9cdc-11e8-a9f7-52540065bddc, the kernel-crash log shows:

[ 5526.062692] BUG: unable to handle kernel NULL pointer dereference at 000000000000000d
[ 5526.063900] IP: [<ffffffff8a218dee>] vfs_open+0x1e/0xb0
[ 5526.064488] PGD 800000006e355067 PUD 6dfac067 PMD 0
[ 5526.065162] Oops: 0000 [#1] SMP
[ 5526.065621] Modules linked in: nfsv3 nfs_acl mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc ppdev i2c_piix4 i2c_core parport_pc parport joydev pcspkr virtio_balloon iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix libata virtio_blk 8139too crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_pci 8139cp
[ 5526.074448] virtio_ring virtio mii floppy
[ 5526.074806] CPU: 1 PID: 1025 Comm: lfs Kdump: loaded Tainted: G OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[ 5526.075825] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 5526.076366] task: ffff9a97edfecf10 ti: ffff9a97edc60000 task.ti: ffff9a97edc60000
[ 5526.077060] RIP: 0010:[<ffffffff8a218dee>] [<ffffffff8a218dee>] vfs_open+0x1e/0xb0
[ 5526.077842] RSP: 0018:ffff9a97edc63cb0 EFLAGS: 00010282
[ 5526.078382] RAX: ffff9a97edfecf10 RBX: ffff9a97ee281500 RCX: 000000000000021f
[ 5526.079044] RDX: ffff9a97fb5e0780 RSI: 0000000000000000 RDI: ffff9a97e39139c0
[ 5526.079819] RBP: ffff9a97edc63cc8 R08: ffffffffc05860a0 R09: ffff9a97fd001a00
[ 5526.080548] R10: ffffffffc05616b2 R11: ffffe0a8c0d88680 R12: ffff9a97edc63e10
[ 5526.081211] R13: ffff9a97edc63db0 R14: 0000000000000000 R15: ffff9a97edc63e10
[ 5526.081880] FS: 00007f35498ac740(0000) GS:ffff9a97ffd00000(0000) knlGS:0000000000000000
[ 5526.082640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5526.083231] CR2: 000000000000000d CR3: 000000006e358000 CR4: 00000000000606e0
[ 5526.083982] Call Trace:
[ 5526.084270] [<ffffffff8a227288>] ? may_open+0x68/0x120
[ 5526.084855] [<ffffffff8a22b2dd>] do_last+0x1ed/0x12c0
[ 5526.085407] [<ffffffff8a22c487>] path_openat+0xd7/0x640
[ 5526.085931] [<ffffffff8a22e01d>] do_filp_open+0x4d/0xb0
[ 5526.086473] [<ffffffff8a23b4c4>] ? __alloc_fd+0xc4/0x170
[ 5526.087082] [<ffffffff8a21a327>] do_sys_open+0x137/0x240
[ 5526.087642] [<ffffffff8a7206d5>] ? system_call_after_swapgs+0xa2/0x146
[ 5526.088313] [<ffffffff8a21a44e>] SyS_open+0x1e/0x20
[ 5526.088823] [<ffffffff8a720795>] system_call_fastpath+0x1c/0x21
[ 5526.089390] [<ffffffff8a7206e1>] ? system_call_after_swapgs+0xae/0x146
[ 5526.090006] Code: 83 0b 02 5b 5d c3 0f 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 49 89 fc 53 48 89 f3 48 83 ec 08 48 8b 7f 08 48 8b 77 30 <f6> 46 0d 20 75 44 f7 07 00 00 00 08 8b 73 40 75 5f 48 81 ff 00
[ 5526.093315] RIP [<ffffffff8a218dee>] vfs_open+0x1e/0xb0
[ 5526.093871] RSP <ffff9a97edc63cb0>
[ 5526.094206] CR2: 000000000000000d

For https://testing.whamcloud.com/test_sets/24be5d06-9d91-11e8-87f3-52540065bddc, the kernel-crash log shows:

[55978.629272] 11[16997]: segfault at 8 ip 00007fa86a3f5958 sp 00007ffce54dfbe0 error 4 in ld-2.17.so[7fa86a3ea000+22000]
[56078.517576] 1[22218]: segfault at 8 ip 00007feee9ecd958 sp 00007ffd5dfa4570 error 4 in ld-2.17.so[7feee9ec2000+22000]
[56080.045148] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[56080.046060] IP: [<ffffffff8a6261f6>] put_link+0x16/0x40
[56080.082670] PGD 800000006d0cf067 PUD 53b7c067 PMD 0
[56080.083675] Oops: 0000 [#1] SMP
[56080.093235] Modules linked in: nfsv3 nfs_acl mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel i2c_piix4 lrw gf128mul glue_helper ablk_helper cryptd i2c_core joydev pcspkr virtio_balloon parport_pc parport ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix virtio_blk libata crct10dif_pclmul crct10dif_common 8139too crc32c_intel serio_raw virtio_pci virtio_ring
[56080.107051] virtio 8139cp mii floppy [last unloaded: lnet_selftest]
[56080.108039] CPU: 0 PID: 25758 Comm: getfattr Kdump: loaded Tainted: G OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[56080.109841] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[56080.110754] task: ffff8d626bc5bf40 ti: ffff8d626d004000 task.ti: ffff8d626d004000
[56080.111929] RIP: 0010:[<ffffffff8a6261f6>] [<ffffffff8a6261f6>] put_link+0x16/0x40
[56080.113185] RSP: 0018:ffff8d626d007d60 EFLAGS: 00010292
[56080.114036] RAX: ffff8d6269746d80 RBX: ffff8d626d007dc0 RCX: 0000000000000000
[56080.115154] RDX: ffffe9c2c1e17780 RSI: ffff8d626d007dc0 RDI: ffff8d6269746d80
[56080.116284] RBP: ffff8d626d007d68 R08: ffff8d626d007e10 R09: ffff8d627d001b00
[56080.117401] R10: ffffffffc05a9bec R11: ffffe9c2c0d986c0 R12: ffff8d621263ba00
[56080.118521] R13: ffff8d6235c2f000 R14: ffff8d626d007ef4 R15: 0000000000000000
[56080.119666] FS: 00007f953b835740(0000) GS:ffff8d627fc00000(0000) knlGS:0000000000000000
[56080.120937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56080.121849] CR2: 000000000000a000 CR3: 000000006b144000 CR4: 00000000000606f0
[56080.123009] Call Trace:
[56080.124003] [<ffffffff8a62c807>] path_openat+0x457/0x640
[56080.127426] [<ffffffff8a62e01d>] do_filp_open+0x4d/0xb0
[56080.128316] [<ffffffff8a63b4c4>] ? __alloc_fd+0xc4/0x170
[56080.129879] [<ffffffff8a61a327>] do_sys_open+0x137/0x240
[56080.132283] [<ffffffff8ab206d5>] ? system_call_after_swapgs+0xa2/0x146
[56080.133358] [<ffffffff8a61a464>] SyS_openat+0x14/0x20
[56080.134280] [<ffffffff8ab20795>] system_call_fastpath+0x1c/0x21
[56080.135278] [<ffffffff8ab206e1>] ? system_call_after_swapgs+0xae/0x146
[56080.136318] Code: 7b 01 00 5b 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 55 49 89 f8 48 89 e5 53 48 8b 46 08 48 89 f3 48 8b 48 30 48 89 c7 <48> 8b 49 20 48 8b 49 28 48 85 c9 74 0c 4c 89 c6 e8 35 4e 13 00
[56080.141449] RIP [<ffffffff8a6261f6>] put_link+0x16/0x40
[56080.142331] RSP <ffff8d626d007d60>
[56080.142900] CR2: 0000000000000020 |
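
When comparing several instances of this crash, it can help to reduce each oops to the fields that differ: the faulting address, the function and offset reported on the `IP:` line, and the process that triggered it. The following is a minimal Python sketch of such a summary. The function name `summarize_oops` and the regular expressions are illustrative assumptions, not part of Lustre or its test framework, and they only target the EL7 (3.10.0) oops layout shown in this ticket. The sample text is taken from the first log.

```python
import re

# Hypothetical helper for triage; patterns assume the 3.10.0 oops layout
# quoted in this ticket and may need adjusting for other kernels.
BUG_RE = re.compile(
    r"BUG: unable to handle kernel NULL pointer dereference at (?P<addr>[0-9a-f]+)"
)
IP_RE = re.compile(
    r"IP: \[<[0-9a-f]+>\] (?P<sym>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)
COMM_RE = re.compile(r"PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def summarize_oops(log: str) -> dict:
    """Return fault address, faulting symbol/offset, and process of an oops."""
    bug = BUG_RE.search(log)
    ip = IP_RE.search(log)
    comm = COMM_RE.search(log)
    if not (bug and ip and comm):
        raise ValueError("log does not look like a NULL-dereference oops")
    return {
        "fault_addr": int(bug.group("addr"), 16),
        "symbol": ip.group("sym"),
        "offset": int(ip.group("off"), 16),
        "pid": int(comm.group("pid")),
        "comm": comm.group("comm"),
    }


# Three lines copied from the first kernel-crash log in this ticket.
SAMPLE = """\
[ 5526.062692] BUG: unable to handle kernel NULL pointer dereference at 000000000000000d
[ 5526.063900] IP: [<ffffffff8a218dee>] vfs_open+0x1e/0xb0
[ 5526.074806] CPU: 1 PID: 1025 Comm: lfs Kdump: loaded Tainted: G OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Run over both logs, a summary like this makes the difference in entry points easier to see (`vfs_open` from `lfs` versus `put_link` from `getfattr`), while both fault addresses are small offsets from zero.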