Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.1.0, Lustre 2.1.1
- 3
- 4622
Description
We hit a NULL pointer dereference this morning that brought down the dual-purpose MDS/OSS node of one of our 2.1 clusters.
The console reports the following:
2012-02-14 07:01:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 07:01:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15632468 previous similar messages
2012-02-14 07:11:06 LustreError: 13499:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 07:11:06 LustreError: 13499:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15700830 previous similar messages
2012-02-14 07:21:06 LustreError: 17613:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 07:21:06 LustreError: 17613:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15687688 previous similar messages
2012-02-14 07:30:09 Intel AES-NI instructions are not detected.
2012-02-14 07:30:09 Intel AES-NI instructions are not detected.
2012-02-14 07:30:09 padlock: VIA PadLock not detected.
2012-02-14 07:30:09 BUG: unable to handle kernel NULL pointer dereference at 000000000000004e
2012-02-14 07:30:09 IP: [<ffffffffa04fbc9b>] capa_encrypt_id+0x8b/0x3e0 [obdclass]
2012-02-14 07:30:09 PGD 608f7a067 PUD 62cccd067 PMD 0
2012-02-14 07:30:09 Oops: 0000 [#1] SMP
2012-02-14 07:30:09 last sysfs file: /sys/module/cryptd/initstate
2012-02-14 07:30:09 CPU 5
2012-02-14 07:30:09 Modules linked in: aesni_intel(-) cryptd aes_x86_64 aes_generic cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) mgs(U) obdfilter(U) fsfilt_ldiskfs(U) exportfs ost(U) mgc(U) ldiskfs(U) mbcache jbd2 lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ko2iblnd(U) lnet(U) libcfs(U) ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm_intel kvm sg sr_mod cdrom mpt2sas scsi_transport_sas raid_class serio_raw i2c_i801 i2c_core ata_generic pata_acpi ata_piix iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core shpchp ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_en mlx4_core igb dca [last unloaded: scsi_wait_scan]
2012-02-14 07:30:10
2012-02-14 07:30:10 Pid: 14435, comm: mdt_23 Not tainted 2.6.32-220.4.1.1chaos.ch5.x86_64 #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
2012-02-14 07:30:10 RIP: 0010:[<ffffffffa04fbc9b>] [<ffffffffa04fbc9b>] capa_encrypt_id+0x8b/0x3e0 [obdclass]
2012-02-14 07:30:10 RSP: 0018:ffff8806086f3880 EFLAGS: 00010282
2012-02-14 07:30:10 RAX: fffffffffffffffe RBX: fffffffffffffffe RCX: 0000000000000000
2012-02-14 07:30:10 RDX: 000000000000001c RSI: 0000000000000286 RDI: 0000000000000286
2012-02-14 07:30:10 RBP: ffff8806086f3960 R08: 0000000000000000 R09: ffff88035d73e000
2012-02-14 07:30:10 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000038
2012-02-14 07:30:10 R13: ffff8806086f3990 R14: ffff8805edb45348 R15: ffff8806086f39a0
2012-02-14 07:30:10 FS: 00002aaaab06eb20(0000) GS:ffff88034ac20000(0000) knlGS:0000000000000000
2012-02-14 07:30:10 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2012-02-14 07:30:10 CR2: 000000000000004e CR3: 00000005ebea9000 CR4: 00000000000006e0
2012-02-14 07:30:10 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-02-14 07:30:10 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-02-14 07:30:10 Process mdt_23 (pid: 14435, threadinfo ffff8806086f2000, task ffff88060893cb00)
2012-02-14 07:30:10 Stack:
2012-02-14 07:30:10 0000000000000004 0000000000000000 ffff8806086f3900 ffffffff8130b695
2012-02-14 07:30:10 <0> 0000000000000004 ffffffff81af6e08 0000000000000000 ffffffffa0a40000
2012-02-14 07:30:10 <0> ffff8806086f38d0 0000000022e6dabb ffff8806086f3930 0000000000000004
2012-02-14 07:30:10 Call Trace:
2012-02-14 07:30:10 [<ffffffff8130b695>] ? extract_entropy+0xe5/0x140
2012-02-14 07:30:10 [<ffffffffa0a40000>] ? ftrace_raw_event_ldiskfs_mb_release_group_pa+0x50/0xd0 [ldiskfs]
2012-02-14 07:30:10 [<ffffffffa0cb5f0d>] osd_capa_get+0x2cd/0x610 [osd_ldiskfs]
2012-02-14 07:30:10 [<ffffffff8119a417>] ? generic_getxattr+0x87/0x90
2012-02-14 07:30:10 [<ffffffffa0bd8ca0>] mdd_capa_get+0xa0/0x2c0 [mdd]
2012-02-14 07:30:10 [<ffffffffa0ce3ced>] cml_capa_get+0x6d/0x180 [cmm]
2012-02-14 07:30:10 [<ffffffffa0c1e270>] mo_capa_get+0x30/0x70 [mdt]
2012-02-14 07:30:10 [<ffffffffa0c29a81>] mdt_getattr_internal+0x6a1/0xc20 [mdt]
2012-02-14 07:30:10 [<ffffffffa0c2f3a2>] mdt_getattr_name_lock+0xb52/0x1540 [mdt]
2012-02-14 07:30:10 [<ffffffffa065000b>] ? __req_capsule_get+0x15b/0x5a0 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa062f524>] ? lustre_msg_get_flags+0x34/0x70 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa0c301dd>] mdt_intent_getattr+0x24d/0x3c0 [mdt]
2012-02-14 07:30:10 [<ffffffffa0c2dda9>] mdt_intent_policy+0x2d9/0x550 [mdt]
2012-02-14 07:30:10 [<ffffffffa0398b6f>] ? cfs_hash_bd_from_key+0x3f/0xc0 [libcfs]
2012-02-14 07:30:10 [<ffffffffa05f6ac2>] ldlm_lock_enqueue+0x272/0x7e0 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa0615206>] ldlm_handle_enqueue0+0x406/0xd70 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa0c2d94a>] mdt_enqueue+0x4a/0x100 [mdt]
2012-02-14 07:30:10 [<ffffffffa0c2674d>] mdt_handle_common+0x73d/0x12b0 [mdt]
2012-02-14 07:30:10 [<ffffffffa062f334>] ? lustre_msg_get_transno+0x54/0x90 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa0c27395>] mdt_regular_handle+0x15/0x20 [mdt]
2012-02-14 07:30:10 [<ffffffffa063b181>] ptlrpc_main+0xcd1/0x1690 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa063a4b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
2012-02-14 07:30:10 [<ffffffff8100c14a>] child_rip+0xa/0x20
2012-02-14 07:30:10 [<ffffffffa063a4b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
2012-02-14 07:30:10 [<ffffffffa063a4b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
2012-02-14 07:30:10 [<ffffffff8100c140>] ? child_rip+0x0/0x20
2012-02-14 07:30:10 Code: 05 0d dc ea ff 02 0f 85 44 01 00 00 48 8d 7d 80 ba 0f 00 00 00 be 04 00 00 00 e8 51 93 d3 e0 48 85 c0 48 89 c3 0f 84 e3 02 00 00 <48> 8b 40 50 8b 90 e0 00 00 00 41 39 d4 0f 83 a2 00 00 00 c1 e2
2012-02-14 07:30:10 RIP [<ffffffffa04fbc9b>] capa_encrypt_id+0x8b/0x3e0 [obdclass]
2012-02-14 07:30:10 RSP <ffff8806086f3880>
2012-02-14 07:30:10 CR2: 000000000000004e
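One way to read the register dump in the oops, offered as an inference rather than a confirmed root cause: the marked instruction bytes `<48> 8b 40 50` decode to `mov 0x50(%rax),%rax`, so the faulting address should be RAX plus 0x50. The short Python sketch below checks that arithmetic against CR2 and shows that RAX, read as a signed 64-bit value, is -2 rather than 0. On Linux, 2 is ENOENT, so this would be consistent with an error-encoded pointer getting past the NULL-only test (`48 85 c0`, `test %rax,%rax`) that precedes the load. The names `MASK64` and `to_signed` are illustrative only.

```python
MASK64 = (1 << 64) - 1

rax = 0xFFFFFFFFFFFFFFFE  # RAX from the register dump
cr2 = 0x4E                # CR2, the faulting address reported by the oops
disp = 0x50               # displacement in "48 8b 40 50" (mov 0x50(%rax),%rax)


def to_signed(value):
    """Interpret a 64-bit register value as a signed integer."""
    return value - (1 << 64) if value >> 63 else value


# Address arithmetic wraps at 64 bits, as it does in the CPU.
fault = (rax + disp) & MASK64
rax_signed = to_signed(rax)

print(f"computed fault address: {fault:#x}, CR2: {cr2:#x}")
print(f"RAX as a signed value: {rax_signed}")
```

If the computed address matches CR2, the "NULL pointer dereference" wording of the oops is somewhat misleading here: the kernel reports any fault in the first page that way, even when the base pointer is a small negative value rather than zero.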
The following messages were also logged every ten minutes for a few hours prior to the crash:
2012-02-14 06:01:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:01:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15581891 previous similar messages
2012-02-14 06:01:06 Feb 14 06:01:06 sumom32 kernel: LsrErr 72::fle_aa.c16fle_uhcp()qoc0040 ocpblt asbe asd<>utero:1560(itr_aac16fle_uhcpa) kpe 5881rvossmlrmsae
2012-02-14 06:11:06 LustreError: 17640:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:11:06 LustreError: 17640:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15585479 previous similar messages
2012-02-14 06:21:06 LustreError: 17718:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:21:06 LustreError: 17718:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15538316 previous similar messages
2012-02-14 06:31:06 LustreError: 17562:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:31:06 LustreError: 17562:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15628102 previous similar messages
2012-02-14 06:41:06 LustreError: 17746:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:41:06 LustreError: 17746:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15556085 previous similar messages
2012-02-14 06:51:06 LustreError: 17640:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 06:51:06 LustreError: 17640:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15610175 previous similar messages
2012-02-14 05:01:06 LustreError: 17613:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:01:06 LustreError: 17613:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15608069 previous similar messages
2012-02-14 05:01:06 Feb 14 05:01:06 sumom32 kernel: utero:1630(itrca:4:itrat_aa) eq/oc004:n aaiiyhsbe asd<>utero: 71::fle_aac16fle__aa) kpe 5009peiossmlrmsae
2012-02-14 05:11:06 LustreError: 17560:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:11:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:11:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15645006 previous similar messages
2012-02-14 05:11:06 LustreError: 17560:0:(filter_capa.c:146:filter_auth_capa()) Skipped 671 previous similar messages
2012-02-14 05:21:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:21:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15594099 previous similar messages
2012-02-14 05:31:06 LustreError: 17618:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:31:06 LustreError: 17618:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15582323 previous similar messages
2012-02-14 05:31:06 Feb 14 05:31:06 sumom32 kernel: utero:1560(itrcp.:4:itrat_aa) kpe 5406peiu iia esgs<>utero:1500:(itrcp.:4:itrat_aa) kpe 7 rvossml esgs3>utero:1680(itrcpa.:4:itrat_aa) e/p /x0 ocpblt a enpse
2012-02-14 05:41:06 LustreError: 17802:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:41:06 LustreError: 17802:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15637350 previous similar messages
2012-02-14 05:51:06 LustreError: 13417:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 05:51:06 LustreError: 13417:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15638302 previous similar messages
2012-02-14 04:01:06 LustreError: 17718:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:01:06 LustreError: 17718:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15621761 previous similar messages
2012-02-14 04:11:06 LustreError: 17560:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:11:06 LustreError: 17560:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15658332 previous similar messages
2012-02-14 04:21:06 LustreError: 13466:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:21:06 LustreError: 13466:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15618610 previous similar messages
2012-02-14 04:31:06 LustreError: 17467:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:31:06 LustreError: 17467:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15664435 previous similar messages
2012-02-14 04:41:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:41:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15568067 previous similar messages
2012-02-14 04:51:06 LustreError: 17467:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
2012-02-14 04:51:06 LustreError: 17467:0:(filter_capa.c:146:filter_auth_capa()) Skipped 15626923 previous similar messages
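To put the "Skipped N previous similar messages" counts in perspective, the following Python sketch estimates the rate at which filter_auth_capa() was hitting this error path. It assumes, as a labeled assumption, that each "Skipped" count covers the interval since the previous printed message. The two input lines are copied from the 07:01:06 and 07:11:06 entries of the console log; the helper name `parse` is hypothetical.

```python
from datetime import datetime

# Two consecutive "Skipped" lines copied from the console log.
LINES = [
    "2012-02-14 07:01:06 LustreError: 17526:0:(filter_capa.c:146:filter_auth_capa()) "
    "Skipped 15632468 previous similar messages",
    "2012-02-14 07:11:06 LustreError: 13499:0:(filter_capa.c:146:filter_auth_capa()) "
    "Skipped 15700830 previous similar messages",
]


def parse(line):
    """Return (timestamp, skipped count) for one 'Skipped' log line."""
    stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
    skipped = int(line.split("Skipped ")[1].split()[0])
    return stamp, skipped


(t0, _), (t1, skipped) = parse(LINES[0]), parse(LINES[1])
seconds = (t1 - t0).total_seconds()

# Assumption: the count on the later line covers the whole interval.
rate = skipped / seconds
print(f"{skipped} messages suppressed in {seconds:.0f} s, about {rate:.0f} per second")
```

Under that assumption the error path was being taken tens of thousands of times per second for hours before the crash.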
Issue Links
- Trackbacks
- Changelog 2.1: Changes from version 2.1.1 to version 2.1.2. Server support for kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1....