|
The failure is at:
https://maloo.whamcloud.com/test_sets/485ff720-8c4a-11e0-aab9-52540025f9af
In the MDS log:
09:06:50:general protection fault: 0000 [1] SMP
09:06:50:last sysfs file: /block/sdb/queue/max_sectors_kb
09:06:50:CPU 10
09:06:50:Modules linked in: cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) exportfs(U) mgs(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) jbd2(U) crc16(U) be2iscsi(U) ib_iser(U) iscsi_tcp(U) bnx2i(U) cnic(U) cxgb3i(U) libiscsi_tcp(U) libiscsi2(U) scsi_transport_iscsi2(U) scsi_transport_iscsi(U) nfs(U) fscache(U) nfs_acl(U) autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) cpufreq_ondemand(U) powernow_k8(U) freq_table(U) mperf(U) uio(U) iw_cxgb3(U) cxgb3(U) ib_srp(U) rds(U) ib_sdp(U) ib_ipoib(U) ipoib_helper(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) rdma_ucm(U) rdma_cm(U) ib_ucm(U) ib_uverbs(U) ib_umad(U) ib_cm(U) iw_cm(U) ib_addr(U) ib_sa(U) loop(U) dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) mlx4_ib(U) ib_mad(U) ib_core(U) mlx4_en(U) shpchp(U) sg(U) tpm_tis(U) i2c_piix4(U) mlx4_core(U) igb(U) tpm(U) tpm_bios(U) 8021q(U) pcspkr(U) i2c_core(U) k10temp(U) serio_raw(U) hwmon(U) amd64_edac_mod(U) dca(U) edac_mc(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
09:06:50:Pid: 6544, comm: obd_zombid Tainted: G 2.6.18-238.9.1.el5_lustre.gc66d831 #1
09:06:50:RIP: 0010:[<ffffffff800620b4>] [<ffffffff800620b4>] __memset+0xa8/0xc0
09:06:50:RSP: 0018:ffff810105881c28 EFLAGS: 00010206
09:06:50:RAX: 5a5a5a5a5a5a5a5a RBX: 5a5a5a5a5a5a5a5a RCX: 000000000000005a
09:06:50:RDX: 0000000000000000 RSI: 000000000000005a RDI: 5a5a5a5a5a5a5a5a
09:06:50:RBP: 0000000000000008 R08: 0000000000000010 R09: 0000000000000002
09:06:50:R10: 5a5a5a5a5a5a5a5a R11: 0000000000000128 R12: ffff81031a312ef8
09:06:50:R13: 0000000000000040 R14: 0000000000000000 R15: 00002ac2f4aa8010
09:06:50:FS: 00002ac2f4aa76e0(0000) GS:ffff810323adc340(0000) knlGS:0000000000000000
09:06:50:CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
09:06:50:CR2: 000000300b29a830 CR3: 0000000000201000 CR4: 00000000000006e0
09:06:50:Process obd_zombid (pid: 6544, threadinfo ffff810105880000, task ffff81041f579820)
09:06:50:Stack: ffffffff88c09435 0000000000000000 0000000000000000 ffffffff88c140e0
09:06:50: ffffffff00000128 5a5a5a5a5a5a5a5a 0000000000000000 ffffffff88c25640
09:06:50: ffffffff8898c912 00000000ffffffff ffff81031a312ef8 ffffffff88c25640
09:06:50:Call Trace:
09:06:50: [<ffffffff88c09435>] :osc:osc_session_fini+0xb5/0xd0
09:06:50: [<ffffffff8898c912>] :obdclass:key_fini+0xd2/0x190
09:06:50: [<ffffffff8898ce25>] :obdclass:lu_context_key_quiesce+0x55/0x80
09:06:50: [<ffffffff8898ce95>] :obdclass:lu_context_key_quiesce_many+0x45/0x80
09:06:50: [<ffffffff8898c24f>] :obdclass:keys_fill+0x4f/0x100
09:06:50: [<ffffffff8898cd34>] :obdclass:lu_context_init+0x1e4/0x200
09:06:50: [<ffffffff88c0915e>] :osc:osc_device_free+0x4e/0x1a0
09:06:50: [<ffffffff8898cd7c>] :obdclass:lu_env_init+0x2c/0x40
09:06:50: [<ffffffff8897006a>] :obdclass:class_decref+0x33a/0x5b0
09:06:50: [<ffffffff88956252>] :obdclass:obd_zombie_impexp_cull+0x402/0x4f0
09:06:50: [<ffffffff8895c08a>] :obdclass:obd_zombie_impexp_thread+0x19a/0x250
09:06:50: [<ffffffff8008e437>] default_wake_function+0x0/0xe
09:06:50: [<ffffffff8005dfb1>] child_rip+0xa/0x11
09:06:50: [<ffffffff8895bef0>] :obdclass:obd_zombie_impexp_thread+0x0/0x250
09:06:50: [<ffffffff8005dfa7>] child_rip+0x0/0x11
09:06:50:
09:06:50:
09:06:50:Code: 48 89 07 49 c7 c0 08 00 00 00 4d 29 c8 4c 01 c7 4d 29 c3 e9
09:06:50:RIP [<ffffffff800620b4>] __memset+0xa8/0xc0
09:06:50: RSP <ffff810105881c28>
09:06:50: <0>Kernel panic - not syncing: Fatal exception
=======================================
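I tend to think this is a general issue where the obd device was freed twice. obd_zombid faults in __memset, called from osc_session_fini, with RDI = 5a5a5a5a5a5a5a5a (a non-canonical address, hence the general protection fault) and a fill byte of 0x5a in RSI. The repeated 0x5a pattern looks like the poison written over freed memory, which would mean the pointer handed to osc_session_fini was read from a structure that had already been freed.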
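
To make the register evidence easier to check on other dumps, here is a minimal sketch that scans oops text for general-purpose registers holding a repeated poison byte and tests whether an address is canonical on x86-64. It assumes 0x5a is the free-poison byte and 48-bit virtual addressing. The names POISON_BYTE, poisoned_registers and is_canonical are made up for this illustration and are not part of Lustre or the kernel.

```python
import re

# Assumption: freed memory is filled with this byte before release.
POISON_BYTE = 0x5A
POISON_WORD = int.from_bytes(bytes([POISON_BYTE]) * 8, "little")

# Matches general-purpose registers printed as "RAX: <16 hex digits>".
# RIP/RSP lines carry a segment prefix ("0010:", "0018:") and are skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def is_canonical(addr):
    """With 48-bit virtual addresses, bits 63..47 must be all 0 or all 1."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def poisoned_registers(oops_text):
    """Return the sorted names of registers whose value is the poison word."""
    return sorted(
        name
        for name, value in REG_RE.findall(oops_text)
        if int(value, 16) == POISON_WORD
    )


if __name__ == "__main__":
    # Register lines copied from the oops above.
    sample = """
09:06:50:RAX: 5a5a5a5a5a5a5a5a RBX: 5a5a5a5a5a5a5a5a RCX: 000000000000005a
09:06:50:RDX: 0000000000000000 RSI: 000000000000005a RDI: 5a5a5a5a5a5a5a5a
09:06:50:RBP: 0000000000000008 R08: 0000000000000010 R09: 0000000000000002
09:06:50:R10: 5a5a5a5a5a5a5a5a R11: 0000000000000128 R12: ffff81031a312ef8
"""
    print(poisoned_registers(sample))         # ['R10', 'RAX', 'RBX', 'RDI']
    print(is_canonical(POISON_WORD))          # False
    print(is_canonical(0xFFFF81031A312EF8))   # True
```

RDI is the destination argument of __memset, so a poison word there means the pointer being cleared was itself loaded from poisoned memory. This fits a use-after-free or double free, but does not prove which object was freed early.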
|