Details
- Bug
- Resolution: Duplicate
- Blocker
- None
- None
- None
- 3
- 6327
Description
This issue was created by maloo for yujian <yujian@whamcloud.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/2121f7fa-fd8a-11e1-afe5-52540035b04c.
Info required for matching: large-scale 3a
Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/17
Console log on MDS (fat-intel-2):
Lustre: DEBUG MARKER: lctl get_param -n *.lustre-MDT0000.recovery_status
BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff8150057e>] _spin_lock+0xe/0x30
PGD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
CPU 13
Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm
------------[ cut here ]------------
WARNING: at kernel/sched_fair.c:132 load_balance_next_fair+0x6a/0x80() (Not tainted)
Hardware name: X8DTT-H
Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1
Call Trace:
 <IRQ> [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffff8105f1ba>] ? load_balance_next_fair+0x6a/0x80
 [<ffffffff8105fbc8>] ? load_balance_fair+0x208/0x2f0
 [<ffffffff8106052b>] ? rebalance_domains+0x27b/0x5a0
 [<ffffffff810a21d0>] ? tick_sched_timer+0x0/0xc0
 [<ffffffff8106089c>] ? run_rebalance_domains+0x4c/0x160
 [<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
 [<ffffffff81073ec1>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff81096c50>] ? hrtimer_interrupt+0x140/0x250
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ca5>] ? irq_exit+0x85/0x90
 [<ffffffff81506050>] ? smp_apic_timer_interrupt+0x70/0x9b
 [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
 <EOI> [<ffffffff812cdd0e>] ? intel_idle+0xde/0x170
 [<ffffffff812cdcf1>] ? intel_idle+0xc1/0x170
 [<ffffffff8109914d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff81407a97>] ? cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
 [<ffffffff814f714f>] ? start_secondary+0x22a/0x26d
---[ end trace cc917bd58c42f280 ]---
ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Tainted: G W --------------- 2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1 Supermicro X8DTT-H/X8DTT-H
RIP: 0010:[<ffffffff8150057e>] [<ffffffff8150057e>] _spin_lock+0xe/0x30
RSP: 0018:ffff8800282e3860 EFLAGS: 00010002
RAX: 0000000000010000 RBX: 000000000000adcc RCX: ffff880630129400
RDX: 00c0000000000080 RSI: 0000000000000000 RDI: 0000000000000040
RBP: ffff8800282e3860 R08: 0000000000000001 R09: 0000000003dcf503
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000286 R14: ffff88033fcd0340 R15: 0000000000009dc0
FS: 0000000000000000(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000040 CR3: 000000033052b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff880637cfe000, task ffff880337e2a040)
Stack:
 ffff8800282e38d0 ffffffff811645f3 ffff8800282e3920 ffffffff81476fc0
<d> 0000000b282e38a0 0000000000000001 ffffea0014c1fc00 ffff880630129400
<d> ffff8800282e38f0 0000000000000001 ffff880615472880 ffff880615472880
Call Trace:
 <IRQ> [<ffffffff811645f3>] kfree+0x2d3/0x320
 [<ffffffff81476fc0>] ? ip_queue_xmit+0x190/0x420
 [<ffffffff81430368>] skb_release_data+0xd8/0x110
 [<ffffffff8142fe9e>] __kfree_skb+0x1e/0xa0
 [<ffffffff81488f14>] tcp_ack+0x3b4/0x1280
 [<ffffffff8148bd0e>] ? tcp_transmit_skb+0x3fe/0x7b0
 [<ffffffff8150076b>] ? _spin_unlock_bh+0x1b/0x20
 [<ffffffffa08166c7>] ? ksocknal_read_callback+0x47/0x190 [ksocklnd]
 [<ffffffff8148a1bd>] tcp_rcv_established+0x3dd/0x800
 [<ffffffff814921b3>] tcp_v4_do_rcv+0x2e3/0x430
 [<ffffffff81493a2e>] tcp_v4_rcv+0x4fe/0x8d0
 [<ffffffff8142fec7>] ? __kfree_skb+0x47/0xa0
 [<ffffffff8147174d>] ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff814719d8>] ip_local_deliver+0x98/0xa0
 [<ffffffff81470e9d>] ip_rcv_finish+0x12d/0x440
 [<ffffffff81471425>] ip_rcv+0x275/0x350
 [<ffffffff8143ac2b>] __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8143cea8>] netif_receive_skb+0x58/0x60
 [<ffffffff8143cfb0>] napi_skb_finish+0x50/0x70
 [<ffffffff8143f4e9>] napi_gro_receive+0x39/0x50
 [<ffffffffa014220b>] e1000_receive_skb+0x5b/0x90 [e1000e]
 [<ffffffffa0145fa0>] e1000_clean_rx_irq+0x380/0x580 [e1000e]
 [<ffffffffa01447c5>] e1000_clean+0xb5/0x2c0 [e1000e]
 [<ffffffff8143f603>] net_rx_action+0x103/0x2f0
 [<ffffffff81073ec1>] __do_softirq+0xc1/0x1e0
 [<ffffffff810db930>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff81073f1f>] ? __do_softirq+0x11f/0x1e0
 [<ffffffff8100c24c>] call_softirq+0x1c/0x30
 [<ffffffff8100de85>] do_softirq+0x65/0xa0
 [<ffffffff81073ca5>] irq_exit+0x85/0x90
 [<ffffffff81505f65>] do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
 <EOI> [<ffffffff812cdd0e>] ? intel_idle+0xde/0x170
 [<ffffffff812cdcf1>] ? intel_idle+0xc1/0x170
 [<ffffffff81407a97>] cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009e06>] cpu_idle+0xb6/0x110
 [<ffffffff814f714f>] start_secondary+0x22a/0x26d
Code: e5 0f 1f 44 00 00 fa 66 0f 1f 44 00 00 f0 81 2f 00 00 00 01 74 05 e8 b2 e3 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 <f0> 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 eb f5
RIP [<ffffffff8150057e>] _spin_lock+0xe/0x30
RSP <ffff8800282e3860>
CR2: 0000000000000040
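For triage, the key fields of the oops in the console log above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: `summarize_oops` and `OOPS_EXCERPT` are hypothetical names, and the excerpt holds only a few lines copied from the log. It reports the faulting address, the faulting symbol, and any general-purpose register whose value equals the faulting address. Here that register is RDI, which carries the first function argument on x86_64, so the output is consistent with `_spin_lock` being called on a lock located at offset 0x40 from a NULL pointer.

```python
import re

# A few lines copied from the console log above; only what the sketch needs.
OOPS_EXCERPT = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff8150057e>] _spin_lock+0xe/0x30
RAX: 0000000000010000 RBX: 000000000000adcc RCX: ffff880630129400
RDX: 00c0000000000080 RSI: 0000000000000000 RDI: 0000000000000040
CR2: 0000000000000040
"""


def summarize_oops(text):
    """Extract the fault address, faulting symbol, and matching registers.

    Hypothetical helper for illustration; assumes the 2.6.32-era x86_64
    oops layout shown in this report.
    """
    fault = re.search(r"NULL pointer dereference at ([0-9a-f]{16})", text)
    ip = re.search(
        r"IP: \[<([0-9a-f]{16})>\] (\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", text
    )
    if fault is None or ip is None:
        raise ValueError("text does not look like a NULL-dereference oops")

    fault_addr = int(fault.group(1), 16)

    # Collect "NAME: value" register pairs such as "RDI: 0000000000000040".
    regs = dict(re.findall(r"\b(R[A-Z0-9]{2}|CR2): ([0-9a-f]{16})", text))

    # CR2 always holds the fault address, so it is excluded from the match list.
    matching = sorted(
        name
        for name, value in regs.items()
        if name != "CR2" and int(value, 16) == fault_addr
    )

    return {
        "fault_addr": fault_addr,
        "symbol": ip.group(2),
        "offset": int(ip.group(3), 16),
        "matching_regs": matching,
    }


if __name__ == "__main__":
    summary = summarize_oops(OOPS_EXCERPT)
    print(
        "fault at 0x%x in %s+0x%x, registers equal to fault address: %s"
        % (
            summary["fault_addr"],
            summary["symbol"],
            summary["offset"],
            ", ".join(summary["matching_regs"]) or "none",
        )
    )
```

The same function could be run on the full console log text; the excerpt is used here only to keep the example short.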
Issue Links
- duplicates
- LU-1881 sanity test 116 soft lockup
- Resolved