Lustre / LU-1927

large-scale subtest test_3a: Oops: RIP: _spin_lock+0xe/0x30

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • 3
    • 6327

    Description

      This issue was created by maloo for yujian <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/2121f7fa-fd8a-11e1-afe5-52540035b04c.

      Info required for matching: large-scale 3a

      Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/17

      Console log on MDS (fat-intel-2):

      Lustre: DEBUG MARKER: lctl get_param -n *.lustre-MDT0000.recovery_status
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
      IP: [<ffffffff8150057e>] _spin_lock+0xe/0x30
      PGD 0 
      Oops: 0002 [#1] SMP 
      last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
      CPU 13 
      Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm
      ------------[ cut here ]------------
      WARNING: at kernel/sched_fair.c:132 load_balance_next_fair+0x6a/0x80() (Not tainted)
      Hardware name: X8DTT-H
      Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      Pid: 0, comm: swapper Not tainted 2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1
      Call Trace:
       <IRQ>  [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
       [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
       [<ffffffff8105f1ba>] ? load_balance_next_fair+0x6a/0x80
       [<ffffffff8105fbc8>] ? load_balance_fair+0x208/0x2f0
       [<ffffffff8106052b>] ? rebalance_domains+0x27b/0x5a0
       [<ffffffff810a21d0>] ? tick_sched_timer+0x0/0xc0
       [<ffffffff8106089c>] ? run_rebalance_domains+0x4c/0x160
       [<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
       [<ffffffff81073ec1>] ? __do_softirq+0xc1/0x1e0
       [<ffffffff81096c50>] ? hrtimer_interrupt+0x140/0x250
       [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
       [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
       [<ffffffff81073ca5>] ? irq_exit+0x85/0x90
       [<ffffffff81506050>] ? smp_apic_timer_interrupt+0x70/0x9b
       [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
       <EOI>  [<ffffffff812cdd0e>] ? intel_idle+0xde/0x170
       [<ffffffff812cdcf1>] ? intel_idle+0xc1/0x170
       [<ffffffff8109914d>] ? sched_clock_cpu+0xcd/0x110
       [<ffffffff81407a97>] ? cpuidle_idle_call+0xa7/0x140
       [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
       [<ffffffff814f714f>] ? start_secondary+0x22a/0x26d
      ---[ end trace cc917bd58c42f280 ]---
       ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      
      Pid: 0, comm: swapper Tainted: G        W  ---------------    2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1 Supermicro X8DTT-H/X8DTT-H
      RIP: 0010:[<ffffffff8150057e>]  [<ffffffff8150057e>] _spin_lock+0xe/0x30
      RSP: 0018:ffff8800282e3860  EFLAGS: 00010002
      RAX: 0000000000010000 RBX: 000000000000adcc RCX: ffff880630129400
      RDX: 00c0000000000080 RSI: 0000000000000000 RDI: 0000000000000040
      RBP: ffff8800282e3860 R08: 0000000000000001 R09: 0000000003dcf503
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000286 R14: ffff88033fcd0340 R15: 0000000000009dc0
      FS:  0000000000000000(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000040 CR3: 000000033052b000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff880637cfe000, task ffff880337e2a040)
      Stack:
       ffff8800282e38d0 ffffffff811645f3 ffff8800282e3920 ffffffff81476fc0
      <d> 0000000b282e38a0 0000000000000001 ffffea0014c1fc00 ffff880630129400
      <d> ffff8800282e38f0 0000000000000001 ffff880615472880 ffff880615472880
      Call Trace:
       <IRQ> 
       [<ffffffff811645f3>] kfree+0x2d3/0x320
       [<ffffffff81476fc0>] ? ip_queue_xmit+0x190/0x420
       [<ffffffff81430368>] skb_release_data+0xd8/0x110
       [<ffffffff8142fe9e>] __kfree_skb+0x1e/0xa0
       [<ffffffff81488f14>] tcp_ack+0x3b4/0x1280
       [<ffffffff8148bd0e>] ? tcp_transmit_skb+0x3fe/0x7b0
       [<ffffffff8150076b>] ? _spin_unlock_bh+0x1b/0x20
       [<ffffffffa08166c7>] ? ksocknal_read_callback+0x47/0x190 [ksocklnd]
       [<ffffffff8148a1bd>] tcp_rcv_established+0x3dd/0x800
       [<ffffffff814921b3>] tcp_v4_do_rcv+0x2e3/0x430
       [<ffffffff81493a2e>] tcp_v4_rcv+0x4fe/0x8d0
       [<ffffffff8142fec7>] ? __kfree_skb+0x47/0xa0
       [<ffffffff8147174d>] ip_local_deliver_finish+0xdd/0x2d0
       [<ffffffff814719d8>] ip_local_deliver+0x98/0xa0
       [<ffffffff81470e9d>] ip_rcv_finish+0x12d/0x440
       [<ffffffff81471425>] ip_rcv+0x275/0x350
       [<ffffffff8143ac2b>] __netif_receive_skb+0x49b/0x6f0
       [<ffffffff8143cea8>] netif_receive_skb+0x58/0x60
       [<ffffffff8143cfb0>] napi_skb_finish+0x50/0x70
       [<ffffffff8143f4e9>] napi_gro_receive+0x39/0x50
       [<ffffffffa014220b>] e1000_receive_skb+0x5b/0x90 [e1000e]
       [<ffffffffa0145fa0>] e1000_clean_rx_irq+0x380/0x580 [e1000e]
       [<ffffffffa01447c5>] e1000_clean+0xb5/0x2c0 [e1000e]
       [<ffffffff8143f603>] net_rx_action+0x103/0x2f0
       [<ffffffff81073ec1>] __do_softirq+0xc1/0x1e0
       [<ffffffff810db930>] ? handle_IRQ_event+0x60/0x170
       [<ffffffff81073f1f>] ? __do_softirq+0x11f/0x1e0
       [<ffffffff8100c24c>] call_softirq+0x1c/0x30
       [<ffffffff8100de85>] do_softirq+0x65/0xa0
       [<ffffffff81073ca5>] irq_exit+0x85/0x90
       [<ffffffff81505f65>] do_IRQ+0x75/0xf0
       [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
       <EOI> 
       [<ffffffff812cdd0e>] ? intel_idle+0xde/0x170
       [<ffffffff812cdcf1>] ? intel_idle+0xc1/0x170
       [<ffffffff81407a97>] cpuidle_idle_call+0xa7/0x140
       [<ffffffff81009e06>] cpu_idle+0xb6/0x110
       [<ffffffff814f714f>] start_secondary+0x22a/0x26d
      Code: e5 0f 1f 44 00 00 fa 66 0f 1f 44 00 00 f0 81 2f 00 00 00 01 74 05 e8 b2 e3 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 <f0> 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 eb f5 
      RIP  [<ffffffff8150057e>] _spin_lock+0xe/0x30
       RSP <ffff8800282e3860>
      CR2: 0000000000000040
      

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 4
