Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
Lustre 2.1.3
-
None
-
Hyperion RHEL6 servers/clients
-
3
-
6351
Description
simul, running test #2 - file stat, shared mode
Clients having repeated soft CPU lockup.
Aug 11 10:45:27 ehyperion686 kernel: BUG: soft lockup - CPU#0 stuck for 67s! [simul:24453]
Aug 11 10:45:27 ehyperion686 kernel: BUG: soft lockup - CPU#0 stuck for 67s! [simul:24453]
Aug 11 10:45:27 ehyperion686 kernel: Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm dcdbas i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_en mlx4_core e1000e [last unloaded: cpufreq_ondemand]
Aug 11 10:45:27 ehyperion686 kernel: CPU 0
Aug 11 10:45:27 ehyperion686 kernel: Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm dcdbas i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_en mlx4_core e1000e [last unloaded: cpufreq_ondemand]
Aug 11 10:45:27 ehyperion686 kernel:
Aug 11 10:45:27 ehyperion686 kernel: Pid: 24453, comm: simul Not tainted 2.6.32-279.2.1.el6.x86_64 #1 Dell XS23-TY /XS23-TY
Aug 11 10:45:27 ehyperion686 kernel: RIP: 0010:[<ffffffff8150024e>] [<ffffffff8150024e>] _spin_lock+0x1e/0x30
Aug 11 10:45:27 ehyperion686 kernel: RSP: 0018:ffff880028203d90 EFLAGS: 00000297
Aug 11 10:45:27 ehyperion686 kernel: RAX: 000000000000ad35 RBX: ffff880028203d90 RCX: 0000000000001fa4
Aug 11 10:45:27 ehyperion686 kernel: RDX: 000000000000ad34 RSI: 0000000000000000 RDI: ffff8801a60486e8
Aug 11 10:45:27 ehyperion686 kernel: RBP: ffffffff8100bc13 R08: 8000000000000000 R09: 00004f07ef0d6d80
Aug 11 10:45:27 ehyperion686 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff880028203d10
Aug 11 10:45:27 ehyperion686 kernel: R13: ffffffff81aa5700 R14: ffff8801a60486e0 R15: ffffffff81505d0b
Aug 11 10:45:27 ehyperion686 kernel: FS: 00002aaaafc522c0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
Aug 11 10:45:27 ehyperion686 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 11 10:45:27 ehyperion686 kernel: CR2: 00002aaab10fe044 CR3: 0000000179036000 CR4: 00000000000006f0
Aug 11 10:45:27 ehyperion686 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 11 10:45:27 ehyperion686 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 11 10:45:27 ehyperion686 kernel: Process simul (pid: 24453, threadinfo ffff88033ac38000, task ffff88032e585500)
Aug 11 10:45:27 ehyperion686 kernel: Stack:
Aug 11 10:45:27 ehyperion686 kernel: ffff880028203dc0 ffffffffa026d120 0000000000000230 0000000000000000
Aug 11 10:45:27 ehyperion686 kernel: <d> ffffffff81aa5780 ffff88032dcfb480 ffff880028203e10 ffffffffa05860ca
Aug 11 10:45:27 ehyperion686 kernel: <d> ffff880028203e30 ffffffff810e0d44 0000000000000100 0000000000000008
Aug 11 10:45:27 ehyperion686 kernel: Call Trace:
Aug 11 10:45:27 ehyperion686 kernel: <IRQ>
11 10:45:27 ehyperion686 kernel: [<ffffffffa026d120>] ? lprocfs_counter_sub+0xc0/0x100 [lvfs]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05860ca>] ? ldlm_lock_free+0x3a/0x110 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff810e0d44>] ? rcu_process_dyntick+0x114/0x120
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04b9e7f>] ? class_handle_free_cb+0x2f/0x110 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff810e0dce>] ? force_quiescent_state+0x7e/0x1a0
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff810e12fd>] ? __rcu_process_callbacks+0x10d/0x330
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff810e154b>] ? rcu_process_callbacks+0x2b/0x50
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81073ec1>] ? __do_softirq+0xc1/0x1e0
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81096c50>] ? hrtimer_interrupt+0x140/0x250
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81073ca5>] ? irq_exit+0x85/0x90
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81505d10>] ? smp_apic_timer_interrupt+0x70/0x9b
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
Aug 11 10:45:27 ehyperion686 kernel: <EOI>
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04eec60>] ? check_and_discard_cb+0x0/0x140 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa026d140>] ? lprocfs_counter_sub+0xe0/0x100 [lvfs]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa026d120>] ? lprocfs_counter_sub+0xc0/0x100 [lvfs]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07dcc81>] ? lov_io_fini+0x211/0x3d0 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04f0548>] ? cl_io_init0+0xa8/0x190 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04f35a6>] ? cl_io_fini+0x76/0x160 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04efdd4>] ? cl_2queue_init+0x34/0xe0 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04eabeb>] ? cl_lock_page_out+0xeb/0x3b0 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04e3686>] ? cl_env_peek+0x36/0x110 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa040a0b7>] ? cfs_hash_bd_lookup_intent+0x37/0x120 [libcfs]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04eec60>] ? check_and_discard_cb+0x0/0x140 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa0779f0f>] ? osc_lock_flush+0x4f/0x90 [osc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa0779faa>] ? osc_lock_cancel+0x5a/0x170 [osc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04e9a65>] ? cl_lock_cancel0+0x75/0x130 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa04ea5ab>] ? cl_lock_cancel+0x11b/0x120 [obdclass]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa077b538>] ? osc_ldlm_blocking_ast+0x148/0x300 [osc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa0588be3>] ? ldlm_cancel_callback+0x63/0xf0 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05a3421>] ? ldlm_cli_cancel_local+0x71/0x350 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05a66af>] ? ldlm_cli_cancel_list_local+0xef/0x230 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa058db72>] ? ldlm_res_hop_get_locked+0x12/0x20 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05a6958>] ? ldlm_cancel_resource_local+0x168/0x280 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa0590044>] ? ldlm_resource_get+0xa4/0x6e0 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07655ff>] ? osc_destroy+0x11f/0x8f0 [osc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07baa5d>] ? lov_set_add_req+0x2d/0x50 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07bf9c2>] ? lov_prep_destroy_set+0x372/0xa40 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07aac37>] ? lov_destroy+0x357/0xcd0 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05d0465>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07d00c3>] ? lsm_unpackmd_v1+0xf3/0x200 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa07b5474>] ? lov_alloc_memmd+0x54/0x360 [lov]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05fc611>] ? req_capsule_set_size+0x71/0x1b0 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa085af51>] ? ll_objects_destroy+0x571/0x1a90 [lustre]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05d0465>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa05fcc4b>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa0855546>] ? ll_update_times+0x26/0x110 [lustre]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffffa085cf7f>] ? ll_unlink+0x35f/0x780 [lustre]
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81188d3f>] ? vfs_unlink+0x9f/0xe0
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81187a8a>] ? lookup_hash+0x3a/0x50
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8118b273>] ? do_unlinkat+0x183/0x1c0
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff81180786>] ? sys_newstat+0x36/0x50
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8100c5b5>] ? math_state_restore+0x45/0x60
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8118b2c6>] ? sys_unlink+0x16/0x20
Aug 11 10:45:27 ehyperion686 kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Aug 11 10:45:27 ehyperion686 kernel: Code: 00 00 00 01 74 05 e8 e2 e3 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
Attachments
Issue Links
- Trackbacks
-
Lustre 2.1.3 release testing tracker
Lustre 2.1.3 RC1 Tag: v213RC1 Build: