Details
Bug
Resolution: Duplicate
Blocker
None
Lustre 2.3.0
None
3
4077
Description
This issue was created by maloo for yujian <yujian@whamcloud.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/7b8753ea-fcb6-11e1-b09c-52540035b04c.
The sub-test test_3 failed with the following error:
test failed to respond and timed out
Info required for matching: performance-sanity 3
Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/17
Console log on the MDS (client-20):
22:21:56:Lustre: DEBUG MARKER: /usr/sbin/lctl mark ===== mdsrate-create-small.sh ### 2 NODES CREATE with 3 threads per client ###
22:21:56:Lustre: DEBUG MARKER: ===== mdsrate-create-small.sh
22:21:56:LustreError: 27339:0:(osd_iam_lfix.c:190:iam_lfix_init()) Wrong magic in node 74310 (#46): 0x0 != 0x1976 or wrong count: 0 (170)
22:21:56:BUG: unable to handle kernel paging request at 0000000900000000
22:21:56:IP: [<ffffffff811ad107>] __find_get_block_slow+0x87/0x130
22:21:56:PGD 0
22:21:56:Oops: 0000 [#1] SMP
22:21:56:last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
22:21:56:CPU 2
22:21:56:Modules linked in: cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa igb mlx4_ib ib_mad ib_core mlx4_en mlx4_core microcode serio_raw sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: obdecho]
22:21:56:
22:21:56:Pid: 27339, comm: mdt00_000 Not tainted 2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1 Supermicro X8DTT/X8DTT
22:21:56:RIP: 0010:[<ffffffff811ad107>] [<ffffffff811ad107>] __find_get_block_slow+0x87/0x130
22:21:56:RSP: 0018:ffff8802da0234c0 EFLAGS: 00010282
22:21:56:RAX: 0000000900000000 RBX: ffffea000a74e210 RCX: ffff88033523e848
22:21:56:RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88032c7cf468
22:21:56:RBP: ffff8802da0234f0 R08: 0000000000000003 R09: ffffea000a74e218
22:21:56:------------[ cut here ]------------
22:21:56:WARNING: at kernel/sched_fair.c:132 load_balance_next_fair+0x6a/0x80() (Not tainted)
22:21:56:Hardware name: X8DTT
22:21:56:Modules linked in: cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa igb mlx4_ib ib_mad ib_core mlx4_en mlx4_core microcode serio_raw sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: obdecho]
22:21:56:Pid: 0, comm: swapper Not tainted 2.6.32-279.5.1.el6_lustre.g634f764.x86_64 #1
22:21:56:Call Trace:
22:21:56: <IRQ> [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
22:21:56: [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
22:21:56: [<ffffffff8105f1ba>] ? load_balance_next_fair+0x6a/0x80
22:21:56: [<ffffffff8105fbc8>] ? load_balance_fair+0x208/0x2f0
22:21:56: [<ffffffff8106052b>] ? rebalance_domains+0x27b/0x5a0
22:21:56: [<ffffffff810a1dba>] ? tick_program_event+0x2a/0x30
22:21:56: [<ffffffff8102b4db>] ? lapic_timer_broadcast+0x1b/0x20
22:21:56: [<ffffffff8106089c>] ? run_rebalance_domains+0x4c/0x160
22:21:56: [<ffffffff81073ec1>] ? __do_softirq+0xc1/0x1e0
22:21:56: [<ffffffff810db930>] ? handle_IRQ_event+0x60/0x170
22:21:56: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
22:21:56: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
22:21:56: [<ffffffff81073ca5>] ? irq_exit+0x85/0x90
22:21:56: [<ffffffff81505f65>] ? do_IRQ+0x75/0xf0
22:21:56: [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
22:21:56: <EOI> [<ffffffff812cdd0e>] ? intel_idle+0xde/0x170
22:21:56: [<ffffffff812cdcf1>] ? intel_idle+0xc1/0x170
22:21:56: [<ffffffff8109914d>] ? sched_clock_cpu+0xcd/0x110
22:21:56: [<ffffffff81407a97>] ? cpuidle_idle_call+0xa7/0x140
22:21:56: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
22:21:56: [<ffffffff814e47aa>] ? rest_init+0x7a/0x80
22:21:56: [<ffffffff81c21f7b>] ? start_kernel+0x424/0x430
22:21:56: [<ffffffff81c2133a>] ? x86_64_start_reservations+0x125/0x129
22:21:56: [<ffffffff81c21438>] ? x86_64_start_kernel+0xfa/0x109
22:21:56:---[ end trace 17f4d327b142744d ]---
Issue Links
- duplicates: LU-1881 sanity test 116 soft lockup (Resolved)