Details
Bug
Resolution: Cannot Reproduce
Minor
Description
This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/974a9cae-5b77-11e5-bdf5-5254006e85c2.
The sub-test test_70c timed out with the following error in the OSS console log:
23:32:48:LustreError: 168-f: BAD WRITE CHECKSUM: lustre-OST0001 from 12345-10.1.4.189@tcp inode [0x20000560a:0x3254:0x0] object 0x0:8236 extent [2097152-3143167]: client csum a73c8811, server csum 9be5c892
23:32:48:general protection fault: 0000 [#1] SMP
23:32:48:last sysfs file: /sys/devices/system/cpu/online
23:32:48:CPU 0
23:32:48:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode serio_raw virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
23:32:48:
23:32:48:Pid: 4821, comm: socknal_sd00_01 Tainted: P -- ------------ 2.6.32-573.3.1.el6_lustre.g43c6468.x86_64 #1 Red Hat KVM
23:32:48:RIP: 0010:[<ffffffff8113e229>] [<ffffffff8113e229>] put_page+0x9/0x40
23:32:48:RSP: 0018:ffff88003710f900 EFLAGS: 00010206
23:32:48:RAX: 0000000000000030 RBX: 0000000000000001 RCX: ffff880068090000
23:32:48:RDX: ffff880068090640 RSI: ffff88006809060c RDI: 00f8100c00000003
23:32:48:RBP: ffff88003710f900 R08: 00f80ed400000003 R09: 00f80e1c00000190
23:32:48:R10: ffff880077cfe840 R11: ffff880077cfe8f0 R12: ffff88006d4950c0
23:32:48:R13: ffff88006d4950f8 R14: ffff880077cfec9c R15: 0000000000000000
23:32:48:FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
23:32:48:CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
23:32:48:CR2: 00007fce6bd77000 CR3: 000000007b976000 CR4: 00000000000006f0
23:32:48:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
23:32:48:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
23:32:48:Process socknal_sd00_01 (pid: 4821, threadinfo ffff88003710c000, task ffff880037a31520)
23:32:48:Stack:
23:32:48: ffff88003710f920 ffffffff8145e84f ffff88006d4950c0 0000000000000000
23:32:48:<d> ffff88003710f940 ffffffff8145e3de ffff880077cfec9c ffff88006d4950c0
23:32:48:<d> ffff88003710fa70 ffffffff814b7326 ffff88003710f970 ffff8800378c1080
23:32:48:Call Trace:
23:32:48: [<ffffffff8145e84f>] skb_release_data+0x7f/0x110
23:32:48: [<ffffffff8145e3de>] __kfree_skb+0x1e/0xa0
23:32:48: [<ffffffff814b7326>] tcp_recvmsg+0xfe6/0x10f0
23:32:48: [<ffffffff814d812a>] inet_recvmsg+0x5a/0x90
23:32:48: [<ffffffff814584d3>] sock_recvmsg+0x133/0x160
23:32:48: [<ffffffff81458544>] kernel_recvmsg+0x44/0x60
23:32:48: [<ffffffffa0d60965>] ksocknal_lib_recv_kiov+0x165/0x3d0 [ksocklnd]
23:32:48: [<ffffffffa0d5a07f>] ksocknal_process_receive+0x2af/0xed0 [ksocklnd]
23:32:48: [<ffffffffa0d5c62b>] ksocknal_scheduler+0x12b/0x1390 [ksocklnd]
23:32:48: [<ffffffff810a101e>] kthread+0x9e/0xc0
Other failures show different types of memory corruption, for example:
https://testing.hpdd.intel.com/test_sets/a2f995dc-59ab-11e5-aac5-5254006e85c2
17:39:05:WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Tainted: P -- ------------ )
17:39:05:Hardware name: KVM
17:39:05:list_del corruption. prev->next should be ffff88006b844000, but was 00040010042802a8
17:39:05:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode serio_raw virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
17:39:05:Pid: 11, comm: events/0 Tainted: P -- ------------ 2.6.32-573.3.1.el6_lustre.gde57418.x86_64 #1
17:39:05:Call Trace:
17:39:05: [<ffffffff81077491>] ? warn_slowpath_common+0x91/0xe0
17:39:05: [<ffffffff81077596>] ? warn_slowpath_fmt+0x46/0x60
17:39:05: [<ffffffff812a40ae>] ? list_del+0x6e/0xa0
17:39:05: [<ffffffff811796f8>] ? free_block+0xc8/0x170
17:39:05: [<ffffffff811799d1>] ? drain_array+0xc1/0x100
17:39:05: [<ffffffff8117a8be>] ? cache_reap+0x8e/0x250
17:39:05: [<ffffffff8117a830>] ? cache_reap+0x0/0x250
17:39:05: [<ffffffff8109a7d0>] ? worker_thread+0x170/0x2a0
17:39:05: [<ffffffff810a14b0>] ? autoremove_wake_function+0x0/0x40
17:39:05: [<ffffffff8109a660>] ? worker_thread+0x0/0x2a0
17:39:05: [<ffffffff810a101e>] ? kthread+0x9e/0xc0
Info required for matching: replay-single 70c