[LU-9097] sanity test_253 test_255: ZFS list corruption Created: 09/Feb/17 Updated: 17/Apr/17 Resolved: 17/Apr/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Alex Zhuravlev |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/937abbdc-e5e0-11e6-978f-5254006e85c2.

The sub-test test_253 timed out. Looking at the console log on the server, it appears to be ZFS list corruption:

00:50:58:[ 4422.775971] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
00:50:58:[ 4422.775971] list_del corruption. prev->next should be ffffc900031a3010, but was (null)
00:50:58:[ 4422.795082] CPU: 0 PID: 32 Comm: kswapd0 Tainted: P OE ------------ 3.10.0-514.2.2.el7_lustre.x86_64 #1
00:50:58:[ 4422.815142] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
00:50:58:[ 4422.815142] Call Trace:
00:50:58:[ 4422.815142] [<ffffffff81686318>] dump_stack+0x19/0x1b
00:50:58:[ 4422.815142] [<ffffffff81085940>] warn_slowpath_common+0x70/0xb0
00:50:58:[ 4422.815142] [<ffffffff810859dc>] warn_slowpath_fmt+0x5c/0x80
00:50:58:[ 4422.815142] [<ffffffff81333301>] __list_del_entry+0xa1/0xd0
00:50:58:[ 4422.815142] [<ffffffff8133333d>] list_del+0xd/0x30
00:50:58:[ 4422.815142] [<ffffffffa065cf1d>] __spl_cache_flush+0xed/0x150 [spl]
00:50:58:[ 4422.815142] [<ffffffffa065d046>] spl_cache_flush+0x36/0x50 [spl]
00:50:58:[ 4422.815142] [<ffffffffa065d71f>] spl_kmem_cache_reap_now+0x10f/0x120 [spl]
00:50:58:[ 4422.815142] [<ffffffffa070b3c9>] arc_kmem_reap_now+0x79/0xe0 [zfs]
00:50:58:[ 4422.815142] [<ffffffffa0710bb7>] arc_shrinker_func+0x97/0x130 [zfs]
00:50:58:[ 4422.815142] [<ffffffff81194213>] shrink_slab+0x163/0x330
00:50:58:[ 4422.815142] [<ffffffff811f5361>] ? vmpressure+0x21/0x90
00:50:58:[ 4422.815142] [<ffffffff81198001>] balance_pgdat+0x4b1/0x5e0
00:50:58:[ 4422.815142] [<ffffffff811982a3>] kswapd+0x173/0x450

Please provide additional information about the failure here.

Info required for matching: sanity 253 |
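For readers unfamiliar with the "list_del corruption" warning in the log above, the following is a simplified user-space sketch of the kind of neighbour-consistency check a doubly linked list can perform before unlinking an entry. It is not the kernel's lib/list_debug.c source. The helper name `list_del_entry_valid` and the exact message layout are assumptions made for illustration. The case reported in this ticket corresponds to the first branch: the previous node's `next` pointer no longer points back at the entry being removed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal circular doubly linked list node, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical helper (illustrative name): returns true when both neighbours
 * of 'entry' still point back at it, and prints a diagnostic otherwise.
 * Assumes entry->prev and entry->next are themselves valid, non-NULL pointers.
 */
static bool list_del_entry_valid(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    if (prev->next != entry) {
        fprintf(stderr,
                "list_del corruption. prev->next should be %p, but was %p\n",
                (void *)entry, (void *)prev->next);
        return false;
    }
    if (next->prev != entry) {
        fprintf(stderr,
                "list_del corruption. next->prev should be %p, but was %p\n",
                (void *)entry, (void *)next->prev);
        return false;
    }
    return true;
}
```

A failed check of this kind only shows that the list was already inconsistent when the deletion was attempted. It does not identify which code path overwrote the pointer, which is why the trace alone cannot distinguish a list-handling bug from unrelated memory corruption.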
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 14/Feb/17 ] |
|
Hi Alex, Can you please look into this one? Thanks. |
| Comment by Andreas Dilger [ 14/Feb/17 ] |
|
Alex, can you please look into this? It needs to be traced back to when it started happening: whether it was introduced by one of your recent patch landings, is related to the update to a newer ZFS release, or is some kind of random memory corruption. |
| Comment by Alex Zhuravlev [ 17/Apr/17 ] |
|
This is a duplicate of LU-9110.