[LU-8565] sanity test 255a fails with ‘Speedup with willread is less than X%, got Y%’ Created: 29/Aug/16 Updated: 09/Feb/17 Resolved: 08/Sep/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | James Nunez (Inactive) | Assignee: | Li Xi (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | autotest |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
sanity test 255a fails with

    sanity test_255a: @@@@@@ FAIL: Speedup with willread is less than 88%, got 7%

where the percentage values differ for each failure.

From one of the test failures, we can see that the WILLREAD advice does not help reads and, for this test, does not outperform the benefit of reading from cache:

    Iter 1/10: cache speedup: 98%
    Iter 1/10: ladvise speedup: -7%
    Iter 3/10: cache speedup: 131%
    Iter 3/10: ladvise speedup: 8%
    Iter 8/10: cache speedup: 108%
    Iter 8/10: ladvise speedup: 0%

For this test, we compare the speedup gained with the WILLREAD advice against half the benefit of reading from cache:

    local lowest_speedup=$((average_cache / 2))
    [ $average_ladvise -gt $lowest_speedup ] ||
        error "Speedup with willread is less than $lowest_speedup%,"\
              "got $average_ladvise%"

So the speedup from the willread advice to ladvise is not as great as expected. The WILLREAD hint was added with patch http://review.whamcloud.com/#/c/12458

Logs for recent failures are at

Info required for matching: sanity 255a |
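To make the threshold concrete, here is a minimal sketch that applies the same rule to the three iterations quoted in the description. It assumes the averages are plain integer means, which the ticket does not state, and it uses only three of the ten iterations, so the numbers will not match the 88% and 7% in the failure message. The `average` helper is hypothetical and not part of sanity.sh.

```shell
#!/bin/bash
# Sketch of the test_255a pass/fail rule, fed with the three iterations
# quoted in the description (the real test runs ten).

# Integer mean of the arguments. Assumed helper, not taken from sanity.sh.
average() {
	local sum=0 v
	for v in "$@"; do
		sum=$((sum + v))
	done
	echo $((sum / $#))
}

cache_speedups=(98 131 108)	# "cache speedup" percentages from the log
ladvise_speedups=(-7 8 0)	# "ladvise speedup" percentages from the log

average_cache=$(average "${cache_speedups[@]}")
average_ladvise=$(average "${ladvise_speedups[@]}")

# willread must deliver more than half of the cache speedup
lowest_speedup=$((average_cache / 2))

if [ "$average_ladvise" -gt "$lowest_speedup" ]; then
	result="PASS"
else
	result="FAIL: Speedup with willread is less than $lowest_speedup%, got $average_ladvise%"
fi
echo "$result"
```

With these inputs the cache average is 112%, the threshold is 56%, and the ladvise average is 0%, so the check fails by a wide margin rather than on timing noise alone.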
| Comments |
| Comment by Evan D. Chen (Inactive) [ 30/Aug/16 ] |
|
Li Xi, can you help take a look at this ticket? |
| Comment by Oleg Drokin [ 30/Aug/16 ] |
|
So it's clear the test needs to be revisited. |
| Comment by Li Xi (Inactive) [ 07/Sep/16 ] |
|
I agree that the test should be skipped for now. It seems the performance improvement of "ladvise willread" is not as high as expected in some environments. |
| Comment by Andreas Dilger [ 07/Sep/16 ] |
|
Li Xi, can you please submit a patch to change test_255a to not fail if the performance isn't as expected when running in a VM, like test_248:

test_248() {
	local my_error=error
	:
	:
	# This test case is time sensitive and Maloo uses KVM to run autotest.
	# Therefore the complete time of I/O task is unreliable and depends on
	# the workload on the host machine when the task is running.
	local virt=$(running_in_vm)
	[ -n "$virt" ] && echo "running in VM '$virt', ignore error" &&
		my_error="error_ignore env=$virt"
	:
	:
	# verify that fast read is 4 times faster for cache read
	[ $(bc <<< "4 * $t_fast < $t_slow") -eq 1 ] ||
		$my_error "fast read was not 4 times faster: $t_fast vs $t_slow"
}

This is causing failures about once a day on other patches. |
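As an illustration of how the test_248 pattern might carry over to the willread check, here is a hedged sketch. It is not the content of the patch that later landed. The function name `check_willread_speedup` and the `FAKE_VIRT` variable are hypothetical, and `running_in_vm`, `error`, and `error_ignore` are simplified stand-ins for the test-framework helpers so the snippet runs on its own.

```shell
#!/bin/bash
# Stand-ins for test-framework.sh helpers. The real implementations differ;
# only the calling pattern matters here.
running_in_vm() { echo "${FAKE_VIRT:-}"; }	# prints hypervisor name, or nothing
error() { echo "FAIL: $*"; return 1; }
error_ignore() { echo "IGNORED ($1): ${*:2}"; return 0; }

# Hypothetical wrapper around the test_255a threshold check.
check_willread_speedup() {
	local average_cache=$1
	local average_ladvise=$2
	local my_error=error

	# Timing is unreliable under virtualization, so downgrade the failure.
	local virt=$(running_in_vm)
	[ -n "$virt" ] && echo "running in VM '$virt', ignore error" &&
		my_error="error_ignore env=$virt"

	local lowest_speedup=$((average_cache / 2))
	[ "$average_ladvise" -gt "$lowest_speedup" ] ||
		$my_error "Speedup with willread is less than $lowest_speedup%," \
			"got $average_ladvise%"
}

# In a VM the shortfall is reported but the function still returns success.
FAKE_VIRT=kvm
check_willread_speedup 112 0 && echo "VM: test continues"

# On real hardware the same numbers are a hard failure.
FAKE_VIRT=
check_willread_speedup 112 0 || echo "hardware: test fails"
```

The design choice is that the threshold itself stays unchanged: only the severity of a miss depends on the environment, so real-hardware runs keep their full coverage.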
| Comment by Gerrit Updater [ 08/Sep/16 ] |
|
Gu Zheng (gzheng@ddn.com) uploaded a new patch: http://review.whamcloud.com/22375 |
| Comment by Gerrit Updater [ 08/Sep/16 ] |
|
Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/22375/ |
| Comment by Andreas Dilger [ 08/Sep/16 ] |
|
Patch landed for 2.9.0. Other patches would need to rebase to get this fix, but it is also unlikely that they would hit the same failure twice if retesting. There may still be a few failures in the next few days before patches are updated to include this fix. |
| Comment by Gerrit Updater [ 16/Jan/17 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/24907 |
| Comment by Andreas Dilger [ 09/Feb/17 ] |
|
LU-9069 is for test failures on real hardware. |