[LU-9069] ZFS sanity: FAIL: test_255a Speedup with willread is less than XX%, got YY% Created: 31/Jan/17 Updated: 01/Dec/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Casper | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | always_except, zfs |
| Environment: | onyx-64 thru 67 |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
sanity: FAIL: test_255a Speedup with willread is less than XX%, got YY%

The test expects reads with the willread hint to get a speedup comparable to cached reads, but the willread speedup is much lower. From the test log:

Iter 1/10: cache speedup: 228%
Iter 1/10: ladvise speedup: 24%
Iter 2/10: cache speedup: 147%
Iter 2/10: ladvise speedup: 20%
Iter 3/10: cache speedup: 170%
Iter 3/10: ladvise speedup: 10%

Related to Info required for matching: sanity 255a |
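For context, the comparison the test makes can be sketched roughly as below. This is not the sanity.sh code: the real test drops caches on the OSS nodes through the test framework and measures read bandwidth with a separate reads tool, whereas this standalone sketch only drops the local client cache and times dd; the file path, size, and helper names are illustrative assumptions.

    #!/bin/bash
    # Rough standalone sketch of the test_255a comparison (not the sanity.sh code).
    # FILE, SIZE_MB and the local drop_caches are illustrative assumptions; the
    # real test drops caches on the OSS nodes and measures read bandwidth.
    FILE=/mnt/lustre/ladvise_willread_sketch
    SIZE_MB=100

    dd if=/dev/zero of=$FILE bs=1M count=$SIZE_MB conv=fsync 2>/dev/null

    drop_local_cache() {
        # drop the local (client) page cache between runs
        echo 3 > /proc/sys/vm/drop_caches
    }

    timed_read() {
        # elapsed seconds for one sequential read of the whole file
        local t0=$(date +%s.%N)
        dd if=$FILE of=/dev/null bs=1M 2>/dev/null
        local t1=$(date +%s.%N)
        awk -v a=$t1 -v b=$t0 'BEGIN { print a - b }'
    }

    drop_local_cache
    t_origin=$(timed_read)     # baseline: supposedly uncached read
    t_cache=$(timed_read)      # second read, served from cache
    drop_local_cache
    lfs ladvise -a willread -s 0 -l $((SIZE_MB * 1048576)) $FILE
    t_ladvise=$(timed_read)    # read after the willread prefetch hint

    # speedup relative to the uncached baseline; speed is proportional to 1/time
    awk -v o=$t_origin -v c=$t_cache -v l=$t_ladvise 'BEGIN {
        printf "cache speedup: %d%%\n",   (o / c - 1) * 100
        printf "ladvise speedup: %d%%\n", (o / l - 1) * 100
    }'

The failure reported here is that the ladvise (willread) speedup stays far below the cache speedup, so the test's threshold check fails.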
| Comments |
| Comment by Andreas Dilger [ 03/Mar/17 ] |
|
It turns out that all of these real-world test failures are only failing with ZFS backends. That implies some problem with ladvise not prefetching the pages properly for ZFS, since the cached reads run much faster but the ladvise reads sometimes do not run faster with ZFS.

Looking at the output more closely, it appears that the cases where ZFS is passing do so only because the cached speedup is below the threshold (so the willread check is skipped):

Iter 1/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 1/10: Reading again without willread hint
Iter 1/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 1/10: cache speedup: 6.79256/6.53006=4%
Iter 1/10: ladvise speedup: 6.59651/6.53006=1%
Iter 2/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 2/10: Reading again without willread hint
Iter 2/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 2/10: cache speedup: 7.24601/6.77954=6%
Iter 2/10: ladvise speedup: 6.651/6.77954=-1%
Iter 3/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 3/10: Reading again without willread hint
Iter 3/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 3/10: cache speedup: 6.82836/6.4925=5%
Iter 3/10: ladvise speedup: 6.39501/6.4925=-1%
Iter 4/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 4/10: Reading again without willread hint
Iter 4/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 4/10: cache speedup: 6.80656/6.68438=1%
Iter 4/10: ladvise speedup: 6.52108/6.68438=-2%
Iter 5/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 5/10: Reading again without willread hint
Iter 5/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 5/10: cache speedup: 7.23196/6.44395=12%
Iter 5/10: ladvise speedup: 6.67935/6.44395=3%
Iter 6/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 6/10: Reading again without willread hint
Iter 6/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 6/10: cache speedup: 6.90188/6.73113=2%
Iter 6/10: ladvise speedup: 6.61056/6.73113=-1%
Iter 7/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 7/10: Reading again without willread hint
Iter 7/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 7/10: cache speedup: 6.93889/6.67545=3%
Iter 7/10: ladvise speedup: 6.4859/6.67545=-2%
Iter 8/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 8/10: Reading again without willread hint
Iter 8/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 8/10: cache speedup: 7.03663/6.86692=2%
Iter 8/10: ladvise speedup: 6.60757/6.86692=-3%
Iter 9/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 9/10: Reading again without willread hint
Iter 9/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 9/10: cache speedup: 7.18227/6.64533=8%
Iter 9/10: ladvise speedup: 6.56245/6.64533=-1%
Iter 10/10: reading without willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 10/10: Reading again without willread hint
Iter 10/10: reading with willread hint
CMD: trevis-34vm4 echo 3 > /proc/sys/vm/drop_caches
Iter 10/10: cache speedup: 6.96441/6.69985=3%
Iter 10/10: ladvise speedup: 6.46659/6.69985=-3%
Speedup with cache is less than 20% (4%), skipping check of speedup with willread: -1%

So it looks like the real problem is that "echo 3 > /proc/sys/vm/drop_caches" doesn't actually flush the cache on ZFS servers, so the "uncached speed" baseline is invalid. What is needed is to implement the "dontneed" ladvise advice for ZFS so that the test can use it to flush the server ARC cache for this file and run a valid comparison. |
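As a rough illustration of the proposed approach (a sketch under assumptions, not the actual patch or the sanity.sh code; the file path and length below are placeholders), a valid uncached baseline would require evicting both the client and the server caches before the timed read:

    # Hedged sketch: FILE and LEN are placeholders. drop_caches only empties
    # the local client page cache, so the server-side cache (the ARC on ZFS
    # OSTs) must be evicted explicitly.
    FILE=/mnt/lustre/ladvise_willread_sketch
    LEN=$((100 * 1048576))

    # client side: drop the local page cache
    echo 3 > /proc/sys/vm/drop_caches

    # server side: ask the OSTs to drop this file's cached data; this relies
    # on the dontneed advice being implemented for the ZFS OSD, which is what
    # this ticket's patch addresses
    lfs ladvise -a dontneed -s 0 -l $LEN $FILE

    # a timed read now measures genuinely uncached performance, so the later
    # willread comparison in test_255a becomes meaningful
    dd if=$FILE of=/dev/null bs=1M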
| Comment by Andreas Dilger [ 03/Mar/17 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/24907 |
| Comment by Gerrit Updater [ 01/May/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/24907/ |