[LU-8738] sanity test_255b: FAIL: Ladvise willread should use more memory than 76800 KiB Created: 20/Oct/16 Updated: 21/Nov/16 Resolved: 21/Nov/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jian Yu | Assignee: | James Nunez (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
sanity test 255b failed as follows:

== sanity test 255b: check 'lfs ladvise -a dontneed' ================================================= 17:21:04 (1476811264)
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.778124 s, 135 MB/s
CMD: trevis-41vm4 cat /proc/meminfo | grep ^MemTotal:
Total memory: 1923480 KiB
CMD: trevis-41vm4 sync && echo 3 > /proc/sys/vm/drop_caches
CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
Cache used before read: 72592 KiB
CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
Cache used after read: 120972 KiB
CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
Cache used after dontneed ladvise: 18572 KiB
sanity test_255b: @@@@@@ FAIL: Ladvise willread should use more memory than 76800 KiB

Maloo reports: |
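For context, a minimal sketch of the measurement the test performs, assuming a few invented names: this is not the actual sanity.sh code, FILE and the cached_kib helper are hypothetical, and OSS stands for the node hosting ost1 (trevis-41vm4 in the log above); drop_caches requires root there.

```bash
#!/bin/bash
# Sketch only, NOT the actual sanity.sh test code.
FILE=/mnt/lustre/f255b   # hypothetical file path
OSS=trevis-41vm4         # node hosting ost1, per the log above

cached_kib() {
    # Read the Cached: value (in KiB) from /proc/meminfo on the OSS.
    ssh "$OSS" "awk '/^Cached:/ {print \$2}' /proc/meminfo"
}

dd if=/dev/zero of="$FILE" bs=1M count=100
ssh "$OSS" "sync && echo 3 > /proc/sys/vm/drop_caches"

before=$(cached_kib)
lfs ladvise -a willread "$FILE"     # ask the server to prefetch the file
after=$(cached_kib)
echo "willread grew OSS cache by $((after - before)) KiB"

lfs ladvise -a dontneed "$FILE"     # ask the server to drop the file's pages
echo "cache after dontneed: $(cached_kib) KiB"
```

The failing assertion is that the willread delta (after minus before) should exceed 76800 KiB, i.e. 75% of the 100 MiB file; in the log above the delta is only 120972 - 72592 = 48380 KiB. |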
| Comments |
| Comment by nasf (Inactive) [ 07/Nov/16 ] |
|
+1 on master: |
| Comment by Bob Glossman (Inactive) [ 13/Nov/16 ] |
|
another on master: |
| Comment by John Hammond [ 15/Nov/16 ] |
|
On master: https://testing.hpdd.intel.com/test_sets/37218620-ab04-11e6-a726-5254006e85c2 Note that ladvise willread used almost 76800 KiB. |
| Comment by Steve Guminski (Inactive) [ 16/Nov/16 ] |
|
Again on master: https://testing.hpdd.intel.com/test_sets/a6324fec-ab84-11e6-a76e-5254006e85c2 |
| Comment by Jian Yu [ 16/Nov/16 ] |
|
One more failure instance on master branch: |
| Comment by Steve Guminski (Inactive) [ 16/Nov/16 ] |
|
Another failure on master: https://testing.hpdd.intel.com/test_sets/8e908c38-ac1e-11e6-9116-5254006e85c2 |
| Comment by Andreas Dilger [ 17/Nov/16 ] |
|
This is causing a lot of test failures. Li Xi, could you please take a look? |
| Comment by James Nunez (Inactive) [ 17/Nov/16 ] |
|
Li Xi - For sanity test_255b, we only measure the total amount of cache and the amount of cache used on ost1. Yet the first thing we do is stripe the file across all OSTs: 'lfs setstripe -c -1 -i 0 ...'. To see the impact of caching on ost1 only, should we limit the file to a single OST, in particular ost1, i.e. 'lfs setstripe -c 1 -i 0 ...'? When there is more than one OST, a file striped across all OSTs versus a file on a single OST could change the amount of cache the willread hint uses on ost1. Could this explain why the test passes some of the time and fails some of the time? In my testing, a singly striped file succeeds every time, and a file striped over 4 OSTs fails every time. The two layouts are sketched below. |
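For illustration, the two layouts being compared; $DIR/f255b is a placeholder path, not the test's actual file name.

```bash
# Current test layout: stripe over every OST, starting at index 0.
lfs setstripe -c -1 -i 0 "$DIR/f255b"

# Proposed layout: confine the file to a single OST (index 0, i.e. ost1),
# so the cache measured on ost1's OSS covers the whole file.
lfs setstripe -c 1 -i 0 "$DIR/f255b"
```

With full striping, only a fraction of the file's objects land on ost1, so the cache growth observed there can fall below the 76800 KiB threshold. |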
| Comment by John Hammond [ 18/Nov/16 ] |
|
James, are the 4 OSTs on the same OSS? If you have a setup that reproduces this issue handy, could you try adding a sync on the line after the dd? |
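That is, a sketch of the suggested experiment, with a placeholder path:

```bash
dd if=/dev/zero of="$DIR/f255b" bs=1M count=100
sync    # flush client-side dirty pages to the OSTs before sampling the
        # OSS cache, so in-flight writeback cannot skew the measurement
```
 |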
| Comment by Bob Glossman (Inactive) [ 18/Nov/16 ] |
|
another on master: |
| Comment by James Nunez (Inactive) [ 18/Nov/16 ] |
|
John - I have two OSSs with two OSTs each. Since there is already a sync on ost1 after the write, I'll try the sync on the client. |
| Comment by James Nunez (Inactive) [ 18/Nov/16 ] |
|
The test still fails with a sync added after the write. |
| Comment by Gerrit Updater [ 18/Nov/16 ] |
|
James Nunez (james.a.nunez@intel.com) uploaded a new patch: http://review.whamcloud.com/23867 |
| Comment by Niu Yawei (Inactive) [ 21/Nov/16 ] |
|
This looks like a test that will always fail; we should try to land the fix ASAP. |
| Comment by Gerrit Updater [ 21/Nov/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23867/ |
| Comment by Peter Jones [ 21/Nov/16 ] |
|
Landed for 2.9 |