[LU-3752] sanity-quota test_18: expect 104857600, got 42991616. Verifying file failed! Created: 13/Aug/13 Updated: 17/Jul/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1, Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Oleg Drokin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | yuc2 |
| Severity: | 3 |
| Rank (Obsolete): | 9677 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/73a518da-029e-11e3-b384-52540035b04c. The sub-test test_18 failed with the following error:
Info required for matching: sanity-quota 18 |
| Comments |
| Comment by Jian Yu [ 14/Aug/13 ] |
|
I just found that this is a regression introduced by the patch in build #28 on the Lustre b2_4 branch. Before build #28, sanity-quota test 18 always passed on b2_4. Since build #28, there have been 6 full test runs on build #29 against RHEL6 and SLES11SP2 clients, and 2 of those runs hit the sanity-quota test 18 failure.
Failed test runs:
Passed test runs:
On the master branch, the test passed on build #1582 and failed on build #1591; the builds in between were not tested. Looking over the patches in those builds and the patches in b2_4 build #28, the following ones are the intersection:
|
| Comment by Peter Jones [ 14/Aug/13 ] |
|
Oleg, can you please try to further identify the cause of this regression? Thanks, Peter |
| Comment by Oleg Drokin [ 14/Aug/13 ] |
|
A very suspicious common pattern is observed in those test results. All successful test runs have:
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
[dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d0.sanity-quota/d18/f.sanity-quota.18] [count=100] [oflag=direct]
CMD: client-20-ib sync; sync; sync
Filesystem 1K-blocks Used Available Use% Mounted on
client-20-ib@o2ib:/lustre
14222720 13440 14150272 1% /mnt/lustre
All failing runs have:
Write 100M (directio) ... running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
[dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d0.sanity-quota/d18/f.sanity-quota.18] [count=100] [oflag=direct]
CMD: client-26vm7 sync; sync; sync
Filesystem 1K-blocks Used Available Use% Mounted on
client-26vm7@tcp:/lustre
1464484 264460 1118928 20% /mnt/lustre
So my question is: why do the newer (failing) test runs have 10x less disk space? I bet this is why the test is now dying with an out-of-space error - because striping is also not used, and so with the previously present files there is just not enough space in the new scheme of things. |
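Because the 100 MB file in this test is written without striping, it has to fit on a single OST, so per-OST free space matters more than the aggregate df figure quoted above. A minimal sketch of how to inspect that from a client, assuming the mount point and test directory shown in the logs; these commands are illustrative and are not part of the test script:

# Aggregate free space across the whole filesystem (what the test log shows)
df -k /mnt/lustre

# Per-OST/MDT usage; an unstriped 100 MiB file needs ~102400 KB free on one OST
lfs df /mnt/lustre

# Default stripe layout of the test directory; stripe count 1 means each file lives on a single OST
lfs getstripe -d /mnt/lustre/d0.sanity-quota/d18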
| Comment by Jian Yu [ 16/Aug/13 ] |
|
For failed test runs: MDSSIZE=1939865 OSTSIZE=223196
For passed test runs: MDSSIZE=2097152 OSTSIZE=2097152
The real failure was:
dd: writing `/mnt/lustre/d0.sanity-quota/d18/f.sanity-quota.18': No space left on device
We need to improve the test script to check available space. |
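The numbers line up with the summary: the test expected 104857600 bytes (100 MiB) but only 42991616 bytes (41 MiB) were written before dd hit ENOSPC. A minimal sketch of the kind of pre-check suggested here, assuming the mount point and file path from the logs; the variable names and skip logic are illustrative and do not reflect the actual patch:

# Space needed by the test: 100 MiB of direct I/O, expressed in 1K blocks
need_kb=102400

# Available space on the client mount, in 1K blocks (-P keeps df output on one line)
avail_kb=$(df -kP /mnt/lustre | awk 'NR==2 {print $4}')

# Skip rather than fail with ENOSPC when the filesystem is too small
if [ "$avail_kb" -lt "$need_kb" ]; then
    echo "SKIP: only ${avail_kb}KB available, need ${need_kb}KB"
    exit 0
fi

dd if=/dev/zero of=/mnt/lustre/d0.sanity-quota/d18/f.sanity-quota.18 bs=1M count=100 oflag=direct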
| Comment by Bob Glossman (Inactive) [ 16/Aug/13 ] |
|
Space check added. This still leaves open the question of why we're running out of space in the first place. |
| Comment by Sarah Liu [ 04/Dec/13 ] |
|
Hit this issue in lustre-master build #1784: https://maloo.whamcloud.com/test_sets/67469a2c-5bbe-11e3-8d79-52540035b04c |
| Comment by Jian Yu [ 17/Jan/14 ] |
|
Lustre client build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2). The same failure occurred: |
| Comment by Sarah Liu [ 20/Mar/14 ] |
|
Hit this failure in lustre-master tag 2.5.57 (build #1945) testing with ZFS: In the previous builds 1944 and 1943, this test passed: |
| Comment by Sarah Liu [ 20/Jan/16 ] |
|
Hit this on current master build #3305, RHEL6.7, ZFS. |
| Comment by Saurabh Tandan (Inactive) [ 04/Feb/16 ] |
|
Encountered another instance for FULL - EL6.7 Server/EL6.7 Client - ZFS, master, build #3314.
Another instance on master for FULL - EL7.1 Server/EL7.1 Client - ZFS, build #3314. |
| Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ] |
|
Another instance found for Full tag 2.7.66 - EL6.7 Server/EL6.7 Client - ZFS, build #3314.
Another instance found for Full tag 2.7.66 - EL7.1 Server/EL7.1 Client - ZFS, build #3314. |