[LU-7612] recovery-small tests 110a, 110b, 110c, 110d, 110e, 110f fail with 'lfs mkdir failed' Created: 27/Dec/15 Updated: 22/Jul/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | dne | ||
| Environment: |
autotest review-dne-part-1 |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
recovery-small test_110a, test_110b, test_110c, test_110d, test_110e, and test_110f fail with 'lfs mkdir failed'.

From the test_log, we see:

    == recovery-small test 110a: create remote directory: drop client req == 18:23:45 (1451154225)
    CMD: shadow-20vm8 lctl set_param fail_loc=0x123
    fail_loc=0x123
    CMD: shadow-20vm5.shadow.whamcloud.com /usr/bin/lfs mkdir -i 1 -c2 /mnt/lustre/d110a.recovery-small/remote_dir
    error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/d110a.recovery-small/remote_dir' (3): Bad address
    error: mkdir: create stripe dir '/mnt/lustre/d110a.recovery-small/remote_dir' failed
    CMD: shadow-20vm8 lctl set_param fail_loc=0
    fail_loc=0
    recovery-small test_110a: @@@@@@ FAIL: lfs mkdir failed

There's nothing interesting in the console logs on any of the nodes.

Tests 110g, 110h, 110i and 110j fail whenever the other 110 tests fail, with a similar error message in the test logs:

    error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/d110a.recovery-small/remote_dir' (3): Bad address
    error: mkdir: create stripe dir '/mnt/lustre/d110a.recovery-small/remote_dir' failed

These tests have been failing since October 11, 2015. Failed test logs at: |
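When triaging many sessions for this signature, it can help to pull the failing subtests and the ioctl error out of each test log automatically. The following is only a minimal sketch: the helper name `summarize` and both regular expressions are assumptions modelled on the log excerpt in the description, not part of the Lustre test framework, and they may need adjusting for other log layouts.

```python
import re

# Hypothetical patterns, derived only from the log excerpt quoted in this ticket.
FAIL_RE = re.compile(r"(\S+) (test_\w+): @@@@@@ FAIL: (.*)")
IOCTL_RE = re.compile(r"error on (LL_IOC_\w+) '([^']+)' \((\d+)\): (.*)")

SAMPLE_LOG = """\
CMD: shadow-20vm8 lctl set_param fail_loc=0x123
fail_loc=0x123
error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/d110a.recovery-small/remote_dir' (3): Bad address
error: mkdir: create stripe dir '/mnt/lustre/d110a.recovery-small/remote_dir' failed
 recovery-small test_110a: @@@@@@ FAIL: lfs mkdir failed
"""


def summarize(log_text):
    """Return (failures, ioctl_errors) found in a test log.

    failures     -> list of (suite, subtest, message)
    ioctl_errors -> list of (ioctl name, path, number in parentheses, error text)
    """
    failures = []
    ioctl_errors = []
    for line in log_text.splitlines():
        fail = FAIL_RE.search(line)
        if fail:
            failures.append((fail.group(1), fail.group(2), fail.group(3).strip()))
        ioctl = IOCTL_RE.search(line)
        if ioctl:
            ioctl_errors.append(
                (ioctl.group(1), ioctl.group(2), int(ioctl.group(3)), ioctl.group(4).strip())
            )
    return failures, ioctl_errors


if __name__ == "__main__":
    fails, errs = summarize(SAMPLE_LOG)
    for suite, subtest, message in fails:
        print(f"{suite} {subtest}: {message}")
    for name, path, number, text in errs:
        print(f"{name} on {path}: ({number}) {text}")
```

Running the same function over each session's test log would show quickly whether a failure carries the LL_IOC_LMV_SETSTRIPE 'Bad address' signature or is an unrelated 'lfs mkdir failed'.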
| Comments |
| Comment by James Nunez (Inactive) [ 27/Dec/15 ] |
|
There are a few other test suites in the same sessions above that fail with the LL_IOC_LMV_SETSTRIPE 'Bad address' failures: Looking at all the patches that hit this problem, most of the failures are for patch #16785 for ticket LU-2533. |
| Comment by Di Wang [ 29/Dec/15 ] |
|
I just checked these failures; it seems they are all from patches http://review.whamcloud.com/#/c/16785/ and http://review.whamcloud.com/#/c/16969 (the problem is already fixed in the most recent patch). It is probably a problem with these patches rather than with master, so let's close this ticket? |
| Comment by James Nunez (Inactive) [ 30/Dec/15 ] |
|
Di - I agree that most of these failures are due to patches 16785 and 16969, but there are two cases where these tests failed with this error: What do you think about these failures? |
| Comment by James Nunez (Inactive) [ 30/Dec/15 ] |
|
And the most recent results from |
| Comment by Di Wang [ 04/Jan/16 ] |
|
It seems the newest run already passes the test, so I guess the updated patch already fixed the problem. And patch 17199 is based on patch 16969. |
| Comment by Mikhail Pershin [ 22/Jul/18 ] |
|
This issue still happens from time to time, about 11 times this year. The latest one: |