[LU-4419] Test failure on test suite recovery-small, subtest test_110a Created: 29/Dec/13 Updated: 22/Jul/18 Resolved: 22/Jul/18 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Mikhail Pershin |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 12134 |
| Description |
|
This issue was created by maloo for nasf <fan.yong@intel.com>. It relates to the following test suite run: http://maloo.whamcloud.com/test_sets/e0c22550-6db7-11e3-a191-52540035b04c. The slave MDT cannot allocate a super-sequence: 07:41:38:Lustre: DEBUG MARKER: == recovery-small test 110a: create remote directory: drop client req == 07:40:37 (1387986037) |
| Comments |
| Comment by nasf (Inactive) [ 29/Dec/13 ] |
|
We have hit the failure several times: https://maloo.whamcloud.com/test_sets/e0c22550-6db7-11e3-a191-52540035b04c |
| Comment by Di Wang [ 29/Dec/13 ] |
|
Hmm, I am not sure whether this is a new bug or exists only in your or Mike's patch series, since I do not see it in other people's patches. Please correct me if I am wrong. Are your patches still dependent on Mike's? |
| Comment by nasf (Inactive) [ 29/Dec/13 ] |
|
I cannot search the test results history because of Maloo issues. The first known failure instance was found in Mike's patch, but I do not think the issue is specific to that patch. It looks more like a general master bug, because his patch does not touch the MDT/FID stack. Currently, the LFSCK patches still depend on Mike's patch. |
| Comment by Di Wang [ 30/Dec/13 ] |
|
Oh, it is not about the MDT/FID stack. The failure occurs because the connection between the other MDTs/OSTs and MDT0 is somehow broken, so these targets cannot allocate a new FID sequence from MDT0. Hmm, if your patch still depends on Mike's patch, it is probably a problem in Mike's patch, since I have never seen this problem on current master or in runs of other people's patches. |
| Comment by Sarah Liu [ 30/Dec/13 ] |
|
Another instance: https://maloo.whamcloud.com/test_sets/1e071d22-706e-11e3-9fe0-52540035b04c |
| Comment by Mikhail Pershin [ 05/Jan/14 ] |
|
I have probably found the source of the problem. Let's wait for the latest patch test results: http://review.whamcloud.com/#/c/7383/ |
| Comment by Mikhail Pershin [ 07/Jan/14 ] |
|
https://maloo.whamcloud.com/test_sessions/ca3dea64-7751-11e3-943d-52540035b04c Now it works as expected. The problem was a lost chunk of code containing the OBD_FAIL_CHECK that the tests need. |
| Comment by John Hammond [ 01/Aug/14 ] |
|
Another instance: https://testing.hpdd.intel.com/test_sets/2e799af4-1942-11e4-8c4a-5254006e85c2. 22:16:47:Lustre: DEBUG MARKER: == recovery-small test 110a: create remote directory: drop client req == 22:14:17 (1406870057) |
| Comment by James Nunez (Inactive) [ 02/Mar/15 ] |
|
Another instance on 2.7.0-RC2. Logs at https://testing.hpdd.intel.com/test_sets/20cb4134-bf80-11e4-881f-5254006e85c2
Client test log:
== recovery-small test 110a: create remote directory: drop client req == 08:02:47 (1425139367)
CMD: onyx-41vm3 lctl set_param fail_loc=0x123
fail_loc=0x123
CMD: onyx-41vm6.onyx.hpdd.intel.com /usr/bin/lfs mkdir -i 1 -c2 /mnt/lustre/d110a.recovery-small/remote_dir
error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/d110a.recovery-small/remote_dir' (3): Input/output error
error: mkdir: create stripe dir '/mnt/lustre/d110a.recovery-small/remote_dir' failed
CMD: onyx-41vm3 lctl set_param fail_loc=0
The MDT log is the same as the one John posted above. |
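When comparing the instances collected in this ticket, it can help to pull the suite, subtest, and epoch timestamp out of each test banner line. The following is only a minimal sketch: the regular expression is inferred from the banner lines quoted in this ticket, not from the test framework source, and the names `MARKER_RE` and `parse_marker` are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the banner lines quoted in this ticket (an assumption,
# not taken from the test framework itself).
MARKER_RE = re.compile(
    r"== (?P<suite>[\w-]+) test (?P<test>\w+): (?P<desc>.*?) =+ "
    r"(?P<wall>\d{2}:\d{2}:\d{2}) \((?P<epoch>\d+)\)"
)


def parse_marker(line):
    """Return a dict describing a test banner line, or None if it does not match."""
    m = MARKER_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["epoch"] = int(info["epoch"])
    # The epoch value is unambiguous; the wall-clock field is node-local time.
    info["utc"] = datetime.fromtimestamp(info["epoch"], tz=timezone.utc)
    return info


# Banner line copied from the client test log in this comment.
line = ("== recovery-small test 110a: create remote directory: "
        "drop client req == 08:02:47 (1425139367)")
info = parse_marker(line)
print(info["suite"], info["test"], info["utc"].isoformat())
```

The same function also accepts the console form prefixed with `Lustre: DEBUG MARKER:`, because it searches for the banner anywhere in the line.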
| Comment by James Nunez (Inactive) [ 26/May/15 ] |
|
This test is still failing occasionally. Recent failures are: |
| Comment by James Nunez (Inactive) [ 26/May/15 ] |
|
Mike, |
| Comment by Mikhail Pershin [ 22/Jul/18 ] |
|
The related test issue was fixed, and the remaining problems are covered by LU-7612