[LU-1724] Test failure on test suite performance-sanity, subtest test_3 Created: 08/Aug/12 Updated: 13/Aug/12 Resolved: 13/Aug/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Keith Mannthey (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 6353 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. It relates to the following test suite run: https://maloo.whamcloud.com/test_sets/0596b3e0-dcd3-11e1-8744-52540035b04c. The sub-test test_3 failed with the following error:
01:24:41:Lustre: DEBUG MARKER: performance-sanity test_3: @@@@@@ FAIL: test_3 failed with 10 |
| Comments |
| Comment by Sarah Liu [ 08/Aug/12 ] |
|
This error may be caused by the previous failure of the mds-survey test. |
| Comment by Keith Mannthey (Inactive) [ 10/Aug/12 ] |
|
What happened to client-27vm3 in this test run? It was the MDS and it panicked or ???? |
| Comment by Sarah Liu [ 10/Aug/12 ] |
|
There was a previous failure of mds-survey, which may have left the MDS in an abnormal state: https://maloo.whamcloud.com/test_sets/7fa0e0bc-dcd2-11e1-8744-52540035b04c |
| Comment by Keith Mannthey (Inactive) [ 10/Aug/12 ] |
|
From the MDS of the 2nd run:
00:54:47:Lustre: DEBUG MARKER: == mds-survey test 2: Metadata survey with stripe_count = 1 == 00:54:45 (1343894085)
00:54:50:Lustre: DEBUG MARKER: lctl dl
00:54:52:LustreError: 17365:0:(echo_client.c:1607:echo_md_lookup()) lookup tests: rc = -2
00:54:52:LustreError: 17365:0:(echo_client.c:1607:echo_md_lookup()) Skipped 2 previous similar messages
00:54:52:LustreError: 17365:0:(echo_client.c:1806:echo_md_destroy_internal()) Can't find child tests: rc = -2
00:54:52:LustreError: 17365:0:(echo_client.c:1806:echo_md_destroy_internal()) Skipped 2 previous similar messages
00:54:54:LustreError: 17387:0:(echo_client.c:1607:echo_md_lookup()) lookup tests1: rc = -2
00:54:54:LustreError: 17387:0:(echo_client.c:1806:echo_md_destroy_internal()) Can't find child tests1: rc = -2
01:04:40:lctl invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
The OOM killer was out running around. Once the OOM killer is running, system behaviour becomes non-deterministic, as it may choose different processes to kill. Why is the MDS running out of memory? |
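For triaging similar runs, the point where the OOM killer first fired can be picked out of a console log mechanically. The following is only a minimal sketch, assuming each console line carries the `HH:MM:SS:` prefix seen in the excerpt above. The function name `find_oom_events` and the `sample` list are hypothetical illustrations, not part of Lustre, Maloo, or the test framework.

```python
import re

# Assumed line shape, taken from the excerpt in this comment:
#   01:04:40:lctl invoked oom-killer: gfp_mask=0x201da, order=0, ...
OOM_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(\S+) invoked oom-killer: (.*)$")


def find_oom_events(lines):
    """Return (timestamp, process, details) for each oom-killer invocation."""
    events = []
    for line in lines:
        match = OOM_RE.match(line.strip())
        if match:
            events.append(match.groups())
    return events


# Hypothetical input: two lines copied from the console log above.
sample = [
    "00:54:50:Lustre: DEBUG MARKER: lctl dl",
    "01:04:40:lctl invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0",
]

for timestamp, process, details in find_oom_events(sample):
    print(timestamp, process, details)
```

Any test failure logged after the first reported timestamp is suspect, since the processes killed from that point on may differ between runs.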
| Comment by Peter Jones [ 10/Aug/12 ] |
|
I think that the theory is due to |
| Comment by Keith Mannthey (Inactive) [ 10/Aug/12 ] |
|
After reviewing |
| Comment by Peter Jones [ 13/Aug/12 ] |
|
Closing as a duplicate of |