[LU-4885] sanity-lfsck test 18d failed: (3) MDS1 is not the expected 'completed' Created: 11/Apr/14 Updated: 11/Apr/14 Resolved: 11/Apr/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | lfsck | ||
| Environment: | Toro Autotest |
| Severity: | 3 |
| Rank (Obsolete): | 13520 |
| Description |
|
sanity-lfsck test 18d has failed a few times during autotesting. One such failure is at https://maloo.whamcloud.com/test_sets/01266ea2-c171-11e3-9451-52540035b04c . Expected LFSCK to end with status "completed", but got status "failed".

From the client test_log:

Update not seen after 32s: wanted 'completed' got 'failed'
sanity-lfsck test_18d: @@@@@@ FAIL: (3) MDS1 is not the expected 'completed'

From the client console, we see:

02:01:39:LustreError: 11-0: lustre-OST0000-osc-ffff88007a5d8c00: Communicating with 10.10.4.191@tcp, operation ldlm_enqueue failed with -12.

The console message from MDS1 has:

02:00:14:LustreError: 17569:0:(lfsck_layout.c:1422:lfsck_layout_master_notify_others()) lustre-MDT0000-osd: fail to notify OST 7 for layout start: rc = -95
02:00:14:LustreError: 17569:0:(lfsck_layout.c:1551:lfsck_layout_master_notify_others()) lustre-MDT0000-osd: fail to notify MDT 7 for layout phase1 done: rc = -95
02:00:14:LustreError: 17569:0:(lfsck_layout.c:1343:lfsck_layout_master_query_others()) lustre-MDT0000-osd: fail to query MDT 7 for layout: rc = -95
02:00:14:LustreError: 17569:0:(lfsck_layout.c:1504:lfsck_layout_master_notify_others()) lustre-MDT0000-osd: fail to notify MDT 7 for layout stop/phase2: rc = -95

This failure looks a lot like |
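
For reference when reading the return codes in the console messages above, here is a minimal sketch that pulls the numeric code out of a log line and maps it to its Linux errno name. The helper `decode_rc` and the lookup table are hypothetical, written only for this illustration; the table covers just the two codes seen in this ticket (12 is ENOMEM and 95 is EOPNOTSUPP on Linux).

```python
import re

# Linux errno values for the two codes seen in this ticket (illustrative subset).
LINUX_ERRNO = {
    12: "ENOMEM (out of memory)",
    95: "EOPNOTSUPP (operation not supported)",
}

# Matches the two shapes that appear in the logs: "rc = -95" and "failed with -12".
RC_PATTERN = re.compile(r"(?:rc = |failed with )-(\d+)")


def decode_rc(line):
    """Return a readable errno name for the negative code in a log line.

    Hypothetical helper; returns None when the line carries no return code.
    """
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return LINUX_ERRNO.get(code, f"errno {code}")


if __name__ == "__main__":
    print(decode_rc("lustre-MDT0000-osd: fail to notify OST 7 for layout start: rc = -95"))
    print(decode_rc("operation ldlm_enqueue failed with -12."))
```

Read this way, the MDS1 messages report "operation not supported" from the notify and query calls, while the client enqueue failure reports out of memory.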
| Comments |
| Comment by James Nunez (Inactive) [ 11/Apr/14 ] |
|
Note that there are other test 18d failures with the same error message, but where the status after 32 seconds is "scanning-phase2" rather than "failed": https://maloo.whamcloud.com/test_sets/d1a9ae82-ab04-11e3-bd80-52540035b04c and https://maloo.whamcloud.com/test_sets/e4b24d3e-b7f9-11e3-987f-52540035b04c . For these cases, we may just need to increase the wait time for LFSCK to complete. |
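
To illustrate the difference between the two failure modes, here is a rough sketch of a status poll with a configurable timeout. It is not the test framework's actual wait logic: `wait_for_status`, its parameters, and the decision to treat "failed" as terminal are all assumptions made for this example. A longer timeout only helps when the status is still "scanning-phase2"; once the status is "failed", waiting longer cannot change the outcome.

```python
import time


def wait_for_status(get_status, wanted="completed", timeout=32.0, interval=1.0,
                    terminal=("failed",), sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns `wanted`, a terminal status, or time runs out.

    Hypothetical helper. `get_status` is any callable returning the current
    LFSCK status string. Returns (ok, last_status).
    """
    deadline = clock() + timeout
    last = get_status()
    while last != wanted:
        if last in terminal:
            # Assumed terminal state: no point in waiting any longer.
            return False, last
        if clock() >= deadline:
            # Still in progress (e.g. "scanning-phase2"); a larger timeout might help.
            return False, last
        sleep(interval)
        last = get_status()
    return True, last


if __name__ == "__main__":
    # Simulated status sequence standing in for a real LFSCK status query.
    seq = iter(["scanning-phase1", "scanning-phase2", "completed"])
    print(wait_for_status(lambda: next(seq), timeout=5.0, interval=0.01))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.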
| Comment by nasf (Inactive) [ 11/Apr/14 ] |
|
It is another failure instance of |