[LU-3143] recovery-small test_29a: FAIL: import is not in FULL state Created: 10/Apr/13 Updated: 26/Jun/13 Resolved: 25/Apr/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Jian Yu | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB | ||
| Environment: |
Lustre Branch: master |
||
| Severity: | 3 |
| Rank (Obsolete): | 7629 |
| Description |
|
The recovery-small test 29a failed as follows: CMD: client-32vm3 lctl get_param -n at_min
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
error: list_param: /proc/{fs,sys}/{lnet,lustre}/osc/lustre-OST0000-osc-MDT0000/ost_server_uuid: Found no match
can't get osc.lustre-OST0000-osc-MDT0000.ost_server_uuid by list_param in 40 secs
Go with osc.lustre-OST0000-osc-MDT0000.ost_server_uuid directly
CMD: client-32vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0000.ost_server_uuid 40
client-32vm3: rpc : @@@@@@ FAIL: can't put import for osc.lustre-OST0000-osc-MDT0000.ost_server_uuid into FULL state after 40 sec, have
Maloo report: https://maloo.whamcloud.com/test_sets/740b8974-a1bb-11e2-bdac-52540035b04c |
| Comments |
| Comment by Oleg Drokin [ 10/Apr/13 ] |
|
Can we get MDS conman output fo this test? it almost looks like MDS failed to start after failover, but there are not mds locks in the report at all so hard to tell the real cause. |
| Comment by Peter Jones [ 10/Apr/13 ] |
|
Hongchao Could you please look into this one? Thanks Peter |
| Comment by Jian Yu [ 11/Apr/13 ] |
|
It seems "$(facet_host $facet)" in wait_osc_import_state() should be changed to "$(facet_active_host $facet)". |
| Comment by Hongchao Zhang [ 12/Apr/13 ] |
|
the patch is tracked at http://review.whamcloud.com/#change,6036 |
| Comment by Peter Jones [ 17/Apr/13 ] |
|
Landed for 2.4 |
| Comment by Jian Yu [ 19/Apr/13 ] |
The Maloo report for patch set 2 showed that the recovery-small test 29a still failed. |
| Comment by Hongchao Zhang [ 19/Apr/13 ] |
|
there is still one more problem in the test script, the facet name of MDT is different when calling wait_osc_import_state test_29a() { # bug 22273 - error adding new clients
#define OBD_FAIL_TGT_CLIENT_ADD 0x711
do_facet $SINGLEMDS "lctl set_param fail_loc=0x80000711"
# fail abort so client will be new again
fail_abort $SINGLEMDS
client_up || error "reconnect failed"
wait_osc_import_state mds ost FULL <--- here, should use $SINGLEMDS (mds1)
return 0
}
there are some other places (mainly in conf-sanity.sh) with the same problem needed to fix, otherwise it could fail in failover mode with FAILURE_MODE=HARD the patch is tracked at http://review.whamcloud.com/#change,6100 |
| Comment by Peter Jones [ 25/Apr/13 ] |
|
Landed for 2.4 |
| Comment by Bruno Faccini (Inactive) [ 26/Jun/13 ] |
|
Just got a new occurrence in https://maloo.whamcloud.com/test_sets/e79d863e-de3e-11e2-b04c-52540035b04c. |