[LU-6725] replay-single test_0a: FAIL: import is not in FULL state Created: 15/Jun/15  Updated: 07/Sep/16  Resolved: 13/Oct/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Jian Yu Assignee: Bob Glossman (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Lustre Build: https://build.hpdd.intel.com/job/lustre-master/3071
Distro/Arch: RHEL7.1/x86_64


Issue Links:
Duplicate
duplicates LU-6992 recovery-random-scale test_fail_clien... Resolved
Related
is related to LU-4241 Test failure on test recovery-small t... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

replay-single test 0a failed as follows:

Started lustre:MDT0000
CMD: onyx-42vm5,onyx-42vm6.onyx.hpdd.intel.com PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state_mount FULL mdc.lustre:MDT0000-mdc-*.mds_server_uuid 
onyx-42vm5: CMD: onyx-42vm5.onyx.hpdd.intel.com lctl get_param -n at_max
onyx-42vm6: CMD: onyx-42vm6.onyx.hpdd.intel.com lctl get_param -n at_max
onyx-42vm6:  rpc : @@@@@@ FAIL: can't put import for mdc.lustre:MDT0000-mdc-*.mds_server_uuid into FULL state after 1475 sec, have  
onyx-42vm5:  rpc : @@@@@@ FAIL: can't put import for mdc.lustre:MDT0000-mdc-*.mds_server_uuid into FULL state after 1475 sec, have  

Maloo report: https://testing.hpdd.intel.com/test_sets/d2bcff78-135d-11e5-b4b0-5254006e85c2
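For reference, a minimal sketch of how the same import-state check can be run by hand on a client node (not taken from the test logs; the fsname "lustre" and the mdc glob are illustrative assumptions):

# Show the import record, including the "state:" line the test waits on:
lctl get_param 'mdc.lustre-MDT0000-mdc-*.import' | grep -E 'target|state'

# Poll until the import reaches FULL, roughly what wait_import_state_mount does:
while ! lctl get_param -n 'mdc.lustre-MDT0000-mdc-*.import' 2>/dev/null |
        grep -q 'state: FULL'; do
    sleep 1
done

Note that in the failure above the "have" field is empty, i.e. the proc glob the test built (mdc.lustre:MDT0000-mdc-*) matched nothing at all.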



 Comments   
Comment by Oleg Drokin [ 16/Jun/15 ]

It looks like we see on the MDS that the recovery is done, so the clients did reconnect.
Could it be just a proc file update glitch of some sort?

There's no reconnect message from the clients to the MDS, but there is one for the MGS, so perhaps it's the message reduction logic that ate them?
They should be visible in the debug logs then.
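A hedged sketch of how one might look for those reconnects in the client debug log (the dump path is illustrative):

# Make sure RPC tracing is in the debug mask before reproducing:
lctl set_param debug=+rpctrace

# After the recovery window, dump the kernel debug buffer and search it:
lctl dk /tmp/lustre-debug.txt
grep -i -E 'connect|Connection restored' /tmp/lustre-debug.txt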

Comment by Saurabh Tandan (Inactive) [ 13/Oct/15 ]

Another instance on 2.7.61 tag:
https://testing.hpdd.intel.com/test_sets/9f06c600-6dee-11e5-b960-5254006e85c2

Comment by Jian Yu [ 13/Oct/15 ]

The label on the MDT device was lustre:MDT0000, which had not been changed to lustre-MDT0000. So, this is a duplicate of LU-6992.
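For an ldiskfs target, one hedged way to confirm which form the label is in on the MDS (the device path is an illustrative assumption): a newly formatted target carries the temporary "fsname:MDT0000" label, which is rewritten to "fsname-MDT0000" once it registers with the MGS on first mount.

e2label /dev/mapper/mds1_flakey                          # "lustre:MDT0000" before registration, "lustre-MDT0000" after
tunefs.lustre --dryrun /dev/mapper/mds1_flakey | grep -i target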
