[LU-7807] Failover - replay-dual test_15a: import is not in FULL state
Created: 24/Feb/16
Updated: 11/Sep/20
Resolved: 11/Sep/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

Hard Failover: EL6.7 Server/SLES11 SP4 Clients
Server: b2_8, RHEL 6.7, build# 6
Client: SLES 11 SP4


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/f12c8e00-d65e-11e5-afe8-5254006e85c2.

The sub-test test_15a failed with the following error:

import is not in FULL state

test log:

Failover mds1 to onyx-32vm3
23:57:57 (1455782277) waiting for onyx-32vm3 network 900 secs ...
23:57:57 (1455782277) network interface is UP
CMD: onyx-32vm3 hostname
mount facets: mds1
CMD: onyx-32vm3 test -b /dev/lvm-Role_MDS/P1
Starting mds1:   /dev/lvm-Role_MDS/P1 /mnt/mds1
CMD: onyx-32vm3 mkdir -p /mnt/mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
onyx-32vm3: mount.lustre: increased /sys/block/dm-0/queue/max_sectors_kb from 1024 to 16384
onyx-32vm3: mount.lustre: increased /sys/block/sda/queue/max_sectors_kb from 1024 to 16384
CMD: onyx-32vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 
CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P1 2>/dev/null
Started lustre-MDT0000
CMD: onyx-32vm1,onyx-32vm5,onyx-32vm6 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid 
onyx-32vm5: CMD: onyx-32vm5 lctl get_param -n at_max
onyx-32vm6: CMD: onyx-32vm6 lctl get_param -n at_max
onyx-32vm1: CMD: onyx-32vm1 lctl get_param -n at_max
onyx-32vm1: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 106 sec
onyx-32vm6: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 106 sec
onyx-32vm5:  rpc : @@@@@@ FAIL: can't put import for mdc.lustre-MDT0000-mdc-*.mds_server_uuid into FULL state after 1475 sec, have REPLAY
onyx-32vm5: FULL 
onyx-32vm5:   Trace dump:
onyx-32vm5:   = /usr/lib64/lustre/tests/test-framework.sh:4670:error_noexit()
onyx-32vm5:   = /usr/lib64/lustre/tests/test-framework.sh:4704:error()
onyx-32vm5:   = /usr/lib64/lustre/tests/test-framework.sh:5766:_wait_import_state()
onyx-32vm5:   = /usr/lib64/lustre/tests/test-framework.sh:5788:wait_import_state()
onyx-32vm5:   = /usr/lib64/lustre/tests/test-framework.sh:5797:wait_import_state_mount()
onyx-32vm5:   = rpc.sh:20:main()
onyx-32vm5: CMD: onyx-32vm5,onyx-32vm6,onyx-32vm7,onyx-32vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:./../utils:/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh check_logdir /tmp/test_logs/1455782336 
onyx-32vm5: CMD: onyx-32vm5 uname -n
onyx-32vm5: CMD: onyx-32vm6 uname -n
onyx-32vm5: Dumping lctl log to /tmp/test_logs/1455782336/rpc..*.1455783823.log
onyx-32vm5: CMD: onyx-32vm5,onyx-32vm6,onyx-32vm7,onyx-32vm8 /usr/sbin/lctl dk > /tmp/test_logs/1455782336/rpc..debug_log.\$(hostname -s).1455783823.log;
onyx-32vm5:          dmesg > /tmp/test_logs/1455782336/rpc..dmesg.\$(hostname -s).1455783823.log
onyx-32vm5: onyx-32vm7: invalid parameter 'dump_kernel'
onyx-32vm5: onyx-32vm7: open(dump_kernel) failed: No such file or directory
onyx-32vm5: CMD: onyx-32vm5,onyx-32vm6,onyx-32vm7,onyx-32vm8 rsync -az /tmp/test_logs/1455782336/rpc..*.1455783823.log onyx-32vm5:/tmp/test_logs/1455782336
onyx-32vm5: onyx-32vm5: Warning: Permanently added 'onyx-32vm5,10.2.4.115' (ECDSA) to the list of known hosts.
onyx-32vm5: onyx-32vm6: Warning: Permanently added 'onyx-32vm5,10.2.4.115' (ECDSA) to the list of known hosts.
onyx-32vm5: CMD: onyx-32vm5,onyx-32vm6,onyx-32vm7,onyx-32vm8 lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null || true
onyx-32vm5: Resetting fail_loc on all nodes...done.
 replay-dual test_15a: @@@@@@ FAIL: import is not in FULL state 
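
The per-client outcome in the log above (onyx-32vm1 and onyx-32vm6 reach FULL after 106 sec, onyx-32vm5 stays in REPLAY for 1475 sec) can be pulled out of a saved test log with a small script. The following is a minimal sketch, not part of the Lustre test framework: the function name summarize_import_states is hypothetical, and the two regular expressions only assume the message wording seen in this particular log, which may differ in other test-framework.sh versions.

```python
import re

# Both patterns mirror the message shapes seen in this log only (assumption);
# other test-framework.sh versions may word these messages differently.
OK_RE = re.compile(
    r"^(?P<host>[\w.-]+):\s+(?P<param>\S+) in (?P<state>\w+) "
    r"state after (?P<secs>\d+) sec"
)
FAIL_RE = re.compile(
    r"^(?P<host>[\w.-]+):.*can't put import for (?P<param>\S+) "
    r"into (?P<want>\w+) state after (?P<secs>\d+) sec, have (?P<state>\w+)"
)


def summarize_import_states(lines):
    """Return {host: (reached_target, last_state, seconds)} for each client."""
    results = {}
    for line in lines:
        m = OK_RE.match(line)
        if m:
            results[m["host"]] = (True, m["state"], int(m["secs"]))
            continue
        m = FAIL_RE.match(line)
        if m:
            results[m["host"]] = (False, m["state"], int(m["secs"]))
    return results


if __name__ == "__main__":
    # Two lines copied from the log in this issue.
    sample = [
        "onyx-32vm1: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 106 sec",
        "onyx-32vm5:  rpc : @@@@@@ FAIL: can't put import for "
        "mdc.lustre-MDT0000-mdc-*.mds_server_uuid into FULL state "
        "after 1475 sec, have REPLAY",
    ]
    for host, (ok, state, secs) in sorted(summarize_import_states(sample).items()):
        print(host, "OK" if ok else "STUCK", state, secs)
    # prints:
    # onyx-32vm1 OK FULL 106
    # onyx-32vm5 STUCK REPLAY 1475
```

Run against a full log, a summary like this makes it easy to see that only one of the three clients failed to leave REPLAY, which is the detail worth comparing with other failover reports.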

This failure may be related to LU-6935.

