[LU-8473] conf-sanity test_41a with separate MGS stuck on starting client and timed out
Created: 02/Aug/16  Updated: 10/Oct/16  Resolved: 15/Aug/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.9.0

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: Jian Yu
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Duplicate
Related
is related to LU-8688 All Lustre test suites should run/PAS... Open
Severity: 3

 Description   

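While running conf-sanity test 41a with separate MGT and MDT0000 devices, the second MDS mount (-o nomgs,force) failed with error 114 ("Operation already in progress"), and the test then hung while mounting the client until it timed out: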

== conf-sanity test 41a: mount mds with --nosvc and --nomgs == 23:48:16 (1470181696)
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
CMD: eagle-38vm3 test -b /dev/vda6
start mds service on eagle-38vm3
CMD: eagle-38vm3 mkdir -p /mnt/lustre-mds1
Loading modules from /usr/lib64/lustre
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=-1
subsystem_debug=all -lnet -lnd -pinger
CMD: eagle-38vm3 test -b /dev/vda6
CMD: eagle-38vm3 e2label /dev/vda6
Starting mds1: -o nosvc -n  /dev/vda6 /mnt/lustre-mds1
CMD: eagle-38vm3 mkdir -p /mnt/lustre-mds1; mount -t lustre -o nosvc -n  		                   /dev/vda6 /mnt/lustre-mds1
nomtab: 1
CMD: eagle-38vm3 /usr/sbin/lctl get_param -n health_check
CMD: eagle-38vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin::/sbin:/bin:/usr/sbin: NAME=ncli sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 24 
Start /dev/vda6 without service
CMD: eagle-38vm3 e2label /dev/vda6 2>/dev/null
Started lustre-MDT0000
CMD: eagle-38vm4 mkdir -p /mnt/lustre-ost1
CMD: eagle-38vm4 test -b /dev/vda5
CMD: eagle-38vm4 e2label /dev/vda5
Starting ost1:   /dev/vda5 /mnt/lustre-ost1
CMD: eagle-38vm4 mkdir -p /mnt/lustre-ost1; mount -t lustre   		                   /dev/vda5 /mnt/lustre-ost1
CMD: eagle-38vm4 /usr/sbin/lctl get_param -n health_check
CMD: eagle-38vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin::/sbin:/bin:/usr/sbin: NAME=ncli sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 24 
CMD: eagle-38vm4 e2label /dev/vda5 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: eagle-38vm4 e2label /dev/vda5 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: eagle-38vm4 e2label /dev/vda5 2>/dev/null
Started lustre-OST0000
start mds service on eagle-38vm3
CMD: eagle-38vm3 mkdir -p /mnt/lustre-mds1
CMD: eagle-38vm3 test -b /dev/vda6
CMD: eagle-38vm3 e2label /dev/vda6
Starting mds1: -o nomgs,force  /dev/vda6 /mnt/lustre-mds1
CMD: eagle-38vm3 mkdir -p /mnt/lustre-mds1; mount -t lustre -o nomgs,force  		                   /dev/vda6 /mnt/lustre-mds1
eagle-38vm3: mount.lustre: mount /dev/vda6 at /mnt/lustre-mds1 failed: Operation already in progress
eagle-38vm3: The target service is already running. (/dev/vda6)
force: 1
Start of /dev/vda6 on mds1 failed 114
mount lustre on /mnt/lustre.....
Starting client: eagle-38vm1:  -o user_xattr,flock eagle-38vm3@tcp:/lustre /mnt/lustre
CMD: eagle-38vm1 mkdir -p /mnt/lustre
CMD: eagle-38vm1 mount -t lustre -o user_xattr,flock eagle-38vm3@tcp:/lustre /mnt/lustre

Maloo report: https://testing.hpdd.intel.com/test_sets/25ba3a02-590c-11e6-b2e2-5254006e85c2
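
The failure signature in the log above is the line "Start of /dev/vda6 on mds1 failed 114", where 114 is EALREADY on Linux and matches the "Operation already in progress" message from mount.lustre. A minimal sketch for spotting that signature in a saved test log follows. The helper name find_ealready_mounts is hypothetical and is not part of the Lustre test framework; the sample lines are copied from the log above.

```shell
#!/bin/bash
# Sketch only: find_ealready_mounts is a hypothetical helper,
# not a function from Lustre's test-framework.sh.

# Read log text on stdin and print "<device> <facet>" for every start
# attempt that failed with 114 (EALREADY on Linux), matching lines like:
#   Start of /dev/vda6 on mds1 failed 114
find_ealready_mounts() {
    sed -n 's/^Start of \([^ ]*\) on \([^ ]*\) failed 114$/\1 \2/p'
}

# Sample lines taken from the test output in this report.
sample_log='Starting mds1: -o nomgs,force  /dev/vda6 /mnt/lustre-mds1
eagle-38vm3: The target service is already running. (/dev/vda6)
Start of /dev/vda6 on mds1 failed 114'

printf '%s\n' "$sample_log" | find_ealready_mounts
```

Run against a full log file, the same function could be fed with a redirect such as find_ealready_mounts < conf-sanity.log, where the file name is again only an example.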



 Comments   
Comment by Gerrit Updater [ 03/Aug/16 ]

Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/21651
Subject: LU-8473 tests: skip conf-sanity test 41a with separate MGT and MDT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b902e6977fd81a2b961e6cec36f45bef6a10990a

Comment by Gerrit Updater [ 15/Aug/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21651/
Subject: LU-8473 tests: skip conf-sanity test 41a with separate MGT and MDT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 14b23d67a71cf2aa0b571553171a0894c73f11e6
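
The patch subject says the test is skipped when the MGT and MDT are separate. The sketch below only illustrates the shape such a guard might take; it is not the code from the merged patch. The function names uses_combined_mgs_mdt and run_or_skip_41a, the device comparison, and the path /dev/mgt_example are all assumptions made for illustration (only /dev/vda6 appears in the log).

```shell
#!/bin/bash
# Illustrative sketch of a skip guard. All function names and the
# /dev/mgt_example path are hypothetical, not taken from the patch.

# Succeed when the MGT and MDT0000 share one block device.
uses_combined_mgs_mdt() {
    local mgs_dev=$1 mdt_dev=$2
    [ "$mgs_dev" = "$mdt_dev" ]
}

# Skip the nosvc/nomgs mount sequence unless the devices are combined.
run_or_skip_41a() {
    local mgs_dev=$1 mdt_dev=$2
    if ! uses_combined_mgs_mdt "$mgs_dev" "$mdt_dev"; then
        echo "SKIP: test 41a needs combined MGT and MDT device"
        return 0
    fi
    echo "RUN: test 41a"
}

run_or_skip_41a /dev/vda6 /dev/vda6          # combined: test would run
run_or_skip_41a /dev/mgt_example /dev/vda6   # separate: test is skipped
```

Returning 0 from the skip path keeps a skipped test from being counted as a failure, which matches the intent of skipping rather than failing in this configuration.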

Comment by Peter Jones [ 15/Aug/16 ]

Landed for 2.9
