Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.11.0
-
None
-
3
Description
After LU-684 (https://review.whamcloud.com/#/c/7200/), which added the dm-flakey layer to test-framework, conf-sanity no longer passes with real devices.
Example configuration in local.sh:
MDSCOUNT=1
OSTCOUNT=2
mds1_HOST=fre0101
MDSDEV1=/dev/vdb
mds_HOST=fre0101
MDSDEV=/dev/vdb
ost1_HOST=fre0102
OSTDEV1=/dev/vdb
ost2_HOST=fre0102
OSTDEV2=/dev/vdc
.....
Errors:
CMD: fre0205,fre0206,fre0208 PATH=/usr/lib64/lustre/tests/../tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/../tests/mpi:/usr/lib64/lustre/tests/../tests/racer:/usr/lib64/lustre/tests/../../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests/../tests:/usr/lib64/lustre/tests/../utils/gss:/root//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/usr/lib64/mpich/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin::/sbin:/bin:/usr/sbin: NAME=ncli sh rpc.sh set_hostid
fre0208: fre0208: executing set_hostid
fre0205: fre0205: executing set_hostid
fre0206: fre0206: executing set_hostid
CMD: fre0205 [ -e "/dev/vdb" ]
CMD: fre0205 grep -c /mnt/lustre-mgs' ' /proc/mounts || true
CMD: fre0205 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' || true
CMD: fre0205 e2label /dev/vdb
CMD: fre0205 mkfs.lustre --mgs --param=sys.timeout=20 --backfstype=ldiskfs --device-size=0 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/vdb
fre0205: fre0205: mkfs.lustre FATAL: Unable to build fs /dev/vdb (256)
fre0205: fre0205: mkfs.lustre FATAL: mkfs failed 256
A quick look shows that reformat in conf-sanity works with the following change to test-framework (t-f):
formatall() {
CLEANUP_DM_DEV=true stopall -f
Since conf-sanity contains many other stopall calls, they probably need the same fix. For example, test 17 still fails:
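The proposed change can be illustrated with a minimal, self-contained sketch. The `stopall` below is a hypothetical stub, not the real test-framework function; it only assumes that the real one consults `CLEANUP_DM_DEV` to decide whether to remove the dm-flakey mappings. In bash, a prefix assignment on a function call applies for the duration of that call, so other `stopall` callers keep the default behaviour unless they are changed too.

```shell
#!/bin/bash
# Hypothetical stub standing in for test-framework's stopall.
# Assumption: the real function checks CLEANUP_DM_DEV before tearing
# down the dm-flakey mappings; here we only report the decision.
stopall() {
	if [ "${CLEANUP_DM_DEV:-false}" = "true" ]; then
		echo "stopall $*: removing dm-flakey mappings"
	else
		echo "stopall $*: keeping dm-flakey mappings"
	fi
}

# Sketch of the change from the report: formatall requests dm cleanup
# for its own stopall call, so the format step runs without stale mappings.
formatall() {
	CLEANUP_DM_DEV=true stopall -f
}
```

Calling `stopall -f` directly takes the "keeping" branch, while `formatall` takes the "removing" branch, which is why every other `stopall` call site in conf-sanity may need the same prefix.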
== conf-sanity test 17: Verify failed mds_postsetup won't fail assertion (2936) (should return errs) ====================================================================================================== 15:36:46 (1522942606)
start mds service on fre0113
Starting mds1: -o rw,user_xattr /dev/mapper/mds1_flakey /mnt/lustre-mds1
fre0113: fre0113: executing set_default_debug -1 all 4
pdsh@fre0115: fre0113: ssh exited with exit code 1
pdsh@fre0115: fre0113: ssh exited with exit code 1
Started lustre-MDT0000
start mds service on fre0113
Starting mds2: -o rw,user_xattr /dev/mapper/mds2_flakey /mnt/lustre-mds2
fre0113: fre0113: executing set_default_debug -1 all 4
pdsh@fre0115: fre0113: ssh exited with exit code 1
pdsh@fre0115: fre0113: ssh exited with exit code 1
Started lustre-MDT0001
start ost1 service on fre0114
Starting ost1: -o user_xattr /dev/mapper/ost1_flakey /mnt/lustre-ost1
fre0114: fre0114: executing set_default_debug -1 all 4
pdsh@fre0115: fre0114: ssh exited with exit code 1
pdsh@fre0115: fre0114: ssh exited with exit code 1
Started lustre-OST0000
mount lustre on /mnt/lustre.....
Starting client: fre0115: -o user_xattr,flock fre0113@tcp:/lustre /mnt/lustre
setup single mount lustre success
umount lustre on /mnt/lustre.....
Stopping client fre0115 /mnt/lustre (opts:)
stop ost1 service on fre0114
Stopping /mnt/lustre-ost1 (opts:-f) on fre0114
stop mds service on fre0113
Stopping /mnt/lustre-mds1 (opts:-f) on fre0113
stop mds service on fre0113
Stopping /mnt/lustre-mds2 (opts:-f) on fre0113
modules unloaded.
Remove mds config log
Stopping /mnt/lustre-mgs (opts:) on fre0113
fre0113: debugfs 1.42.13.x6 (01-Mar-2018)
start mgs service on fre0113
Loading modules from /usr/lib64/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
../lnet/lnet/lnet options: 'accept=all'
../lnet/klnds/socklnd/ksocklnd options: 'sock_timeout=10'
gss/krb5 is not supported
Starting mgs: /dev/mapper/mgs_flakey /mnt/lustre-mgs
fre0113: fre0113: executing set_default_debug -1 all 4
pdsh@fre0115: fre0113: ssh exited with exit code 1
pdsh@fre0115: fre0113: ssh exited with exit code 1
Started MGS
start ost1 service on fre0114
Starting ost1: -o user_xattr /dev/mapper/ost1_flakey /mnt/lustre-ost1
fre0114: fre0114: executing set_default_debug -1 all 4
pdsh@fre0115: fre0114: ssh exited with exit code 1
pdsh@fre0115: fre0114: ssh exited with exit code 1
Started lustre-OST0000
start mds service on fre0113
Starting mds1: -o rw,user_xattr /dev/mapper/mds1_flakey /mnt/lustre-mds1
fre0113: mount.lustre: mount /dev/mapper/mds1_flakey at /mnt/lustre-mds1 failed: No such file or directory
fre0113: Is the MGS specification correct?
fre0113: Is the filesystem name correct?
fre0113: If upgrading, is the copied client log valid? (see upgrade docs)
pdsh@fre0115: fre0113: ssh exited with exit code 2
Start of /dev/mapper/mds1_flakey on mds1 failed 2
Stopping clients: fre0115,fre0116 /mnt/lustre (opts:-f)
Stopping clients: fre0115,fre0116 /mnt/lustre2 (opts:-f)
Stopping /mnt/lustre-ost1 (opts:-f) on fre0114
pdsh@fre0115: fre0114: ssh exited with exit code 1
Stopping /mnt/lustre-mgs (opts:) on fre0113
fre0114: fre0114: executing set_hostid
fre0116: fre0116: executing set_hostid
fre0113: fre0113: executing set_hostid
Loading modules from /usr/lib64/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
gss/krb5 is not supported
Formatting mgs, mds, osts
Format mgs: /dev/mapper/mgs_flakey
pdsh@fre0115: fre0113: ssh exited with exit code 1
conf-sanity test_17: @@@@@@ FAIL: mgs: device '/dev/mapper/mgs_flakey' does not exist
Trace dump:
= /usr/lib64/lustre/tests/../tests/test-framework.sh:5734:error()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:4314:__touch_device()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:4331:format_mgs()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:4384:formatall()
= /usr/lib64/lustre/tests/conf-sanity.sh:109:reformat()
= /usr/lib64/lustre/tests/conf-sanity.sh:91:reformat_and_config()
= /usr/lib64/lustre/tests/conf-sanity.sh:605:test_17()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:6010:run_one()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:6049:run_one_logged()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:5848:run_test()
= /usr/lib64/lustre/tests/conf-sanity.sh:607:main()
Dumping lctl log to /tmp/test_logs/1522942566/conf-sanity.test_17.*.1522942656.log
fre0114: Warning: Permanently added 'fre0115,192.168.101.15' (ECDSA) to the list of known hosts.
fre0116: Warning: Permanently added 'fre0115,192.168.101.15' (ECDSA) to the list of known hosts.
fre0113: Warning: Permanently added 'fre0115,192.168.101.15' (ECDSA) to the list of known hosts.
Resetting fail_loc on all nodes...done.
FAIL 17 (51s)
Issue Links
- is related to LU-684 replace dev_rdonly kernel patch with dm-flakey (Resolved)