Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Version: Lustre 2.4.0
- Environment: FSTYPE=zfs, FAILURE_MODE=HARD
- Severity: 3
- 8083
Description
While running the recovery-*-scale tests with FSTYPE=zfs and FAILURE_MODE=HARD in a failover configuration, the tests failed as follows:
Failing mds1 on wtm-9vm3
+ pm -h powerman --off wtm-9vm3
Command completed successfully
waiting ! ping -w 3 -c 1 wtm-9vm3, 4 secs left ...
waiting ! ping -w 3 -c 1 wtm-9vm3, 3 secs left ...
waiting ! ping -w 3 -c 1 wtm-9vm3, 2 secs left ...
waiting ! ping -w 3 -c 1 wtm-9vm3, 1 secs left ...
waiting for wtm-9vm3 to fail attempts=3
+ pm -h powerman --off wtm-9vm3
Command completed successfully
reboot facets: mds1
+ pm -h powerman --on wtm-9vm3
Command completed successfully
Failover mds1 to wtm-9vm7
04:28:49 (1367234929) waiting for wtm-9vm7 network 900 secs ...
04:28:49 (1367234929) network interface is UP
CMD: wtm-9vm7 hostname
mount facets: mds1
Starting mds1: lustre-mdt1/mdt1 /mnt/mds1
CMD: wtm-9vm7 mkdir -p /mnt/mds1; mount -t lustre lustre-mdt1/mdt1 /mnt/mds1
wtm-9vm7: mount.lustre: lustre-mdt1/mdt1 has not been formatted with mkfs.lustre or the backend filesystem type is not supported by this tool
Start of lustre-mdt1/mdt1 on mds1 failed 19
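The trailing 19 is the mount return code; assuming it is a raw errno, 19 is ENODEV ("No such device"), which is consistent with the backend dataset not being visible on wtm-9vm7. A quick way to check the mapping:

python3 -c 'import os; print(os.strerror(19))'   # prints: No such device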
Maloo report: https://maloo.whamcloud.com/test_sets/ac7cbc10-b0e3-11e2-b2c4-52540035b04c
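For context, a minimal sketch of what the failover start amounts to on wtm-9vm7, using the pool/dataset names from the log above; the import options are taken from the llmount.sh output below, and the exact sequence the framework runs is an assumption:

zpool import -f -o cachefile=none lustre-mdt1   # force-import: the pool was last active on wtm-9vm3
zfs list lustre-mdt1/mdt1                       # the MDT dataset must be visible before mount
mount -t lustre lustre-mdt1/mdt1 /mnt/mds1      # fails as above if the pool never imports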
With REFORMAT=y FSTYPE=zfs sh llmount.sh -v, I'm getting:
Format mds1: lustre-mdt1/mdt1
CMD: centos grep -c /mnt/mds1' ' /proc/mounts
CMD: centos lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
CMD: centos ! zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
zpool export lustre-mdt1
CMD: centos /work/lustre/head1/lustre/tests/../utils/mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity --backfstype=zfs --device-size=200000 --reformat lustre-mdt1/mdt1 /tmp/lustre-mdt1
Permanent disk data:
Target: lustre:MDT0000
Index: 0
Lustre FS: lustre
Mount type: zfs
Flags: 0x65
(MDT MGS first_time update )
Persistent mount opts:
Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity
mkfs_cmd = zpool create -f -O canmount=off lustre-mdt1 /tmp/lustre-mdt1
mkfs_cmd = zfs create -o canmount=off -o xattr=sa lustre-mdt1/mdt1
Writing lustre-mdt1/mdt1 properties
lustre:version=1
lustre:flags=101
lustre:index=0
lustre:fsname=lustre
lustre:svname=lustre:MDT0000
lustre:sys.timeout=20
lustre:lov.stripesize=1048576
lustre:lov.stripecount=0
lustre:mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity
CMD: centos zpool set cachefile=none lustre-mdt1
CMD: centos ! zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
zpool export lustre-mdt1
...
Loading modules from /work/lustre/head1/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super
subsystem_debug=all -lnet -lnd -pinger
gss/krb5 is not supported
Setup mgs, mdt, osts
CMD: centos mkdir -p /mnt/mds1
CMD: centos zpool import -f -o cachefile=none lustre-mdt1
cannot import 'lustre-mdt1': no such pool available
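The import likely fails because zpool import scans only /dev for vdevs by default, while this test pool is backed by a plain file, /tmp/lustre-mdt1 (see the mkfs_cmd above), which the scan never finds. Assuming that diagnosis, a possible manual workaround for this repro is to point the scan at the directory holding the file with -d:

zpool import -f -o cachefile=none -d /tmp lustre-mdt1   # search /tmp for file-backed vdevs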