Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.10.0
- None
- 3
- 9223372036854775807
Description
This issue was created by maloo for bfaccini <bruno.faccini@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/59eb46d2-3d9f-11e6-a0ce-5254006e85c2.
The sub-test test_93 failed with the following error:
test failed to respond and timed out
The test log indicates that onyx-32vm3 stops responding after the parallel mount of MDS[2,4]:
== conf-sanity test 93: register mulitple MDT at the same time ======================================= 15:20:48 (1467152448) Stopping clients: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 /mnt/lustre (opts:) CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 running=\$(grep -c /mnt/lustre' ' /proc/mounts); if [ \$running -ne 0 ] ; then echo Stopping client \$(hostname) /mnt/lustre opts:; lsof /mnt/lustre || need_kill=no; if [ x != x -a x\$need_kill != xno ]; then pids=\$(lsof -t /mnt/lustre | sort -u); if [ -n \"\$pids\" ]; then kill -9 \$pids; fi fi; while umount /mnt/lustre 2>&1 | grep -q busy; do echo /mnt/lustre is still busy, wait one second && sleep 1; done; fi Stopping clients: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 /mnt/lustre2 (opts:) CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 running=\$(grep -c /mnt/lustre2' ' /proc/mounts); if [ \$running -ne 0 ] ; then echo Stopping client \$(hostname) /mnt/lustre2 opts:; lsof /mnt/lustre2 || need_kill=no; if [ x != x -a x\$need_kill != xno ]; then pids=\$(lsof -t /mnt/lustre2 | sort -u); if [ -n \"\$pids\" ]; then kill -9 \$pids; fi fi; while umount /mnt/lustre2 2>&1 | grep -q busy; do echo /mnt/lustre2 is still busy, wait one second && sleep 1; done; fi CMD: onyx-32vm7 grep -c /mnt/lustre-mds1' ' /proc/mounts CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm3 grep -c /mnt/lustre-mds2' ' /proc/mounts CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm7 grep -c /mnt/lustre-mds3' ' /proc/mounts CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm3 grep -c /mnt/lustre-mds4' ' /proc/mounts CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost1' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost2' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: 
onyx-32vm8 grep -c /mnt/lustre-ost3' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost4' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost5' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost6' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost7' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 grep -c /mnt/lustre-ost8' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_hostid Loading modules from /usr/lib64/lustre detected 2 online CPUs by sysfs Force libcfs to create 2 CPU partitions debug=-1 subsystem_debug=all -lnet -lnd -pinger Formatting mgs, mds, osts Format mds1: /dev/lvm-Role_MDS/P1 CMD: onyx-32vm7 grep -c /mnt/lustre-mds1' ' /proc/mounts CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm7 mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat 
/dev/lvm-Role_MDS/P1 Permanent disk data: Target: lustre:MDT0000 Index: 0 Lustre FS: lustre Mount type: ldiskfs Flags: 0x65 (MDT MGS first_time update ) Persistent mount opts: user_xattr,errors=remount-ro Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity device size = 2048MB formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P1 target name lustre:MDT0000 4k blocks 50000 options -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P1 50000 Writing CONFIGS/mountdata Format mds2: /dev/lvm-Role_MDS/P2 CMD: onyx-32vm3 grep -c /mnt/lustre-mds2' ' /proc/mounts CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm3 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=1 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P2 Permanent disk data: Target: lustre:MDT0001 Index: 1 Lustre FS: lustre Mount type: ldiskfs Flags: 0x61 (MDT first_time update ) Persistent mount opts: user_xattr,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity device size = 2048MB formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P2 target name lustre:MDT0001 4k blocks 50000 options -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0001 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E 
lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P2 50000 Writing CONFIGS/mountdata Format mds3: /dev/lvm-Role_MDS/P3 CMD: onyx-32vm7 grep -c /mnt/lustre-mds3' ' /proc/mounts CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm7 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=2 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P3 Permanent disk data: Target: lustre:MDT0002 Index: 2 Lustre FS: lustre Mount type: ldiskfs Flags: 0x61 (MDT first_time update ) Persistent mount opts: user_xattr,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity device size = 2048MB formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P3 target name lustre:MDT0002 4k blocks 50000 options -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0002 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P3 50000 Writing CONFIGS/mountdata Format mds4: /dev/lvm-Role_MDS/P4 CMD: onyx-32vm3 grep -c /mnt/lustre-mds4' ' /proc/mounts CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm3 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=3 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P4 Permanent disk data: Target: lustre:MDT0003 Index: 3 Lustre FS: lustre Mount type: ldiskfs Flags: 0x61 (MDT first_time update ) 
Persistent mount opts: user_xattr,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity device size = 2048MB formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P4 target name lustre:MDT0003 4k blocks 50000 options -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0003 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P4 50000 Writing CONFIGS/mountdata Format ost1: /dev/lvm-Role_OSS/P1 CMD: onyx-32vm8 grep -c /mnt/lustre-ost1' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=0 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P1 Permanent disk data: Target: lustre:OST0000 Index: 0 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P1 target name lustre:OST0000 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0000 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P1 50000 Writing CONFIGS/mountdata Format ost2: /dev/lvm-Role_OSS/P2 CMD: onyx-32vm8 grep -c /mnt/lustre-ost2' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre 
--ost --index=1 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P2 Permanent disk data: Target: lustre:OST0001 Index: 1 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P2 target name lustre:OST0001 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0001 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P2 50000 Writing CONFIGS/mountdata Format ost3: /dev/lvm-Role_OSS/P3 CMD: onyx-32vm8 grep -c /mnt/lustre-ost3' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=2 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P3 Permanent disk data: Target: lustre:OST0002 Index: 2 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P3 target name lustre:OST0002 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0002 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P3 50000 Writing CONFIGS/mountdata Format ost4: 
/dev/lvm-Role_OSS/P4 CMD: onyx-32vm8 grep -c /mnt/lustre-ost4' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=3 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P4 Permanent disk data: Target: lustre:OST0003 Index: 3 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P4 target name lustre:OST0003 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0003 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P4 50000 Writing CONFIGS/mountdata Format ost5: /dev/lvm-Role_OSS/P5 CMD: onyx-32vm8 grep -c /mnt/lustre-ost5' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=4 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P5 Permanent disk data: Target: lustre:OST0004 Index: 4 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P5 target name lustre:OST0004 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L 
lustre:OST0004 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P5 50000 Writing CONFIGS/mountdata Format ost6: /dev/lvm-Role_OSS/P6 CMD: onyx-32vm8 grep -c /mnt/lustre-ost6' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=5 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P6 Permanent disk data: Target: lustre:OST0005 Index: 5 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P6 target name lustre:OST0005 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0005 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P6 50000 Writing CONFIGS/mountdata Format ost7: /dev/lvm-Role_OSS/P7 CMD: onyx-32vm8 grep -c /mnt/lustre-ost7' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=6 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P7 Permanent disk data: Target: lustre:OST0006 Index: 6 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P7 target name 
lustre:OST0006 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0006 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P7 50000 Writing CONFIGS/mountdata Format ost8: /dev/lvm-Role_OSS/P8 CMD: onyx-32vm8 grep -c /mnt/lustre-ost8' ' /proc/mounts CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=7 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P8 Permanent disk data: Target: lustre:OST0007 Index: 7 Lustre FS: lustre Mount type: ldiskfs Flags: 0x62 (OST first_time update ) Persistent mount opts: ,errors=remount-ro Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 device size = 9912MB formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P8 target name lustre:OST0007 4k blocks 50000 options -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0007 -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P8 50000 Writing CONFIGS/mountdata start mds service on onyx-32vm7 CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds1 CMD: onyx-32vm7 test -b /dev/lvm-Role_MDS/P1 CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 Starting mds1: /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1 CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1 CMD: onyx-32vm7 /usr/sbin/lctl get_param -n health_check CMD: onyx-32vm7 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm7 sync; sync; sync CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null Started lustre-MDT0000 start ost1 service on onyx-32vm8 CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1 CMD: onyx-32vm8 test -b /dev/lvm-Role_OSS/P1 CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 Starting ost1: /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1 CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1; mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1 CMD: onyx-32vm8 /usr/sbin/lctl get_param -n health_check CMD: onyx-32vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 2>/dev/null | 
grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm8 sync; sync; sync CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 2>/dev/null Started lustre-OST0000 CMD: onyx-32vm7 /usr/sbin/lctl set_param fail_val = 10 fail_loc=0x8000090e onyx-32vm7: error: set_param: setting /proc/sys/lnet/fail_val==: Invalid argument onyx-32vm7: error: set_param: param_path '10': No such file or directory mount lustre on /mnt/lustre..... Starting client: onyx-32vm1.onyx.hpdd.intel.com: -o user_xattr,flock onyx-32vm7@tcp:/lustre /mnt/lustre CMD: onyx-32vm1.onyx.hpdd.intel.com mkdir -p /mnt/lustre start mds service on onyx-32vm7 start mds service on onyx-32vm3 start mds service on onyx-32vm3 CMD: onyx-32vm1.onyx.hpdd.intel.com mount -t lustre -o user_xattr,flock onyx-32vm7@tcp:/lustre /mnt/lustre CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds3 CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds2 CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds4 CMD: onyx-32vm3 test -b /dev/lvm-Role_MDS/P2 CMD: onyx-32vm3 test -b /dev/lvm-Role_MDS/P4 CMD: onyx-32vm7 test -b /dev/lvm-Role_MDS/P3 CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P4 CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P2 Starting mds3: /dev/lvm-Role_MDS/P3 /mnt/lustre-mds3 CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds3; mount -t lustre /dev/lvm-Role_MDS/P3 /mnt/lustre-mds3 Starting mds4: /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4 CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds4; mount -t lustre /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4 Starting mds2: /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2 CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds2; mount -t lustre /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2 CMD: onyx-32vm7 /usr/sbin/lctl get_param -n health_check CMD: onyx-32vm7 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' CMD: onyx-32vm7 sync; sync; sync CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 2>/dev/null Started lustre-MDT0002 CMD: onyx-32vm7 lctl list_param osc.lustre-OST*-osc > /dev/null 2>&1 CMD: onyx-32vm7 lctl get_param -n at_min CMD: onyx-32vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0000.ost_server_uuid 40 onyx-32vm7: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec CMD: onyx-32vm3 /usr/sbin/lctl lustre_build_version pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused /usr/lib64/lustre/tests/test-framework.sh: line 382: ( << 16) | ( << 8) | : syntax error: operand expected (error token is "<< 16) | ( << 8) | ") 
/usr/lib64/lustre/tests/test-framework.sh: line 5818: [: -le: unary operator expected CMD: onyx-32vm3 /usr/sbin/lctl lustre_build_version pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused /usr/lib64/lustre/tests/test-framework.sh: line 382: ( << 16) | ( << 8) | : syntax error: operand expected (error token is "<< 16) | ( << 8) | ") /usr/lib64/lustre/tests/test-framework.sh: line 5803: [: -gt: unary operator expected CMD: onyx-32vm3 lctl get_param -n at_min pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused CMD: onyx-32vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0001.ost_server_uuid 40 pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused conf-sanity test_93: @@@@@@ FAIL: import is not in FULL state Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:4785:error() = /usr/lib64/lustre/tests/test-framework.sh:5976:_wait_osc_import_state() = /usr/lib64/lustre/tests/test-framework.sh:5991:wait_osc_import_state() = /usr/lib64/lustre/tests/conf-sanity.sh:6467:test_93() = /usr/lib64/lustre/tests/test-framework.sh:5049:run_one() = /usr/lib64/lustre/tests/test-framework.sh:5088:run_one_logged() = /usr/lib64/lustre/tests/test-framework.sh:4935:run_test() = /usr/lib64/lustre/tests/conf-sanity.sh:6473:main() Dumping lctl log to /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.*.1467152498.log CMD: 
onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 /usr/sbin/lctl dk > /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.debug_log.\$(hostname -s).1467152498.log; dmesg > /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.dmesg.\$(hostname -s).1467152498.log pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused Resetting fail_loc on all nodes...CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 lctl set_param -n fail_loc=0 fail_val=0 2>/dev/null || true pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused done.
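The "syntax error: operand expected" and "unary operator expected" messages near the end of the log appear to be secondary effects: every command sent to onyx-32vm3 was refused, so the build version string came back empty and the version arithmetic ran on empty operands. A minimal sketch of that failure mode, assuming a helper that packs major.minor.patch into one integer; naive_version_code and safe_version_code are hypothetical names, not the actual test-framework.sh code:

```shell
# Hypothetical helpers for illustration; these are not copied from
# test-framework.sh.

# Naive: split "major.minor.patch" on dots and pack it into one integer.
# The function body is a subshell, so IFS and "set --" stay local to it.
naive_version_code() (
    IFS=.
    set -- $1
    # With an empty version string this expands to "( << 16) | ( << 8) | "
    # and the shell reports an arithmetic syntax error.
    echo $(( ($1 << 16) | ($2 << 8) | $3 ))
)

# Guarded: reject input that does not start with three dot-separated,
# non-empty fields before any arithmetic is attempted.
safe_version_code() (
    ver=${1%%[!0-9.]*}    # drop a suffix such as "-wc3"
    IFS=.
    set -- $ver
    if [ $# -lt 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "cannot parse version '$ver'" >&2
        exit 1            # leaves only the subshell that forms the function body
    fi
    echo $(( ($1 << 16) | ($2 << 8) | $3 ))
)
```

With these definitions, safe_version_code 2.10.0 prints 133632, while an empty argument produces a non-zero status and a readable message instead of the arithmetic error.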
The failure occurs during conf-sanity test_93:
test_93() {
    [ $MDSCOUNT -lt 3 ] && skip "needs >= 3 MDTs" && return

    reformat
    #start mgs or mgs/mdt0
    if ! combined_mgs_mds ; then
        start_mgs
        start_mdt 1
    else
        start_mdt 1
    fi

    start_ost || error "OST0 start fail"

    #define OBD_FAIL_MGS_WRITE_TARGET_DELAY 0x90e
    do_facet mgs "$LCTL set_param fail_val = 10 fail_loc=0x8000090e"
    for num in $(seq 2 $MDSCOUNT); do
        start_mdt $num &    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    done

    mount_client $MOUNT || error "mount client fails"
    wait_osc_import_state mds ost FULL
    wait_osc_import_state client ost FULL
    check_mount || error "check_mount failed"

    cleanup || error "cleanup failed with $?"
}
run_test 93 "register mulitple MDT at the same time"
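A side observation, separate from the LBUG: the set_param errors in the test log (fail_val==: Invalid argument, param_path '10': No such file or directory) are consistent with the spaces around = in the do_facet line, because the remote shell splits fail_val = 10 into three separate words. A small sketch of that word splitting; show_args is a hypothetical helper, and the space-free form is an assumption about what was intended, not the fix recorded for this ticket:

```shell
# show_args is a hypothetical helper: it prints how many arguments it
# received and then each argument on its own line.
show_args() {
    echo "$# argument(s)"
    for arg in "$@"; do
        echo "  [$arg]"
    done
}

# As written in the test: "fail_val", "=" and "10" arrive as separate words,
# followed by "fail_loc=0x8000090e", so four arguments in total.
show_args fail_val = 10 fail_loc=0x8000090e

# Without the spaces: two name=value pairs.
show_args fail_val=10 fail_loc=0x8000090e
```

The first call reports 4 arguments and the second reports 2, which is the difference a name=value parser would see.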
The cause of the failure is the following crash/LBUG found in the onyx-32vm3 (MDS) console log:
15:21:36:[29395.748697] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
15:21:36:[29395.753612] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
15:21:36:[29396.019926] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P4
15:21:36:[29396.024718] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P2
15:21:36:[29396.306479] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P2
15:21:36:[29396.311613] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P4
15:21:36:[29396.577947] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2
15:21:36:[29396.594860] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4
15:21:36:[29396.743622] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
15:21:36:[29396.750879] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
15:21:36:[29396.909772] LustreError: 26347:0:(osd_handler.c:6468:osd_device_init0()) ASSERTION( info ) failed:
15:21:36:[29396.912150] LustreError: 26347:0:(osd_handler.c:6468:osd_device_init0()) LBUG
15:21:36:[29396.915016] Pid: 26347, comm: mount.lustre
15:21:36:[29396.919614]
15:21:36:[29396.919614] Call Trace:
15:21:36:[29396.922958] [<ffffffffa05e67d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
15:21:36:[29396.925401] [<ffffffffa05e6d75>] lbug_with_loc+0x45/0xc0 [libcfs]
15:21:36:[29396.927434] [<ffffffffa0c24ccf>] osd_device_alloc+0x70f/0x880 [osd_ldiskfs]
15:21:36:[29396.929611] [<ffffffffa07cd104>] obd_setup+0x114/0x2a0 [obdclass]
15:21:36:[29396.931618] [<ffffffffa07cfb54>] class_setup+0x2f4/0x8d0 [obdclass]
15:21:36:[29396.933586] [<ffffffffa07d3ee7>] class_process_config+0x1de7/0x2f70 [obdclass]
15:21:36:[29396.935800] [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
15:21:36:[29396.937938] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache
15:21:36:[29396.937939] [<ffffffffa07dcb69>] do_lcfg+0x159/0x5d0 [obdclass]
15:21:36:[29396.937954] [<ffffffffa07dd928>] lustre_start_simple+0x88/0x210 [obdclass]
15:21:36:[29396.937972] [<ffffffffa0808ac4>] server_fill_super+0xf24/0x184c [obdclass]
15:21:36:[29396.937977] [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
15:21:36:[29396.937991] [<ffffffffa07e09e8>] lustre_fill_super+0x328/0x950 [obdclass]
15:21:36:[29396.938013] [<ffffffffa07e06c0>] ? lustre_fill_super+0x0/0x950 [obdclass]
15:21:36:[29396.938019] [<ffffffff811e1f2d>] mount_nodev+0x4d/0xb0
15:21:36:[29396.938033] [<ffffffffa07d8918>] lustre_mount+0x38/0x60 [obdclass]
15:21:36:[29396.938034] [<ffffffff811e28d9>] mount_fs+0x39/0x1b0
15:21:36:[29396.938038] [<ffffffff811fe1af>] vfs_kern_mount+0x5f/0xf0
15:21:36:[29396.938039] [<ffffffff812006fe>] do_mount+0x24e/0xa40
15:21:36:[29396.938043] [<ffffffff8116e15e>] ? __get_free_pages+0xe/0x50
15:21:36:[29396.938044] [<ffffffff81200f86>] SyS_mount+0x96/0xf0
15:21:36:[29396.938048] [<ffffffff816463c9>] system_call_fastpath+0x16/0x1b
15:21:36:[29396.938048]
15:21:36:[29396.968797] Kernel panic - not syncing: LBUG
15:21:36:[29396.969781] CPU: 0 PID: 26347 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-327.18.2.el7_lustre.x86_64 #1
15:21:36:[29396.969781] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
15:21:36:[29396.969781] ffffffffa0603def 0000000048fb9a4f ffff880039073950 ffffffff81635c14
15:21:36:[29396.969781] ffff8800390739d0 ffffffff8162f48a ffffffff00000008 ffff8800390739e0
15:21:36:[29396.969781] ffff880039073980 0000000048fb9a4f ffffffffa0c511a0 0000000000000246
15:21:36:[29396.969781] Call Trace:
15:21:36:[29396.969781] [<ffffffff81635c14>] dump_stack+0x19/0x1b
15:21:36:[29396.969781] [<ffffffff8162f48a>] panic+0xd8/0x1e7
15:21:36:[29396.969781] [<ffffffffa05e6ddb>] lbug_with_loc+0xab/0xc0 [libcfs]
15:21:36:[29396.969781] [<ffffffffa0c24ccf>] osd_device_alloc+0x70f/0x880 [osd_ldiskfs]
15:21:36:[29396.969781] [<ffffffffa07cd104>] obd_setup+0x114/0x2a0 [obdclass]
15:21:36:[29396.969781] [<ffffffffa07cfb54>] class_setup+0x2f4/0x8d0 [obdclass]
15:21:36:[29396.969781] [<ffffffffa07d3ee7>] class_process_config+0x1de7/0x2f70 [obdclass]
15:21:36:[29396.969781] [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
15:21:36:[29396.969781] [<ffffffffa07dcb69>] do_lcfg+0x159/0x5d0 [obdclass]
15:21:36:[29396.969781] [<ffffffffa07dd928>] lustre_start_simple+0x88/0x210 [obdclass]
15:21:36:[29396.969781] [<ffffffffa0808ac4>] server_fill_super+0xf24/0x184c [obdclass]
15:21:36:[29396.969781] [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
15:21:36:[29396.969781] [<ffffffffa07e09e8>] lustre_fill_super+0x328/0x950 [obdclass]
15:21:36:[29396.969781] [<ffffffffa07e06c0>] ? lustre_common_put_super+0x270/0x270 [obdclass]
15:21:36:[29396.969781] [<ffffffff811e1f2d>] mount_nodev+0x4d/0xb0
15:21:36:[29396.969781] [<ffffffffa07d8918>] lustre_mount+0x38/0x60 [obdclass]
15:21:36:[29396.969781] [<ffffffff811e28d9>] mount_fs+0x39/0x1b0
15:21:36:[29396.969781] [<ffffffff811fe1af>] vfs_kern_mount+0x5f/0xf0
15:21:36:[29396.969781] [<ffffffff812006fe>] do_mount+0x24e/0xa40
15:21:36:[29396.969781] [<ffffffff8116e15e>] ? __get_free_pages+0xe/0x50
15:21:36:[29396.969781] [<ffffffff81200f86>] SyS_mount+0x96/0xf0
15:21:36:[29396.969781] [<ffffffff816463c9>] system_call_fastpath+0x16/0x1b
Info required for matching: conf-sanity 93
Issue Links
- duplicates
  - LU-11814 conf-sanity test_93 osd_handler.c:7132:osd_device_init0()) ASSERTION( info ) failed: (Resolved)
  - LU-12300 conf-sanity test 93: osd_handler.c:7743:osd_device_init0()) ASSERTION( info ) failed (Resolved)
- is related to
  - LU-13313 conf-sanity test_93: Crashed while parallel mounting (Resolved)
  - LU-11089 Performance improvements for lu_object locking (Resolved)