Lustre / LU-8346

conf-sanity test_93: test failed to respond and timed out


    Description

      This issue was created by maloo for bfaccini <bruno.faccini@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/59eb46d2-3d9f-11e6-a0ce-5254006e85c2.

      The sub-test test_93 failed with the following error:

      test failed to respond and timed out
      

      The test log indicates that onyx-32vm3 stops responding correctly after the parallel mount of MDS[2,4]:

      == conf-sanity test 93: register mulitple MDT at the same time ======================================= 15:20:48 (1467152448)
      Stopping clients: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 /mnt/lustre (opts:)
      CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 running=\$(grep -c /mnt/lustre' ' /proc/mounts);
      if [ \$running -ne 0 ] ; then
      echo Stopping client \$(hostname) /mnt/lustre opts:;
      lsof /mnt/lustre || need_kill=no;
      if [ x != x -a x\$need_kill != xno ]; then
          pids=\$(lsof -t /mnt/lustre | sort -u);
          if [ -n \"\$pids\" ]; then
                   kill -9 \$pids;
          fi
      fi;
      while umount  /mnt/lustre 2>&1 | grep -q busy; do
          echo /mnt/lustre is still busy, wait one second && sleep 1;
      done;
      fi
      Stopping clients: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 /mnt/lustre2 (opts:)
      CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2 running=\$(grep -c /mnt/lustre2' ' /proc/mounts);
      if [ \$running -ne 0 ] ; then
      echo Stopping client \$(hostname) /mnt/lustre2 opts:;
      lsof /mnt/lustre2 || need_kill=no;
      if [ x != x -a x\$need_kill != xno ]; then
          pids=\$(lsof -t /mnt/lustre2 | sort -u);
          if [ -n \"\$pids\" ]; then
                   kill -9 \$pids;
          fi
      fi;
      while umount  /mnt/lustre2 2>&1 | grep -q busy; do
          echo /mnt/lustre2 is still busy, wait one second && sleep 1;
      done;
      fi
      CMD: onyx-32vm7 grep -c /mnt/lustre-mds1' ' /proc/mounts
      CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm3 grep -c /mnt/lustre-mds2' ' /proc/mounts
      CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm7 grep -c /mnt/lustre-mds3' ' /proc/mounts
      CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm3 grep -c /mnt/lustre-mds4' ' /proc/mounts
      CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost1' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost2' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost3' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost4' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost5' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost6' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost7' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost8' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_hostid 
      Loading modules from /usr/lib64/lustre
      detected 2 online CPUs by sysfs
      Force libcfs to create 2 CPU partitions
      debug=-1
      subsystem_debug=all -lnet -lnd -pinger
      Formatting mgs, mds, osts
      Format mds1: /dev/lvm-Role_MDS/P1
      CMD: onyx-32vm7 grep -c /mnt/lustre-mds1' ' /proc/mounts
      CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm7 mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P1
      
         Permanent disk data:
      Target:     lustre:MDT0000
      Index:      0
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x65
                    (MDT MGS first_time update )
      Persistent mount opts: user_xattr,errors=remount-ro
      Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity
      
      device size = 2048MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P1
      	target name   lustre:MDT0000
      	4k blocks     50000
      	options        -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000  -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P1 50000
      Writing CONFIGS/mountdata
      Format mds2: /dev/lvm-Role_MDS/P2
      CMD: onyx-32vm3 grep -c /mnt/lustre-mds2' ' /proc/mounts
      CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm3 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=1 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P2
      
         Permanent disk data:
      Target:     lustre:MDT0001
      Index:      1
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x61
                    (MDT first_time update )
      Persistent mount opts: user_xattr,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity
      
      device size = 2048MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P2
      	target name   lustre:MDT0001
      	4k blocks     50000
      	options        -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0001  -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P2 50000
      Writing CONFIGS/mountdata
      Format mds3: /dev/lvm-Role_MDS/P3
      CMD: onyx-32vm7 grep -c /mnt/lustre-mds3' ' /proc/mounts
      CMD: onyx-32vm7 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm7 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=2 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P3
      
         Permanent disk data:
      Target:     lustre:MDT0002
      Index:      2
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x61
                    (MDT first_time update )
      Persistent mount opts: user_xattr,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity
      
      device size = 2048MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P3
      	target name   lustre:MDT0002
      	4k blocks     50000
      	options        -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0002  -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P3 50000
      Writing CONFIGS/mountdata
      Format mds4: /dev/lvm-Role_MDS/P4
      CMD: onyx-32vm3 grep -c /mnt/lustre-mds4' ' /proc/mounts
      CMD: onyx-32vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm3 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --mdt --index=3 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_MDS/P4
      
         Permanent disk data:
      Target:     lustre:MDT0003
      Index:      3
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x61
                    (MDT first_time update )
      Persistent mount opts: user_xattr,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity
      
      device size = 2048MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_MDS/P4
      	target name   lustre:MDT0003
      	4k blocks     50000
      	options        -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0003  -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_itable_init,lazy_journal_init -F /dev/lvm-Role_MDS/P4 50000
      Writing CONFIGS/mountdata
      Format ost1: /dev/lvm-Role_OSS/P1
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost1' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=0 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P1
      
         Permanent disk data:
      Target:     lustre:OST0000
      Index:      0
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P1
      	target name   lustre:OST0000
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0000  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P1 50000
      Writing CONFIGS/mountdata
      Format ost2: /dev/lvm-Role_OSS/P2
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost2' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=1 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P2
      
         Permanent disk data:
      Target:     lustre:OST0001
      Index:      1
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P2
      	target name   lustre:OST0001
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0001  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P2 50000
      Writing CONFIGS/mountdata
      Format ost3: /dev/lvm-Role_OSS/P3
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost3' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=2 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P3
      
         Permanent disk data:
      Target:     lustre:OST0002
      Index:      2
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P3
      	target name   lustre:OST0002
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0002  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P3 50000
      Writing CONFIGS/mountdata
      Format ost4: /dev/lvm-Role_OSS/P4
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost4' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=3 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P4
      
         Permanent disk data:
      Target:     lustre:OST0003
      Index:      3
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P4
      	target name   lustre:OST0003
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0003  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P4 50000
      Writing CONFIGS/mountdata
      Format ost5: /dev/lvm-Role_OSS/P5
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost5' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=4 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P5
      
         Permanent disk data:
      Target:     lustre:OST0004
      Index:      4
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P5
      	target name   lustre:OST0004
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0004  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P5 50000
      Writing CONFIGS/mountdata
      Format ost6: /dev/lvm-Role_OSS/P6
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost6' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=5 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P6
      
         Permanent disk data:
      Target:     lustre:OST0005
      Index:      5
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P6
      	target name   lustre:OST0005
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0005  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P6 50000
      Writing CONFIGS/mountdata
      Format ost7: /dev/lvm-Role_OSS/P7
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost7' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=6 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P7
      
         Permanent disk data:
      Target:     lustre:OST0006
      Index:      6
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P7
      	target name   lustre:OST0006
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0006  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P7 50000
      Writing CONFIGS/mountdata
      Format ost8: /dev/lvm-Role_OSS/P8
      CMD: onyx-32vm8 grep -c /mnt/lustre-ost8' ' /proc/mounts
      CMD: onyx-32vm8 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: onyx-32vm8 mkfs.lustre --mgsnode=onyx-32vm7@tcp --fsname=lustre --ost --index=7 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat /dev/lvm-Role_OSS/P8
      
         Permanent disk data:
      Target:     lustre:OST0007
      Index:      7
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: ,errors=remount-ro
      Parameters: mgsnode=10.2.4.117@tcp sys.timeout=20
      
      device size = 9912MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P8
      	target name   lustre:OST0007
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre:OST0007  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize="4290772992",lazy_journal_init -F /dev/lvm-Role_OSS/P8 50000
      Writing CONFIGS/mountdata
      start mds service on onyx-32vm7
      CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds1
      CMD: onyx-32vm7 test -b /dev/lvm-Role_MDS/P1
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1
      Starting mds1:   /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
      CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
      CMD: onyx-32vm7 /usr/sbin/lctl get_param -n health_check
      CMD: onyx-32vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm7 sync; sync; sync
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P1 2>/dev/null
      Started lustre-MDT0000
      start ost1 service on onyx-32vm8
      CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1
      CMD: onyx-32vm8 test -b /dev/lvm-Role_OSS/P1
      CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1
      Starting ost1:   /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
      CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1; mount -t lustre   		                   /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
      CMD: onyx-32vm8 /usr/sbin/lctl get_param -n health_check
      CMD: onyx-32vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 
      CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm8 sync; sync; sync
      CMD: onyx-32vm8 e2label /dev/lvm-Role_OSS/P1 2>/dev/null
      Started lustre-OST0000
      CMD: onyx-32vm7 /usr/sbin/lctl set_param fail_val = 10 fail_loc=0x8000090e
      onyx-32vm7: error: set_param: setting /proc/sys/lnet/fail_val==: Invalid argument
      onyx-32vm7: error: set_param: param_path '10': No such file or directory
      mount lustre on /mnt/lustre.....
      Starting client: onyx-32vm1.onyx.hpdd.intel.com:  -o user_xattr,flock onyx-32vm7@tcp:/lustre /mnt/lustre
      CMD: onyx-32vm1.onyx.hpdd.intel.com mkdir -p /mnt/lustre
      start mds service on onyx-32vm7
      start mds service on onyx-32vm3
      start mds service on onyx-32vm3
      CMD: onyx-32vm1.onyx.hpdd.intel.com mount -t lustre -o user_xattr,flock onyx-32vm7@tcp:/lustre /mnt/lustre
      CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds3
      CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds2
      CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds4
      CMD: onyx-32vm3 test -b /dev/lvm-Role_MDS/P2
      CMD: onyx-32vm3 test -b /dev/lvm-Role_MDS/P4
      CMD: onyx-32vm7 test -b /dev/lvm-Role_MDS/P3
      CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P4
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3
      CMD: onyx-32vm3 e2label /dev/lvm-Role_MDS/P2
      Starting mds3:   /dev/lvm-Role_MDS/P3 /mnt/lustre-mds3
      CMD: onyx-32vm7 mkdir -p /mnt/lustre-mds3; mount -t lustre   		                   /dev/lvm-Role_MDS/P3 /mnt/lustre-mds3
      Starting mds4:   /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4
      CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds4; mount -t lustre   		                   /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4
      Starting mds2:   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2
      CMD: onyx-32vm3 mkdir -p /mnt/lustre-mds2; mount -t lustre   		                   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2
      CMD: onyx-32vm7 /usr/sbin/lctl get_param -n health_check
      CMD: onyx-32vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: onyx-32vm7 sync; sync; sync
      CMD: onyx-32vm7 e2label /dev/lvm-Role_MDS/P3 2>/dev/null
      Started lustre-MDT0002
      CMD: onyx-32vm7 lctl list_param osc.lustre-OST*-osc             > /dev/null 2>&1
      CMD: onyx-32vm7 lctl get_param -n at_min
      CMD: onyx-32vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0000.ost_server_uuid 40 
      onyx-32vm7: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
      CMD: onyx-32vm3 /usr/sbin/lctl lustre_build_version
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
      /usr/lib64/lustre/tests/test-framework.sh: line 382: ( << 16) | ( << 8) | : syntax error: operand expected (error token is "<< 16) | ( << 8) | ")
      /usr/lib64/lustre/tests/test-framework.sh: line 5818: [: -le: unary operator expected
      CMD: onyx-32vm3 /usr/sbin/lctl lustre_build_version
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
      /usr/lib64/lustre/tests/test-framework.sh: line 382: ( << 16) | ( << 8) | : syntax error: operand expected (error token is "<< 16) | ( << 8) | ")
      /usr/lib64/lustre/tests/test-framework.sh: line 5803: [: -gt: unary operator expected
      CMD: onyx-32vm3 lctl get_param -n at_min
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
      CMD: onyx-32vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0001.ost_server_uuid 40 
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
       conf-sanity test_93: @@@@@@ FAIL: import is not in FULL state 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4785:error()
        = /usr/lib64/lustre/tests/test-framework.sh:5976:_wait_osc_import_state()
        = /usr/lib64/lustre/tests/test-framework.sh:5991:wait_osc_import_state()
        = /usr/lib64/lustre/tests/conf-sanity.sh:6467:test_93()
        = /usr/lib64/lustre/tests/test-framework.sh:5049:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5088:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4935:run_test()
        = /usr/lib64/lustre/tests/conf-sanity.sh:6473:main()
      Dumping lctl log to /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.*.1467152498.log
      CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 /usr/sbin/lctl dk > /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.debug_log.\$(hostname -s).1467152498.log;
               dmesg > /logdir/test_logs/2016-06-28/lustre-reviews-el7-x86_64--review-dne-part-1--1_6_1__40104__-69939819083780-054108/conf-sanity.test_93.dmesg.\$(hostname -s).1467152498.log
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
      Resetting fail_loc on all nodes...CMD: onyx-32vm1.onyx.hpdd.intel.com,onyx-32vm2,onyx-32vm3,onyx-32vm7,onyx-32vm8 lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null || true
      pdsh@onyx-32vm1: onyx-32vm3: mcmd: connect failed: Connection refused
      done.
      

      This occurs during conf-sanity/test_93:

      test_93() {
              [ $MDSCOUNT -lt 3 ] && skip "needs >= 3 MDTs" && return
      
              reformat
              #start mgs or mgs/mdt0
              if ! combined_mgs_mds ; then
                      start_mgs
                      start_mdt 1
              else
                      start_mdt 1
              fi
      
              start_ost || error "OST0 start fail"
      
              #define OBD_FAIL_MGS_WRITE_TARGET_DELAY  0x90e
              do_facet mgs "$LCTL set_param fail_val = 10 fail_loc=0x8000090e"
              for num in $(seq 2 $MDSCOUNT); do
                      start_mdt $num &    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
              done
      
              mount_client $MOUNT || error "mount client fails"
              wait_osc_import_state mds ost FULL
              wait_osc_import_state client ost FULL
              check_mount || error "check_mount failed"
      
              cleanup || error "cleanup failed with $?"
      }
      run_test 93 "register mulitple MDT at the same time"
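
      An aside on the fail_loc setup in this test: the log above shows "lctl set_param fail_val = 10" failing with "Invalid argument" and "param_path '10': No such file or directory". That is a separate script bug, independent of the LBUG: set_param expects each name=value pair as a single word, and the spaces around "=" split the fail_val assignment into three separate words. A minimal stand-alone sketch (plain shell, no lctl required):

```shell
# Illustration of why "fail_val = 10" fails: the shell splits it into three
# words, so set_param sees "fail_val", "=", and "10" as separate parameters
# instead of one fail_val=10 assignment.
set -- fail_val = 10 fail_loc=0x8000090e
echo "with spaces:    $# arguments reach set_param"   # 4 words
set -- fail_val=10 fail_loc=0x8000090e
echo "without spaces: $# arguments reach set_param"   # 2 name=value pairs
```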
      

      The failure is caused by the following crash/LBUG found in the onyx-32vm3 (MDS) console log; the later pdsh "Connection refused" messages and the test-framework.sh syntax errors above are secondary symptoms of that node going down:

      15:21:36:[29395.748697] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
      15:21:36:[29395.753612] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
      15:21:36:[29396.019926] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P4
      15:21:36:[29396.024718] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P2
      15:21:36:[29396.306479] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P2
      15:21:36:[29396.311613] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P4
      15:21:36:[29396.577947] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre   		                   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds2
      15:21:36:[29396.594860] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre   		                   /dev/lvm-Role_MDS/P4 /mnt/lustre-mds4
      15:21:36:[29396.743622] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      15:21:36:[29396.750879] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      15:21:36:[29396.909772] LustreError: 26347:0:(osd_handler.c:6468:osd_device_init0()) ASSERTION( info ) failed: 
      15:21:36:[29396.912150] LustreError: 26347:0:(osd_handler.c:6468:osd_device_init0()) LBUG
      15:21:36:[29396.915016] Pid: 26347, comm: mount.lustre
      15:21:36:[29396.919614] 
      15:21:36:[29396.919614] Call Trace:
      15:21:36:[29396.922958]  [<ffffffffa05e67d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
      15:21:36:[29396.925401]  [<ffffffffa05e6d75>] lbug_with_loc+0x45/0xc0 [libcfs]
      15:21:36:[29396.927434]  [<ffffffffa0c24ccf>] osd_device_alloc+0x70f/0x880 [osd_ldiskfs]
      15:21:36:[29396.929611]  [<ffffffffa07cd104>] obd_setup+0x114/0x2a0 [obdclass]
      15:21:36:[29396.931618]  [<ffffffffa07cfb54>] class_setup+0x2f4/0x8d0 [obdclass]
      15:21:36:[29396.933586]  [<ffffffffa07d3ee7>] class_process_config+0x1de7/0x2f70 [obdclass]
      15:21:36:[29396.935800]  [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      15:21:36:[29396.937938] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache
      15:21:36:[29396.937939]  [<ffffffffa07dcb69>] do_lcfg+0x159/0x5d0 [obdclass]
      15:21:36:[29396.937954]  [<ffffffffa07dd928>] lustre_start_simple+0x88/0x210 [obdclass]
      15:21:36:[29396.937972]  [<ffffffffa0808ac4>] server_fill_super+0xf24/0x184c [obdclass]
      15:21:36:[29396.937977]  [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      15:21:36:[29396.937991]  [<ffffffffa07e09e8>] lustre_fill_super+0x328/0x950 [obdclass]
      15:21:36:[29396.938013]  [<ffffffffa07e06c0>] ? lustre_fill_super+0x0/0x950 [obdclass]
      15:21:36:[29396.938019]  [<ffffffff811e1f2d>] mount_nodev+0x4d/0xb0
      15:21:36:[29396.938033]  [<ffffffffa07d8918>] lustre_mount+0x38/0x60 [obdclass]
      15:21:36:[29396.938034]  [<ffffffff811e28d9>] mount_fs+0x39/0x1b0
      15:21:36:[29396.938038]  [<ffffffff811fe1af>] vfs_kern_mount+0x5f/0xf0
      15:21:36:[29396.938039]  [<ffffffff812006fe>] do_mount+0x24e/0xa40
      15:21:36:[29396.938043]  [<ffffffff8116e15e>] ? __get_free_pages+0xe/0x50
      15:21:36:[29396.938044]  [<ffffffff81200f86>] SyS_mount+0x96/0xf0
      15:21:36:[29396.938048]  [<ffffffff816463c9>] system_call_fastpath+0x16/0x1b
      15:21:36:[29396.938048] 
      15:21:36:[29396.968797] Kernel panic - not syncing: LBUG
      15:21:36:[29396.969781] CPU: 0 PID: 26347 Comm: mount.lustre Tainted: G           OE  ------------   3.10.0-327.18.2.el7_lustre.x86_64 #1
      15:21:36:[29396.969781] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      15:21:36:[29396.969781]  ffffffffa0603def 0000000048fb9a4f ffff880039073950 ffffffff81635c14
      15:21:36:[29396.969781]  ffff8800390739d0 ffffffff8162f48a ffffffff00000008 ffff8800390739e0
      15:21:36:[29396.969781]  ffff880039073980 0000000048fb9a4f ffffffffa0c511a0 0000000000000246
      15:21:36:[29396.969781] Call Trace:
      15:21:36:[29396.969781]  [<ffffffff81635c14>] dump_stack+0x19/0x1b
      15:21:36:[29396.969781]  [<ffffffff8162f48a>] panic+0xd8/0x1e7
      15:21:36:[29396.969781]  [<ffffffffa05e6ddb>] lbug_with_loc+0xab/0xc0 [libcfs]
      15:21:36:[29396.969781]  [<ffffffffa0c24ccf>] osd_device_alloc+0x70f/0x880 [osd_ldiskfs]
      15:21:36:[29396.969781]  [<ffffffffa07cd104>] obd_setup+0x114/0x2a0 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa07cfb54>] class_setup+0x2f4/0x8d0 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa07d3ee7>] class_process_config+0x1de7/0x2f70 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      15:21:36:[29396.969781]  [<ffffffffa07dcb69>] do_lcfg+0x159/0x5d0 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa07dd928>] lustre_start_simple+0x88/0x210 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa0808ac4>] server_fill_super+0xf24/0x184c [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      15:21:36:[29396.969781]  [<ffffffffa07e09e8>] lustre_fill_super+0x328/0x950 [obdclass]
      15:21:36:[29396.969781]  [<ffffffffa07e06c0>] ? lustre_common_put_super+0x270/0x270 [obdclass]
      15:21:36:[29396.969781]  [<ffffffff811e1f2d>] mount_nodev+0x4d/0xb0
      15:21:36:[29396.969781]  [<ffffffffa07d8918>] lustre_mount+0x38/0x60 [obdclass]
      15:21:36:[29396.969781]  [<ffffffff811e28d9>] mount_fs+0x39/0x1b0
      15:21:36:[29396.969781]  [<ffffffff811fe1af>] vfs_kern_mount+0x5f/0xf0
      15:21:36:[29396.969781]  [<ffffffff812006fe>] do_mount+0x24e/0xa40
      15:21:36:[29396.969781]  [<ffffffff8116e15e>] ? __get_free_pages+0xe/0x50
      15:21:36:[29396.969781]  [<ffffffff81200f86>] SyS_mount+0x96/0xf0
      15:21:36:[29396.969781]  [<ffffffff816463c9>] system_call_fastpath+0x16/0x1b
      

      Info required for matching: conf-sanity 93

      Attachments

        Issue Links

          Activity

            [LU-8346] conf-sanity test_93: test failed to respond and timed out
            eaujames Etienne Aujames added a comment - - edited

            @Antoine Percher I have created LU-14110 to follow your issue. This also affects the master branch (with fewer occurrences).


            It seems that patch 26099 has a bad effect on parallel mounts on the Lustre client.

            On a client node with Lustre 2.12.5, after mounting the same filesystem twice
            in parallel and then unmounting both mounts, it is impossible to remove the
            lustre module from the kernel.

            fstab:
            <serv1@ib1>:<serv2@ib1>:/fs1 /mnt/fs1 lustre defaults,_netdev,noauto,x-systemd.requires=lnet.service,flock,user_xattr,nosuid 0 0
            <serv1@ib1>:<serv2@ib1>:/fs1/home /mnt/home lustre defaults,_netdev,noauto,x-systemd.requires=lnet.service,flock,user_xattr,nosuid 0 0

            systemctl start lnet
            modprobe lustre
            mount /mnt/home & mount /mnt/fs1
            umount /mnt/home
            umount /mnt/fs1
            rmmod lustre    <- hangs

            The rmmod stack in the kernel is:

            #0 __schedule
            #1 schedule
            #2 lu_context_key_degister [obdclass]
            #3 lu_context_key_degister_many [obdclass]
            #4 vvp_global_fini [lustre]
            #5 lustre_exit [lustre]
            #6 __x64_sys_delete_module
            #7 do_syscall
            #8 entry_SYSCALL_64_after_hwframe
            crash> p vvp_thread_key.lct_used.counter
            $1 = 105
            crash> p vvp_session_key.lct_used.counter
            $2 = 51

            apercher Antoine Percher added a comment

            I searched back to the start of the year, and there were two timeouts for this test in the past 3 months, so this isn't really a high priority to fix:
            2020-03-02 https://testing.whamcloud.com/test_sets/ddf5643a-c7b9-4b0b-9b86-2adfe74817d3
            2020-02-24 https://testing.whamcloud.com/test_sets/b0e66f0e-c3d4-49e9-9024-9e7910dd3d12

            adilger Andreas Dilger added a comment

            Is this work done?

            simmonsja James A Simmons added a comment

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34226/
            Subject: LU-8346 tests: remove spaces around fail_val
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 430b20be17645989a51fb586824f7637535ff24e

            gerrit Gerrit Updater added a comment

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34155/
            Subject: LU-8346 tests: remove spaces around fail_val
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 59cb4a5c39e2c85a89be2863a73899c02c9a89c3

            gerrit Gerrit Updater added a comment

            James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34226
            Subject: LU-8346 tests: remove spaces around fail_val
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: dbb3a02fef7332b126668bb5b2d3066d77243f90

            gerrit Gerrit Updater added a comment
            jamesanunez James Nunez (Inactive) added a comment - - edited

            The patch https://review.whamcloud.com/34155 fixes the problem with setting fail_val in conf-sanity test 93:
            do_facet mgs "$LCTL set_param fail_val = 10 fail_loc=0x8000090e"
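
            The failure mode is plain shell word splitting and can be demonstrated without Lustre. A minimal sketch; `count_args` is a hypothetical helper, not part of the test framework:

            ```shell
            # Show how the shell tokenizes the two set_param forms.
            # With spaces around '=', "fail_val", "=", and "10" reach lctl as
            # three separate words, so no fail_val=10 key=value pair is ever
            # parsed and fail_val silently stays at its old value.
            count_args() { echo $#; }
            count_args fail_val = 10 fail_loc=0x8000090e   # broken form: 4 arguments
            count_args fail_val=10 fail_loc=0x8000090e     # fixed form:  2 arguments
            ```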


            James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34155
            Subject: LU-8346 tests: remove spaces around fail_val
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8d2909dab2e8af0d3301db14dec175a498d5f63b

            gerrit Gerrit Updater added a comment

            Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/31971
            Subject: LU-8346 osd-ldiskfs: don't assert if module is going
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b6bbcddb2dec31dc6019fc8177c35424a957ed22

            gerrit Gerrit Updater added a comment

            People

              hongchao.zhang Hongchao Zhang
              maloo Maloo
              Votes: 0
              Watchers: 18

              Dates

                Created:
                Updated:
                Resolved: