Details
- Bug
- Resolution: Cannot Reproduce
- Minor
- None
- None
- None
- 3
- 9223372036854775807
Description
This issue was created by maloo for Lai Siyao <lai.siyao@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/1bf6a37a-7821-11e9-a028-52540065bddc
CMD: trevis-38vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests//usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/compat-openmpi16/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/sbin:/sbin:/bin::/sbin:/sbin:/bin:/usr/sbin: NAME=autotest_config bash rpc.sh set_default_debug \"vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck\" \"all\" 4
trevis-38vm8: == rpc test complete, duration -o sec ================================================================ 19:32:45 (1558035165)
trevis-38vm8: trevis-38vm8.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
CMD: trevis-38vm8 e2label /dev/mapper/ost8_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: trevis-38vm8 e2label /dev/mapper/ost8_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: trevis-38vm8 e2label /dev/mapper/ost8_flakey 2>/dev/null
Started lustre-OST0007
CMD: trevis-38vm9 /usr/sbin/lctl list_nids | grep tcp999
Starting client: trevis-38vm6.trevis.whamcloud.com: -o user_xattr,flock,network=tcp999 10.9.3.145@tcp999:/lustre /mnt/lustre
CMD: trevis-38vm6.trevis.whamcloud.com mkdir -p /mnt/lustre
CMD: trevis-38vm6.trevis.whamcloud.com mount -t lustre -o user_xattr,flock,network=tcp999 10.9.3.145@tcp999:/lustre /mnt/lustre
mount.lustre: mount 10.9.3.145@tcp999:/lustre at /mnt/lustre failed: Invalid argument
This may have multiple causes.
Is 'lustre' the correct filesystem name?
Are the mount options correct?
Check the syslog for more info.
unconfigure:
- lnet: errno: -16 descr: "LNet unconfigure error: Device or resource busy"
[17996.736209] Lustre: DEBUG MARKER: == sanity-sec test 31: client mount option '-o network' ============================================== 19:30:04 (1558035004)
[17997.693592] Lustre: DEBUG MARKER: lctl get_param -n *.lustre*.exports.'10.9.5.215@tcp'.uuid 2>/dev/null | grep -q -
[17998.217952] Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure && /usr/sbin/lnetctl net add --if eth0 --net tcp999
[17998.557153] LNet: Added LNI 10.9.3.146@tcp999 [8/256/0/180]
[18000.237970] LustreError: 11-0: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.9.3.145@tcp failed: rc = -107
[18000.239925] LustreError: Skipped 9 previous similar messages
[18000.240888] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 10.9.3.145@tcp) was lost; in progress operations using this service will wait for recovery to complete
[18000.243842] Lustre: Skipped 18 previous similar messages
[18007.616779] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
[18007.922846] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
[18009.125483] Lustre: lustre-MDT0001: Not available for connect from 10.9.3.145@tcp (stopping)
[18009.127096] Lustre: Skipped 42 previous similar messages
[18011.260546] LustreError: 17495:0:(client.c:1183:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff98fcd0168d80 x1633699486628912/t0(0) o41->lustre-MDT0003-osp-MDT0001@0@lo:24/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[18011.264135] LustreError: 17495:0:(client.c:1183:ptlrpc_import_delay_req()) Skipped 2 previous similar messages
[18015.357668] Lustre: server umount lustre-MDT0001 complete
[18015.358716] Lustre: Skipped 1 previous similar message
[18016.092394] LustreError: 137-5: lustre-MDT0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_24v - Timeout occurred after 343 mins, last suite running was sanity-sec, restarting cluster to continue tests
Issue Links
- is blocking
  - LU-9667 LNet Kernel/Userspace Interface (Open)
- is related to
  - LU-12688 sanity-sec test 31 fails with 'unable to configure NID o2ib999' (Resolved)
  - LU-13028 LNet Discovery: toggling discovery on/off is not handled properly (Resolved)
- is related to
  - LU-15675 Interop sanity-sec test_27a: fileset not taken into account (Resolved)