[LU-4475] mount command errors: "Communicating with 0@lo, operation mds_connect failed with -11" AND "Transport endpoint is not connected" Created: 11/Jan/14 Updated: 09/Jun/16 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.2 |
| Fix Version/s: | None |
| Type: | Story | Priority: | Minor |
| Reporter: | Mark Duffield | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
|
||
| Issue Links: |
|
||||
| Rank (Obsolete): | 12257 | ||||
| Description |
|
I created the mgs/mdt:

mkfs.lustre --fsname=lfs1 --mgs --mdt --index=0 /dev/vg_root/es0-00

and the ost, on another node:

mkfs.lustre --fsname=lfs1 --mgsnode=172.18.54.21@tcp0 --ost --index=0 /dev/vg_root/es2-00

When mounting either I receive a comm error. When mounting the ost I see "Transport endpoint is not connected":

# mount -vvv -t lustre /dev/dm-3 /mnt/ost0
mount: fstab path: "/etc/fstab"
mount: mtab path: "/etc/mtab"
mount: lock path: "/etc/mtab~"
mount: temp path: "/etc/mtab.tmp"
mount: UID: 0
mount: eUID: 0
mount: spec: "/dev/mapper/vg_root-es2--00"
mount: node: "/mnt/ost0"
mount: types: "lustre"
mount: opts: "(null)"
final mount options: '(null)'
mount: external mount: argv[0] = "/sbin/mount.lustre"
mount: external mount: argv[1] = "/dev/mapper/vg_root-es2--00"
mount: external mount: argv[2] = "/mnt/ost0"
mount: external mount: argv[3] = "-v"
mount: external mount: argv[4] = "-o"
mount: external mount: argv[5] = "rw"
arg[0] = /sbin/mount.lustre
arg[1] = -v
arg[2] = -o
arg[3] = rw
arg[4] = /dev/mapper/vg_root-es2--00
arg[5] = /mnt/ost0
source = /dev/mapper/vg_root-es2--00 (/dev/mapper/vg_root-es2--00), target = /mnt/ost0
options = rw
checking for existing Lustre data: found
Reading CONFIGS/mountdata
mounting device /dev/mapper/vg_root-es2--00 at /mnt/ost0, flags=0x1000000 options=osd=osd-ldiskfs,errors=remount-ro,mgsnode=172.18.54.21@tcp,virgin,param=mgsnode=172.18.54.21@tcp,svname=lfs1-OST0000,device=/dev/mapper/vg_root-es2--00
mount.lustre: mount /dev/mapper/vg_root-es2--00 at /mnt/ost0 failed: Transport endpoint is not connected retries left: 0
mount.lustre: mount /dev/mapper/vg_root-es2--00 at /mnt/ost0 failed: Transport endpoint is not connected

And when mounting the mgs/mdt I see "Communicating with 0@lo, operation mds_connect failed with -11":

Jan 11 11:14:30 es0 kernel: LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:
Jan 11 11:14:30 es0 kernel: Lustre: lfs1-MDT0000: used disk, loading
Jan 11 11:14:30 es0 kernel: LustreError: 11-0: lfs1-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.

The communication looks fine between nodes:

From es0:

[root@es0 log]# lctl
lctl > ping es2
12345-0@lo
12345-172.18.54.23@tcp

From es2:

[root@es2 log]# lctl
lctl > ping es0
12345-0@lo
12345-172.18.54.21@tcp |
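For reference when reading the two messages: kernel messages report errors as negated Linux errno values, so "failed with -11" corresponds to errno 11 (EAGAIN), and "Transport endpoint is not connected" is the standard message text for errno 107 (ENOTCONN). The sketch below shows one way to pull the code out of such a log line and name it. The helper names `errno_name` and `extract_errno` are hypothetical and not part of Lustre, and the table covers only the two codes seen in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): map a Linux errno number to its
# symbolic name. Only the two codes that appear in this report are listed.
errno_name() {
    case "${1#-}" in            # accept "11" or "-11"
        11)  echo "EAGAIN" ;;   # "Resource temporarily unavailable"
        107) echo "ENOTCONN" ;; # "Transport endpoint is not connected"
        *)   echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper: extract N from a log line containing "failed with -N".
extract_errno() {
    printf '%s\n' "$1" | sed -n 's/.*failed with -\([0-9][0-9]*\).*/\1/p'
}

line='LustreError: 11-0: lfs1-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.'
code=$(extract_errno "$line")
echo "errno $code = $(errno_name "$code")"
```

The names only identify the error class; they do not by themselves explain why the mount fails.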
| Comments |
| Comment by Brian Behlendorf [ 05/Jun/14 ] |
|
I'm able to consistently reproduce this with 2.4.2 and just the llmount.sh script. I haven't had a chance yet to investigate further.

FSTYPE=zfs /usr/lib64/lustre/tests/llmount.sh

dmesg output

Lustre: Lustre: Build Version: 2.4.2-7behlendorf-7behlendorf-1-PRISTINE-2.6.32-431.17.1.el6.x86_64
LNet: Added LNI 192.168.2.117@tcp [8/256/0/180]
LNet: Accept secure, port 988
Lustre: Echo OBD driver; http://www.lustre.org/
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripecount in log lustre-MDT0000
Lustre: Skipped 1 previous similar message
Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
Lustre: lustre-MDT0000: Initializing new disk
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
SELinux: (dev lustre, type lustre) has no xattr support
Lustre: Failing over lustre-MDT0000
Lustre: server umount lustre-MDT0000 complete
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
Lustre: Skipped 2 previous similar messages
Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
Lustre: Skipped 1 previous similar message
Lustre: srv-lustre-MDT0000: No data found on store. Initialize space
Lustre: lustre-MDT0000: Initializing new disk
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
SELinux: (dev lustre, type lustre) has no xattr support
Lustre: Failing over lustre-MDT0000
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre' ' /proc/mounts); if [ $running -ne 0 ] ; then echo Stopping client $(hostname) /mnt/lustre opts:; lsof /mnt/lustre || need_kill=no; if [ x != x -a x$need_kill != xno ]; then pids=$(lsof -t /mnt/lustre | sort -u); if [ -n "$p
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre2' ' /proc/mounts); if [ $running -ne 0 ] ; then echo Stopping client $(hostname) /mnt/lustre2 opts:; lsof /mnt/lustre2 || need_kill=no; if [ x != x -a x$need_kill != xno ]; then pids=$(lsof -t /mnt/lustre2 | sort -u); if [ -n
Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || grep -q ^lustre-mdt1/ /proc/mounts || zpool export lustre-mdt1
Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost1 >/dev/null 2>&1 || grep -q ^lustre-ost1/ /proc/mounts || zpool export lustre-ost1
Lustre: DEBUG MARKER: grep -c /mnt/ost2' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost2 >/dev/null 2>&1 || grep -q ^lustre-ost2/ /proc/mounts || zpool export lustre-ost2
Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || grep -q ^lustre-mdt1/ /proc/mounts || zpool export lustre-mdt1
Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=zfs --device-size=200000 --reformat lustre-mdt1/mdt1 /tmp/lu
Lustre: DEBUG MARKER: zpool set cachefile=none lustre-mdt1
Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || grep -q ^lustre-mdt1/ /proc/mounts || zpool export lustre-mdt1
Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost1 >/dev/null 2>&1 || grep -q ^lustre-ost1/ /proc/mounts || zpool export lustre-ost1
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=ovirt-guest-241@tcp --fsname=lustre --ost --index=0 --param=sys.timeout=20 --backfstype=zfs --device-size=200000 --reformat lustre-ost1/ost1 /tmp/lustre-ost1
Lustre: DEBUG MARKER: zpool set cachefile=none lustre-ost1
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost1 >/dev/null 2>&1 || grep -q ^lustre-ost1/ /proc/mounts || zpool export lustre-ost1
Lustre: DEBUG MARKER: grep -c /mnt/ost2' ' /proc/mounts
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost2 >/dev/null 2>&1 || grep -q ^lustre-ost2/ /proc/mounts || zpool export lustre-ost2
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=ovirt-guest-241@tcp --fsname=lustre --ost --index=1 --param=sys.timeout=20 --backfstype=zfs --device-size=200000 --reformat lustre-ost2/ost2 /tmp/lustre-ost2
Lustre: DEBUG MARKER: zpool set cachefile=none lustre-ost2
Lustre: DEBUG MARKER: ! zpool list -H lustre-ost2 >/dev/null 2>&1 || grep -q ^lustre-ost2/ /proc/mounts || zpool export lustre-ost2
Lustre: DEBUG MARKER: running=$(grep -c /mnt/ost1' ' /proc/mounts); mpts=$(mount | grep -c /mnt/ost1' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: running=$(grep -c /mnt/ost2' ' /proc/mounts); mpts=$(mount | grep -c /mnt/ost2' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: running=$(grep -c /mnt/mds1' ' /proc/mounts); mpts=$(mount | grep -c /mnt/mds1' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: running=$(grep -c /mnt/mds1' ' /proc/mounts); mpts=$(mount | grep -c /mnt/mds1' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre' ' /proc/mounts); mpts=$(mount | grep -c /mnt/lustre' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre2' ' /proc/mounts); mpts=$(mount | grep -c /mnt/lustre2' '); if [ $running -ne $mpts ]; then echo $(hostname) env are INSANE!; exit 1; fi
Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
Lustre: DEBUG MARKER: zpool list -H lustre-mdt1 >/dev/null 2>&1 || zpool import -f -o cachefile=none -d /tmp lustre-mdt1
Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre lustre-mdt1/mdt1 /mnt/mds1
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
Lustre: Skipped 4 previous similar messages
Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
Lustre: lustre-MDT0000: Initializing new disk
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
SELinux: (dev lustre, type lustre) has no xattr support
Lustre: Failing over lustre-MDT0000
Lustre: server umount lustre-MDT0000 complete |
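Most of the dmesg capture above consists of DEBUG MARKER lines written by the test scripts, which makes the actual Lustre messages hard to spot. A minimal sketch of how one might filter such a capture and count the mds_connect failures follows. The function names are hypothetical, and the sample input is a four-line excerpt copied from the output above rather than a full log.

```shell
#!/bin/sh
# Hypothetical helpers for reading a dmesg capture from stdin.

# Drop the DEBUG MARKER lines emitted by the test framework.
strip_debug_markers() {
    grep -v 'Lustre: DEBUG MARKER:'
}

# Count lines that report a failed mds_connect operation.
count_mds_connect_failures() {
    grep -c 'operation mds_connect failed with'
}

# Short excerpt taken from the dmesg output in this comment.
sample='Lustre: lustre-MDT0000: Initializing new disk
Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
Lustre: Failing over lustre-MDT0000'

printf '%s\n' "$sample" | strip_debug_markers
printf '%s\n' "$sample" | count_mds_connect_failures
```

On a live system the same functions could be fed from `dmesg`, for example `dmesg | strip_debug_markers`.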