Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
Lustre 2.14.0
-
None
-
3
-
9223372036854775807
Description
Filing this as a bug, but it might be user-error. The problem I'm seeing is that clients are only attempting to connect to one of two available NIDs on the servers. Both clients and servers are configured with two NIDs, one on @tcp and on one @tcp1. Discovery/multi-rail is disabled. If I simulate failure of the @tcp NID on the OSS then I would expect the client to eventually try the @tcp1 NID, but that never happens. I tried modifying test-framework.sh so that it formats the OST with --servicenode=192.168.2.34@tcp,192.168.2.35@tcp1, but that didn't make any difference. Is this working as expected? Is this a bug? Am I missing some necessary config to allow client to use either NID?
Edit 1: There was a suggestion to put the interfaces on separate subnets. I tried that and it did not resolve the issue. See this comment.
Edit 2: Attached -1 debug log from the client client.dklog
. Note this debug log was captured after I changed the configuration to put the interfaces on separate networks as noted in this comment.
Details on how to reproduce follow.
Version under test:
sles15s01:/home/hornc/fs2 # ./LUSTRE-VERSION-GEN 2.13.57_71_gb538826 sles15s01:/home/hornc/fs2 #
LNet configuration is tcp(eth0) and tcp1(eth1) with LNet peer discovery disabled:
sles15s01:/home/hornc/fs2/lustre/tests # cat /etc/modprobe.d/lustre.conf options libcfs libcfs_debug=320735104 options libcfs libcfs_subsystem_debug=-2049 options lnet lnet_peer_discovery_disabled=1 options lnet ip2nets="tcp(eth0) 192.168.2.[30,32,34,36,38,39,40,41]; tcp1(eth1) 192.168.2.[31,33,35,37,42,43,44,45]" sles15s01:/home/hornc/fs2/lustre/tests #
Test-framework config:
sles15s01:/home/hornc/fs2/lustre/tests # cat cfg/hornc.sh # facet hosts MDSCOUNT=1 mds_HOST=sles15s01 MDSDEV1=/dev/sdc OSTCOUNT=1 ost_HOST=sles15s03 OSTDEV1=/dev/sde CLIENTCOUNT=1 RCLIENTS="sles15c01" PDSH="pdsh -S -Rssh -w" SHARED_DIRECTORY="/shared/testing" MGSNID="192.168.2.30@tcp,192.168.2.31@tcp1" . /home/hornc/fs2/lustre/tests/cfg/ncli.sh sles15s01:/home/hornc/fs2/lustre/tests #
Use llmount.sh to stand up filesystem:
sles15s01:/home/hornc/fs2/lustre/tests # NAME=hornc LOAD_MODULES_REMOTE=true VERBOSE=true /home/hornc/fs2/lustre/tests/llmount.sh
...
sles15s01:/home/hornc/fs2/lustre/tests # pdsh -w sles15s0[1,3],sles15c01 lctl list_nids | dshbak -c
----------------
sles15s01
----------------
192.168.2.30@tcp
192.168.2.31@tcp1
----------------
sles15c01
----------------
192.168.2.38@tcp
192.168.2.42@tcp1
----------------
sles15s03
----------------
192.168.2.34@tcp
192.168.2.35@tcp1
sles15s01:/home/hornc/fs2 # pdsh -w sles15s0[1,3],sles15c01 'lnetctl global show' | dshbak -c
----------------
sles15c01,sles15s[01,03]
----------------
global:
numa_range: 0
max_intf: 200
discovery: 0
drop_asym_route: 0
retry_count: 2
transaction_timeout: 50
health_sensitivity: 100
recovery_interval: 1
router_sensitivity: 100
lnd_timeout: 16
response_tracking: 3
sles15s01:/home/hornc/fs2 #
Unmount lustre from the client then re-mount with a drop rule in place, so that any traffic the client sends to eth0 on the OSS (a.k.a. 192.168.2.34@tcp) is dropped:
sles15c01:~ # umount /mnt/lustre sles15c01:~ # lctl net down; lustre_rmmod LNET busy sles15c01:~ # tf_start.sh Loading modules from /home/hornc/fs2/lustre detected 1 online CPUs by sysfs libcfs will create CPU partition based on online CPUs ../libcfs/libcfs/libcfs options: 'libcfs_debug=320735104 libcfs_subsystem_debug=-2049' ../lnet/lnet/lnet options: 'lnet_peer_discovery_disabled=1 ip2nets="tcp(eth0) 192.168.2.[30,32,34,36,38,39,40,41]; tcp1(eth1) 192.168.2.[31,33,35,37,42,43,44,45]" accept=all' quota/lquota options: 'hash_lqs_cur_bits=3' sles15c01:~ # lctl set_param debug=+'net rpctrace' debug=+net rpctrace sles15c01:~ # lctl get_param debug debug= super ioctl neterror net warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck sles15c01:~ # lctl net_drop_add -s *@tcp -d 192.168.2.34@tcp -r 1 -e remote_timeout Added drop rule 255.255.255.255@tcp->192.168.2.34@tcp (1/1) sles15c01:~ # mount -t lustre -o user_xattr,flock 192.168.2.30@tcp,192.168.2.31@tcp1:/lustre /mnt/lustre sles15c01:~ # lfs check servers lfs check: error: check 'lustre-OST0000-osc-ffff9c00a8b70000': Resource temporarily unavailable (11) lustre-MDT0000-mdc-ffff9c00a8b70000 active. sles15c01:~ #
Wait a couple minutes and dump the log:
sles15c01:~ # sleep 120; lfs check servers; lctl dk > /tmp/dk.log lfs check: error: check 'lustre-OST0000-osc-ffff9c00a8b70000': Resource temporarily unavailable (11) lustre-MDT0000-mdc-ffff9c00a8b70000 active. sles15c01:~ #
With net trace enabled, LNet logs every send that it performs. Those entries look like this:
00000400:00000200:0.0:1611862288.864372:0:11737:0:(lib-move.c:1833:lnet_handle_send()) TRACE: 192.168.2.38@tcp(192.168.2.38@tcp:<?>) -> 192.168.2.34@tcp(192.168.2.34@tcp:192.168.2.34@tcp) <?> : GET try# 0
So if LNet sent any message to an @tcp1 NID then we should have record of it in the log.
We can see in the debug log that we haven't sent any messages to any tcp1 NIDs:
sles15c01:~ # grep tcp1 /tmp/dk.log 00000400:02000000:0.0:1611862138.860810:0:11727:0:(api-ni.c:2336:lnet_startup_lndni()) Added LNI 192.168.2.42@tcp1 [8/256/0/180] 00000020:01000004:0.0:1611862159.890388:0:12205:0:(obd_mount.c:968:lmd_print()) device: 192.168.2.30@tcp,192.168.2.31@tcp1:/lustre 00000020:00000080:0.0:1611862159.890446:0:12205:0:(obd_config.c:1383:class_process_config()) adding mapping from uuid MGC192.168.2.30@tcp_0 to nid 0x20001c0a8021f (192.168.2.31@tcp1) 00000020:00000080:0.0:1611862159.900401:0:12218:0:(obd_config.c:1383:class_process_config()) adding mapping from uuid 192.168.2.30@tcp to nid 0x20001c0a8021f (192.168.2.31@tcp1) 00000020:01000004:0.0:1611862159.903289:0:12218:0:(obd_mount.c:1004:lustre_check_exclusion()) Check exclusion lustre-OST0000 (0) in 0 of 192.168.2.30@tcp,192.168.2.31@tcp1:/lustre 00000020:00000080:0.0:1611862159.903298:0:12218:0:(obd_config.c:1383:class_process_config()) adding mapping from uuid 192.168.2.34@tcp to nid 0x20001c0a80223 (192.168.2.35@tcp1) 00000020:00000004:0.0:1611862160.910881:0:12205:0:(obd_mount.c:1683:lustre_fill_super()) Mount 192.168.2.30@tcp,192.168.2.31@tcp1:/lustre complete sles15c01:~ # grep 192.168.2.35 /tmp/dk.log 00000020:00000080:0.0:1611862159.903298:0:12218:0:(obd_config.c:1383:class_process_config()) adding mapping from uuid 192.168.2.34@tcp to nid 0x20001c0a80223 (192.168.2.35@tcp1) sles15c01:~ #
Statistics on the OSS confirm no traffic on tcp1:
sles15s03:~ # lnetctl net show --net tcp1 -v 2 | egrep -e send_count -e recv_count
send_count: 0
recv_count: 0
sles15s03:~ #