[LU-14454] LNET routers added - then access issues with Lustre storage Created: 19/Feb/21  Updated: 05/May/21

Status: Reopened
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.3, Lustre 2.12.4
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Michael Ethier (Inactive) Assignee: Serguei Smirnov
Resolution: Unresolved Votes: 0
Labels: None
Environment:

All CentOS 7.x. Hardware is either Dell or Lenovo. The IB infrastructure is EDR IB with an MSB7800 switch. MLNX OFED is 4.7-1.0.0.1 on the lnet routers.


Attachments: Text File log.txt     Text File log1.txt    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I built 2 new LNET routers and added them to our LNET environment. The OS/LNET/MLNX OFED software versions are exactly the same as on the 2 other existing lnet routers in this location. I added lnet routes on the 2 Lustre filesystems we have in this physical location to point to the 2 new lnet routers. I tested one client in another data center we have by adding the 2 lnet routes on the client to point to the new lnet routers. That client could read and write fine. The next day we were having issues from various clients with access to the 2 Lustre FS I had set LNET routes on previously. We ended up removing all the lnet routes to the 2 new lnet routers on the Lustre filesystems and things started working again. So we ended up removing the 2 new lnet routers from our LNET environment.

The LNET routers are running LNet 2.12.4; the Lustre filesystems are running Lustre 2.12.3 and a very old version.

We have not experienced this before and were wondering if there is a specific procedure we have to follow to add new lnet routers in our environment?
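
For reference, the route changes involved were along these lines (a sketch only, assuming lnetctl-based route management; routes can equivalently be set with the routes= lnet module option shown later in this ticket, and the NIDs are examples from this environment):

  lnetctl route add --net tcp1 --gateway 10.242.46.227@o2ib1    # on the Lustre servers, one per new router
  lnetctl route add --net o2ib1 --gateway 10.242.62.213@tcp1    # on the test client, one per new router
  lnetctl route del --net tcp1 --gateway 10.242.46.227@o2ib1    # backing the change out, as we ended up doing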

The messages we were seeing on the Lustre FS were, for example:
Feb 19 09:03:17 boslfs02mds01 kernel: LNetError: 6413:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 9 previous similar messages
Feb 19 09:14:32 boslfs02mds01 kernel: LNetError: 6413:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.242.46.216@o2ib1 added to recovery queue. Health = 900
Feb 19 09:14:32 boslfs02mds01 kernel: LNetError: 6413:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 9 previous similar messages
Feb 19 09:25:47 boslfs02mds01 kernel: LNetError: 6413:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.242.46.217@o2ib1 added to recovery queue. Health = 900

We were getting messages like the above for all 4 of the lnet routers, both the existing and the 2 new ones that were added.

Also, the hardware configuration of the 2 new LNET routers is different. They have a dual-port ConnectX-4 card running in Ethernet mode at 10G, with the 2 ports LACP bonded, plus a ConnectX-5 card for IB at rate 100. The older LNET routers have a ConnectX-4 IB card at IB rate 100 and a traditional Ethernet card with two 10G ports that are LACP bonded. Not sure if this matters, but I wanted to mention it.



 Comments   
Comment by Michael Ethier (Inactive) [ 19/Feb/21 ]

Actually, those recoveryq messages I mentioned above may not be related to the access problem I described. I can see those messages in /var/log/messages on one of the Lustre FS at much earlier times - weeks ago - before we experienced the access issue.

Comment by Peter Jones [ 19/Feb/21 ]

Cyril

Could you please assist with this one?

Thanks

Peter

Comment by Michael Ethier (Inactive) [ 24/Feb/21 ]

Hello,
Anything else you need?
Thanks,
Mike

Comment by Cyril Bordage [ 24/Feb/21 ]

Hello Michael,

Sorry for the late answer; I had to take unexpected leave.

Could you provide the outputs of the following commands from the servers, the routers and several clients?

lnetctl peer show -v 4
lnetctl net show -v 4
lnetctl global show
lnetctl route show

 

Thank you.

Cyril.

Comment by Michael Ethier (Inactive) [ 25/Feb/21 ]

Hi Cyril, no worries. Those 2 lnet routers with issues do not have their LNet active, so I can't run the commands on them. If we add routes to the Lustre storage and to some clients, that's when we run into issues with accessing the Lustre storage afterwards.

[root@boslnet03 ~]# lnetctl peer show -v 4
^C
[root@boslnet03 ~]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp1
      local NI(s):
        - nid: 10.242.62.213@tcp1
          status: down
          interfaces:
              0: bond0
    - net type: o2ib1
      local NI(s):
        - nid: 10.242.46.227@o2ib1
          status: down
          interfaces:
              0: ib0

Thanks,
Mike

Comment by Cyril Bordage [ 26/Feb/21 ]

Hello Michael,

I do not understand your "those 2 lnet routers with issues do not have their lnet active so I can't run the commands." Do you mean you cannot risk messing up your working configuration by enabling them?

It will be difficult to diagnose with so little information… To see what is going on I need all of the requested details from the exact commands I provided. If you cannot risk disturbing your production environment, could you provide the commands you used to configure everything, with the details of your network (NIDs of the servers, the routers, the clients), and all available information for the servers and clients?

Thank you.

Cyril.

 

Comment by Michael Ethier (Inactive) [ 04/Mar/21 ]

Hi Cyril,
Sorry for the delay. Yes, when we added the 2 new lnet routers to some of our Lustre storage via lnet routes, and set a few clients up with routes to access the storage via those lnet routers, we had problems accessing the storage from other clients. I will gather the details you have asked for and reply back.
Thanks,
Mike

Comment by Michael Ethier (Inactive) [ 11/Mar/21 ]

Hi Cyril,

The commands you requested, on a working lnet router:
lnetctl peer show -v 4 pegs the CPU at 100% and never returns on an lnet router.

[root@boslnet01 ~]# lnetctl net show -v 4
net:

  • net type: lo
    local NI(s):
  • nid: 0@lo
    status: up
    statistics:
    send_count: 0
    recv_count: 0
    drop_count: 0
    sent_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    received_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    dropped_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    health stats:
    health value: 0
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 0
    peer_credits: 0
    peer_buffer_credits: 0
    credits: 0
    dev cpt: 0
    tcp bonding: 0
    CPT: "[0,1]"
  • net type: tcp1
    local NI(s):
  • nid: 10.242.62.227@tcp1
    status: up
    interfaces:
    0: bond0
    statistics:
    send_count: 1263198586
    recv_count: 446196203
    drop_count: 827
    sent_stats:
    put: 1211118273
    get: 52080311
    reply: 2
    ack: 0
    hello: 0
    received_stats:
    put: 380078000
    get: 48000898
    reply: 3709225
    ack: 14408080
    hello: 0
    dropped_stats:
    put: 707
    get: 117
    reply: 3
    ack: 0
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 454
    error: 5397
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    dev cpt: -1
    tcp bonding: 0
    CPT: "[0,1]"
  • net type: tcp2
    local NI(s):
  • nid: 10.242.107.16@tcp2
    status: up
    interfaces:
    0: p3p2
    statistics:
    send_count: 61007347
    recv_count: 60503869
    drop_count: 5
    sent_stats:
    put: 56550435
    get: 4456912
    reply: 0
    ack: 0
    hello: 0
    received_stats:
    put: 12820596
    get: 52672
    reply: 4359334
    ack: 43271267
    hello: 0
    dropped_stats:
    put: 4
    get: 0
    reply: 1
    ack: 0
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    dev cpt: 0
    tcp bonding: 0
    CPT: "[0,1]"
  • net type: o2ib1
    local NI(s):
  • nid: 10.242.46.216@o2ib1
    status: up
    interfaces:
    0: ib0
    statistics:
    send_count: 459118144
    recv_count: 1276624005
    drop_count: 457262
    sent_stats:
    put: 392898596
    get: 471643
    reply: 8068558
    ack: 57679347
    hello: 0
    received_stats:
    put: 1267668708
    get: 8955285
    reply: 12
    ack: 0
    hello: 0
    dropped_stats:
    put: 457250
    get: 0
    reply: 12
    ack: 0
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 62
    aborted: 0
    no route: 0
    timeouts: 350
    error: 0
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    peercredits_hiw: 4
    map_on_demand: 0
    concurrent_sends: 8
    fmr_pool_size: 512
    fmr_flush_trigger: 384
    fmr_cache: 1
    ntx: 512
    conns_per_peer: 1
    lnd tunables:
    dev cpt: 1
    tcp bonding: 0
    CPT: "[0,1]"
[root@boslnet01 ~]# lnetctl global show
global:
    numa_range: 0
    max_intf: 200
    discovery: 0
    drop_asym_route: 0
    retry_count: 0
    transaction_timeout: 50
    health_sensitivity: 0
    recovery_interval: 1
[root@boslnet01 ~]# lnetctl route show
[root@boslnet01 ~]#

The same commands on a non-working lnet router:
[root@boslnet03 ~]# lnetctl net show -v 4
net:

  • net type: lo
    local NI(s):
  • nid: 0@lo
    status: up
    statistics:
    send_count: 0
    recv_count: 0
    drop_count: 0
    sent_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    received_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    dropped_stats:
    put: 0
    get: 0
    reply: 0
    ack: 0
    hello: 0
    health stats:
    health value: 0
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 0
    peer_credits: 0
    peer_buffer_credits: 0
    credits: 0
    dev cpt: 0
    tcp bonding: 0
    CPT: "[0]"
  • net type: tcp1
    local NI(s):
  • nid: 10.242.62.213@tcp1
    status: down
    interfaces:
    0: bond0
    statistics:
    send_count: 9937720
    recv_count: 13722
    drop_count: 276772
    sent_stats:
    put: 9913812
    get: 23902
    reply: 6
    ack: 0
    hello: 0
    received_stats:
    put: 7145
    get: 1445
    reply: 5121
    ack: 11
    hello: 0
    dropped_stats:
    put: 276766
    get: 0
    reply: 6
    ack: 0
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 738
    error: 3
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    dev cpt: -1
    tcp bonding: 0
    CPT: "[0]"
  • net type: o2ib1
    local NI(s):
  • nid: 10.242.46.227@o2ib1
    status: down
    interfaces:
    0: ib0
    statistics:
    send_count: 217189
    recv_count: 10141187
    drop_count: 10
    sent_stats:
    put: 7145
    get: 204912
    reply: 5121
    ack: 11
    hello: 0
    received_stats:
    put: 9913812
    get: 227369
    reply: 6
    ack: 0
    hello: 0
    dropped_stats:
    put: 10
    get: 0
    reply: 0
    ack: 0
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    peercredits_hiw: 4
    map_on_demand: 0
    concurrent_sends: 8
    fmr_pool_size: 512
    fmr_flush_trigger: 384
    fmr_cache: 1
    ntx: 512
    conns_per_peer: 1
    lnd tunables:
    dev cpt: 0
    tcp bonding: 0
    CPT: "[0]"
[root@boslnet03 ~]# lnetctl global show
global:
    numa_range: 0
    max_intf: 200
    discovery: 0
    drop_asym_route: 0
    retry_count: 0
    transaction_timeout: 50
    health_sensitivity: 0
    recovery_interval: 1
[root@boslnet03 ~]# lnetctl route show
[root@boslnet03 ~]#

The same commands on a Lustre storage node (MDS):
[root@boslfs02mds01 ~]# lnetctl net show -v 4
net:

  • net type: lo
    local NI(s):
  • nid: 0@lo
    status: up
    statistics:
    send_count: 2888768
    recv_count: 2888764
    drop_count: 4
    sent_stats:
    put: 2888768
    get: 0
    reply: 0
    ack: 0
    hello: 0
    received_stats:
    put: 2190548
    get: 0
    reply: 0
    ack: 698216
    hello: 0
    dropped_stats:
    put: 4
    get: 0
    reply: 0
    ack: 0
    hello: 0
    health stats:
    health value: 0
    interrupts: 0
    dropped: 0
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 0
    peer_credits: 0
    peer_buffer_credits: 0
    credits: 0
    dev cpt: 0
    tcp bonding: 0
    CPT: "[0,1,2,3]"
  • net type: o2ib1
    local NI(s):
  • nid: 10.242.46.233@o2ib1
    status: up
    interfaces:
    0: ib0
    statistics:
    send_count: 120142564
    recv_count: 120852294
    drop_count: 22158
    sent_stats:
    put: 120091641
    get: 50923
    reply: 0
    ack: 0
    hello: 0
    received_stats:
    put: 119667645
    get: 13605
    reply: 37318
    ack: 1133726
    hello: 0
    dropped_stats:
    put: 18448
    get: 0
    reply: 0
    ack: 3710
    hello: 0
    health stats:
    health value: 1000
    interrupts: 0
    dropped: 1804948
    aborted: 0
    no route: 0
    timeouts: 0
    error: 0
    tunables:
    peer_timeout: 180
    peer_credits: 8
    peer_buffer_credits: 0
    credits: 256
    peercredits_hiw: 4
    map_on_demand: 0
    concurrent_sends: 8
    fmr_pool_size: 512
    fmr_flush_trigger: 384
    fmr_cache: 1
    ntx: 512
    conns_per_peer: 1
    lnd tunables:
    dev cpt: 3
    tcp bonding: 0
    CPT: "[0,1,2,3]"
[root@boslfs02mds01 ~]# lnetctl global show
global:
    numa_range: 0
    max_intf: 200
    discovery: 0
    drop_asym_route: 0
    retry_count: 0
    transaction_timeout: 50
    health_sensitivity: 100
    recovery_interval: 1
[root@boslfs02mds01 ~]# lnetctl route show
route:
    - net: tcp1
      gateway: 10.242.46.216@o2ib1
    - net: tcp1
      gateway: 10.242.46.217@o2ib1
    - net: tcp2
      gateway: 10.242.46.217@o2ib1
    - net: tcp2
      gateway: 10.242.46.216@o2ib1

Comment by Michael Ethier (Inactive) [ 11/Mar/21 ]

Details of an existing, working lnet router:
[root@boslnet01 ~]# dkms status
lustre-client, 2.12.4, 3.10.0-1062.18.1.el7.x86_64, x86_64: installed
[root@boslnet01 ~]# more /etc/modprobe.d/lustre.conf
options lnet networks="tcp1(bond0), tcp2(p3p2), o2ib1(ib0)"
options lnet forwarding="enabled"
options lnet lnet_peer_discovery_disabled=1
options lnet lnet_health_sensitivity=0
[root@boslnet01 ~]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp1
      local NI(s):
        - nid: 10.242.62.227@tcp1
          status: up
          interfaces:
              0: bond0
    - net type: tcp2
      local NI(s):
        - nid: 10.242.107.16@tcp2
          status: up
          interfaces:
              0: p3p2
    - net type: o2ib1
      local NI(s):
        - nid: 10.242.46.216@o2ib1
          status: up
          interfaces:
              0: ib0
[root@boslnet01 ~]# more /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[root@boslnet01 ~]# lspci |grep -i mell
81:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
[root@boslnet01 ~]# ibstat
CA 'mlx5_0'
    CA type: MT4115
    Number of ports: 1
    Firmware version: 12.17.2052
    Hardware version: 0
    Node GUID: 0x98039b0300bec2c8
    System image GUID: 0x98039b0300bec2c8
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 44
        LMC: 0
        SM lid: 1
        Capability mask: 0x2651e848
        Port GUID: 0x98039b0300bec2c8
        Link layer: InfiniBand
[root@boslnet01 ~]# ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.242.62.227 netmask 255.255.255.0 broadcast 10.242.62.255
inet6 fe80::a236:9fff:fe8d:9774 prefixlen 64 scopeid 0x20<link>
ether a0:36:9f:8d:97:74 txqueuelen 1000 (Ethernet)
RX packets 5311208385 bytes 4185414327252 (3.8 TiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11038342648 bytes 14227372363039 (12.9 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 10.242.46.216 netmask 255.255.252.0 broadcast 10.242.47.255
inet6 fe80::9a03:9b03:be:c2c8 prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 20:00:00:68:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 29806 bytes 4627169 (4.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 837 bytes 50408 (49.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 743598 bytes 71604808 (68.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 743598 bytes 71604808 (68.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p1p1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether a0:36:9f:8d:97:74 txqueuelen 1000 (Ethernet)
RX packets 2983098310 bytes 2267051063818 (2.0 TiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6235321147 bytes 8498451034677 (7.7 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p1p2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether a0:36:9f:8d:97:74 txqueuelen 1000 (Ethernet)
RX packets 2328110076 bytes 1918363263500 (1.7 TiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4803021510 bytes 5728921330120 (5.2 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 3c:fd:fe:16:01:88 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p3p2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.242.107.16 netmask 255.255.255.192 broadcast 10.242.107.63
inet6 fe80::3efd:feff:fe16:18a prefixlen 64 scopeid 0x20<link>
ether 3c:fd:fe:16:01:8a txqueuelen 1000 (Ethernet)
RX packets 3292939462 bytes 4737754838918 (4.3 TiB)
RX errors 0 dropped 667 overruns 0 frame 0
TX packets 31792664489 bytes 47939135623196 (43.6 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Details of one of our lustre storage servers:

[root@boslfs02mds01 ~]# rpm -qa |grep lustre
kernel-3.10.0-1062.1.1.el7_lustre.x86_64
kmod-lustre-2.12.3-1.el7.x86_64
kmod-lustre-osd-ldiskfs-2.12.3-1.el7.x86_64
kernel-devel-3.10.0-1062.1.1.el7_lustre.x86_64
lustre-osd-zfs-mount-2.12.3-1.el7.x86_64
lustre-2.12.3-1.el7.x86_64
lustre-zfs-dkms-2.12.3-1.el7.noarch
lustre-resource-agents-2.12.3-1.el7.x86_64
lustre-ldiskfs-zfs-5.0.0-1.el7.x86_64
lustre-osd-ldiskfs-mount-2.12.3-1.el7.x86_64
kmod-spl-3.10.0-1062.1.1.el7_lustre.x86_64-0.7.13-1.el7.x86_64
[root@boslfs02mds01 ~]# more /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[root@boslfs02mds01 ~]# more /etc/modprobe.d/lustre.conf
options lnet networks="o2ib1(ib0)" routes="tcp1 10.242.46.[216-217]@o2ib1; tcp2 10.242.46.[216-217]@o2ib1"
options lnet lnet_transaction_timeout=50 lnet_retry_count=0 lnet_peer_discovery_disabled=1
[root@boslfs02mds01 ~]# lnetctl route show
route:
    - net: tcp1
      gateway: 10.242.46.216@o2ib1
    - net: tcp1
      gateway: 10.242.46.217@o2ib1
    - net: tcp2
      gateway: 10.242.46.217@o2ib1
    - net: tcp2
      gateway: 10.242.46.216@o2ib1
[root@boslfs02mds01 ~]# lspci |grep -i mell
d8:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
[root@boslfs02mds01 ~]# lspci |grep -i eth
01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
19:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
19:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
[root@boslfs02mds01 ~]# ibstat
CA 'mlx5_0'
    CA type: MT4119
    Number of ports: 1
    Firmware version: 16.25.1020
    Hardware version: 0
    Node GUID: 0xb8599f030005a4ec
    System image GUID: 0xb8599f030005a4ec
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 20
        LMC: 0
        SM lid: 1
        Capability mask: 0x2651e848
        Port GUID: 0xb8599f030005a4ec
        Link layer: InfiniBand
[root@boslfs02mds01 ~]# ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.242.62.139 netmask 255.255.255.0 broadcast 10.242.62.255
inet6 fe80::e643:4bff:fe21:6750 prefixlen 64 scopeid 0x20<link>
ether e4:43:4b:21:67:50 txqueuelen 1000 (Ethernet)
RX packets 244991196 bytes 40288519866 (37.5 GiB)
RX errors 0 dropped 689357 overruns 0 frame 0
TX packets 243506307 bytes 25816873448 (24.0 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

em1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether e4:43:4b:21:67:50 txqueuelen 1000 (Ethernet)
RX packets 124720954 bytes 20746393497 (19.3 GiB)
RX errors 0 dropped 19702 overruns 0 frame 0
TX packets 121422150 bytes 12632871392 (11.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

em2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether e4:43:4b:21:67:50 txqueuelen 1000 (Ethernet)
RX packets 120270244 bytes 19542126677 (18.2 GiB)
RX errors 0 dropped 315129 overruns 0 frame 0
TX packets 122084164 bytes 13184003240 (12.2 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

em3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.139 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::e643:4bff:fe21:6754 prefixlen 64 scopeid 0x20<link>
ether e4:43:4b:21:67:54 txqueuelen 1000 (Ethernet)
RX packets 228132 bytes 31718009 (30.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 328790 bytes 52526191 (50.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0x92980000-929fffff

em4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.242.112.176 netmask 255.255.255.128 broadcast 10.242.112.255
inet6 fe80::e643:4bff:fe21:6755 prefixlen 64 scopeid 0x20<link>
ether e4:43:4b:21:67:55 txqueuelen 1000 (Ethernet)
RX packets 26539 bytes 1600620 (1.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 210 bytes 20138 (19.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0x92900000-9297ffff

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 10.242.46.233 netmask 255.255.252.0 broadcast 10.242.47.255
inet6 fe80::ba59:9f03:5:a4ec prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 20:00:07:EB:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 10537 bytes 1677139 (1.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 14 bytes 984 (984.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

idrac: flags=67<UP,BROADCAST,RUNNING> mtu 1500
inet6 fe80::4ed9:8fff:fe7c:c104 prefixlen 64 scopeid 0x20<link>
ether 4c:d9:8f:7c:c1:04 txqueuelen 1000 (Ethernet)
RX packets 690330 bytes 56738080 (54.1 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 808171 bytes 76845763 (73.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 20976 bytes 4281290 (4.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 20976 bytes 4281290 (4.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Details of an lnet router that doesn't work:

[root@boslnet03 ~]# dkms status
lustre-client, 2.12.4, 3.10.0-1062.18.1.el7.x86_64, x86_64: installed
[root@boslnet03 ~]# more /etc/modprobe.d/lustre.conf
options lnet networks="tcp1(bond0), o2ib1(ib0)"
options lnet forwarding="enabled"
options lnet lnet_peer_discovery_disabled=1
options lnet lnet_health_sensitivity=0
[root@boslnet03 ~]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp1
      local NI(s):
        - nid: 10.242.62.213@tcp1
          status: down
          interfaces:
              0: bond0
    - net type: o2ib1
      local NI(s):
        - nid: 10.242.46.227@o2ib1
          status: down
          interfaces:
              0: ib0
[root@boslnet03 ~]# more /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[root@boslnet03 ~]# lspci |grep -i mell
08:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
08:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
5b:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
[root@boslfs02mds01 ~]# lspci |grep -i eth
01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
19:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
19:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
[root@boslnet03 ~]# ibstat
CA 'mlx5_2'
    CA type: MT4119
    Number of ports: 1
    Firmware version: 16.29.1016
    Hardware version: 0
    Node GUID: 0x98039b0300b4146e
    System image GUID: 0x98039b0300b4146e
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 26
        LMC: 0
        SM lid: 1
        Capability mask: 0x2651e848
        Port GUID: 0x98039b0300b4146e
        Link layer: InfiniBand
CA 'mlx5_bond_0'
    CA type: MT4117
    Number of ports: 1
    Firmware version: 14.29.1016
    Hardware version: 0
    Node GUID: 0x98039b0300cbcb34
    System image GUID: 0x98039b0300cbcb34
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x00010000
        Port GUID: 0x9a039bfffecbcb34
        Link layer: Ethernet
[root@boslnet03 ~]# ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.242.62.213 netmask 255.255.255.0 broadcast 10.242.62.255
inet6 fe80::9a03:9bff:fecb:cb34 prefixlen 64 scopeid 0x20<link>
ether 98:03:9b:cb:cb:34 txqueuelen 1000 (Ethernet)
RX packets 493962473 bytes 39340657521 (36.6 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 515219003 bytes 85671938800 (79.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp0s20f0u1u6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 169.254.95.120 netmask 255.255.255.0 broadcast 169.254.95.255
inet6 fe80::3868:ddff:fe0c:29f7 prefixlen 64 scopeid 0x20<link>
ether 3a:68:dd:0c:29:f7 txqueuelen 1000 (Ethernet)
RX packets 2505736 bytes 240636075 (229.4 MiB)
RX errors 1 dropped 0 overruns 0 frame 0
TX packets 2501477 bytes 274965254 (262.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 10.242.46.227 netmask 255.255.252.0 broadcast 10.242.47.255
inet6 fe80::9a03:9b03:b4:146e prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 20:00:10:29:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 39137 bytes 6016144 (5.7 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 50 bytes 3144 (3.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 260 bytes 21314 (20.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 260 bytes 21314 (20.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p1p1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 98:03:9b:cb:cb:34 txqueuelen 1000 (Ethernet)
RX packets 276852635 bytes 24386554325 (22.7 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 295567994 bytes 70653344774 (65.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

p1p2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 98:03:9b:cb:cb:34 txqueuelen 1000 (Ethernet)
RX packets 217109839 bytes 14954103262 (13.9 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 219651019 bytes 15018595778 (13.9 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Comment by Serguei Smirnov [ 18/Mar/21 ]

Michael,

Would it be possible to set up a live debugging session so that we can go through the procedure of adding a new router together? 

I haven't seen anything wrong in the logs you provided, but the errors mentioned in the description could be explained by mis-configuration.

The procedure should be roughly as follows (a command-level sketch appears after the list):

1) Set up the router node: configure LNet, start LNet, verify by dumping "lnetctl net show", and by pinging with "lnetctl ping" to and from peers on both nets (tcp1, o2ib1) multiple times.

2) Add the route on the server and on the client, listing the new router as the gateway to use to reach the respective nets.

3) Verify by pinging with "lnetctl ping" multiple times across the new router (a tcp1 client to the server and back).

4) Verify by pinging with "lnetctl ping" multiple times across the new router using a client on tcp2 (from the logs provided, the new router handles only tcp1, so let's verify that tcp2 is still good).

5) If all looks good so far, try mounting.
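
In command terms, a minimal sketch of steps 1-5 (assuming the NIDs used elsewhere in this ticket; <client-nid>, <mgs-nid> and <fsname> are placeholders, and the routers get their networks and forwarding settings from the lustre.conf module options shown above):

# 1) On the new router: bring LNet up and verify both interfaces
modprobe lnet
lnetctl lnet configure --all          # loads the nets defined via the lnet module options
lnetctl net show                      # tcp1 and o2ib1 NIs should both report "status: up"
lnetctl ping 10.242.46.233@o2ib1      # a server-side peer
lnetctl ping <client-nid>@tcp1        # a client-side peer (repeat several times)

# 2) Add the new router as a gateway
lnetctl route add --net tcp1 --gateway 10.242.46.227@o2ib1    # on the server
lnetctl route add --net o2ib1 --gateway 10.242.62.213@tcp1    # on the client

# 3) and 4) Verify across the new router, and check that tcp2 clients still work
lnetctl ping 10.242.46.233@o2ib1      # from the tcp1 client
lnetctl ping <client-nid>@tcp1        # from the server

# 5) If all looks good, try mounting
mount -t lustre <mgs-nid>@o2ib1:/<fsname> /mnt/lustre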

Comment by Michael Ethier (Inactive) [ 23/Mar/21 ]

Hi, I'm attaching 2 log files per Serguei's request. Thanks, Mike. log.txt log1.txt

Comment by Serguei Smirnov [ 01/Apr/21 ]

During today's call we found out that the new router's IP may need to be added to the access control list for a group of clients. Regular ping from the client to the router was going through, but lnetctl ping was not. Because lnetctl ping was part of the procedure we used earlier, the failed lnetctl ping we're seeing now may not explain the behaviour we were seeing before. We'll proceed once the ACL issue is out of the way.
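
A quick way to tell the two cases apart from a client (a sketch; the NID is one of the new routers from this ticket, and LNet over TCP typically uses port 988, which an ACL can block while ICMP still passes):

ping 10.242.62.213                  # plain ICMP ping of the router's tcp1 address
lnetctl ping 10.242.62.213@tcp1     # LNet-level ping over the socklnd (TCP) connection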

Comment by Michael Ethier (Inactive) [ 01/Apr/21 ]

I verified that from the client we can lnet ping the 2 new lnet routers now. There were ACLs blocking access.
[root@holylogin01 ~]# lnetctl ping 10.242.62.214@tcp1
ping:
    - primary nid: 10.242.62.214@tcp1
      Multi-Rail: False
      peer ni:
        - nid: 10.242.62.214@tcp1
        - nid: 10.242.46.228@o2ib1
[root@holylogin01 ~]# lnetctl ping 10.242.62.213@tcp1
ping:
    - primary nid: 10.242.62.213@tcp1
      Multi-Rail: False
      peer ni:
        - nid: 10.242.62.213@tcp1
        - nid: 10.242.46.227@o2ib1

Comment by Michael Ethier (Inactive) [ 02/Apr/21 ]

It has been verified that network ACLs were causing the issue. We have successfully added a new LNET router and Lustre FS access seems to be fine now. This ticket can be closed. Thanks to Serguei for all his help, much appreciated.

Comment by Peter Jones [ 03/Apr/21 ]

Great - thanks for the update

Comment by Serguei Smirnov [ 03/Apr/21 ]

Michael reported later yesterday via e-mail that the clients which didn't get the route to the new gateway set up had issues accessing the FS. This should not have happened unless asymmetric routes are configured to be dropped on the clients.
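
Whether a node drops asymmetric routes can be checked with the following (the value shown is what the 2.12 nodes in this ticket reported earlier):

lnetctl global show                  # look for drop_asym_route: 0 (accepted) vs 1 (dropped)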

Comment by Serguei Smirnov [ 05/May/21 ]

Michael recently reported via email:

"I wanted to let you know we finally added those 2 lnet routers we were working previously, globally to our cluster in Holyoke/boston and they are now running in production.

It appears the procedure requires that you add the lnet routes across all the clients that are mounting the storage on the other side of the routers and then add lnet routes

to the storage side after that."
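
In command terms that ordering would look roughly like this (a sketch using the tcp1 NIDs of the two new routers from this ticket; tcp2 clients would be analogous):

# Step 1: on every client that mounts the filesystems behind the routers
lnetctl route add --net o2ib1 --gateway 10.242.62.213@tcp1
lnetctl route add --net o2ib1 --gateway 10.242.62.214@tcp1

# Step 2: only after all clients have the new gateways, on the Lustre servers
lnetctl route add --net tcp1 --gateway 10.242.46.227@o2ib1
lnetctl route add --net tcp1 --gateway 10.242.46.228@o2ib1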

Michael believes that the ticket can be closed.
