Lustre / LU-15622

LNET/o2ib doesn't know LNET state properly in RoCE configuration

Details

    • Bug
    • Resolution: Fixed
    • Minor

    Description

      Here is the RoCE setup, with both ports up:

      [root@es200nvx-vm1 ~]# ibstat
      CA 'mlx5_0'
      	CA type: MT4123
      	Number of ports: 1
      	Firmware version: 20.30.1004
      	Hardware version: 0
      	Node GUID: 0x0c42a10300ae2a4e
      	System image GUID: 0x0c42a10300ae2a4e
      	Port 1:
      		State: Active
      		Physical state: LinkUp
      		Rate: 100
      		Base lid: 0
      		LMC: 0
      		SM lid: 0
      		Capability mask: 0x00010000
      		Port GUID: 0x0e42a1fffeae2a4e
      		Link layer: Ethernet
      CA 'mlx5_1'
      	CA type: MT4123
      	Number of ports: 1
      	Firmware version: 20.30.1004
      	Hardware version: 0
      	Node GUID: 0x0c42a10300ae2a4f
      	System image GUID: 0x0c42a10300ae2a4e
      	Port 1:
      		State: Active
      		Physical state: LinkUp
      		Rate: 100
      		Base lid: 0
      		LMC: 0
      		SM lid: 0
      		Capability mask: 0x00010000
      		Port GUID: 0x0e42a1fffeae2a4f
      		Link layer: Ethernet
      
      [root@es200nvx-vm1 ~]# ip link show
      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 02:00:70:ea:1e:d1 brd ff:ff:ff:ff:ff:ff
      3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
          link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
      4: ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 0c:42:a1:ae:2a:4e brd ff:ff:ff:ff:ff:ff
      5: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 0c:42:a1:ae:2a:4f brd ff:ff:ff:ff:ff:ff
      
      [root@es200nvx-vm1 ~]# lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib12
            local NI(s):
              - nid: 192.168.11.232@o2ib12
                status: up
                interfaces:
                    0: ens1
              - nid: 192.168.11.242@o2ib12
                status: up
                interfaces:
      

      Next, one network interface (ens2, on mlx5_1) is physically turned off:

      [root@es200nvx-vm1 ~]# ibstat
      CA 'mlx5_0'
      	CA type: MT4123
      	Number of ports: 1
      	Firmware version: 20.30.1004
      	Hardware version: 0
      	Node GUID: 0x0c42a10300ae2a4e
      	System image GUID: 0x0c42a10300ae2a4e
      	Port 1:
      		State: Active
      		Physical state: LinkUp
      		Rate: 100
      		Base lid: 0
      		LMC: 0
      		SM lid: 0
      		Capability mask: 0x00010000
      		Port GUID: 0x0e42a1fffeae2a4e
      		Link layer: Ethernet
      CA 'mlx5_1'
      	CA type: MT4123
      	Number of ports: 1
      	Firmware version: 20.30.1004
      	Hardware version: 0
      	Node GUID: 0x0c42a10300ae2a4f
      	System image GUID: 0x0c42a10300ae2a4e
      	Port 1:
      		State: Down
      		Physical state: Disabled
      		Rate: 40
      		Base lid: 0
      		LMC: 0
      		SM lid: 0
      		Capability mask: 0x00010000
      		Port GUID: 0x0e42a1fffeae2a4f
      		Link layer: Ethernet
      
      [root@es200nvx-vm1 ~]# ip link 
      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 02:00:70:ea:1e:d1 brd ff:ff:ff:ff:ff:ff
      3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
          link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
      4: ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether 0c:42:a1:ae:2a:4e brd ff:ff:ff:ff:ff:ff
      5: ens2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
          link/ether 0c:42:a1:ae:2a:4f brd ff:ff:ff:ff:ff:ff
      

      However, LNET still reports the NID as up. The NID status never changes from "up" to "down", even though the physical network interface is down:

      [root@es200nvx-vm1 ~]# lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib12
            local NI(s):
              - nid: 192.168.11.232@o2ib12
                status: up
                interfaces:
                    0: ens1
              - nid: 192.168.11.242@o2ib12
                status: up
                interfaces:
                    0: ens2
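
The mismatch above can be detected mechanically by comparing the two command outputs. The following is a minimal diagnostic sketch, not the fix applied for this ticket: it assumes the text layouts of `ip link` and `lnetctl net show` shown in this report, and the function names (`link_carrier`, `lnet_nis`, `stale_nis`) are hypothetical helpers introduced only for illustration. The sample strings are trimmed copies of the outputs above.

```python
import re

# Trimmed copies of the outputs in this report (after ens2 was turned off).
IP_LINK = """\
4: ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 0c:42:a1:ae:2a:4e brd ff:ff:ff:ff:ff:ff
5: ens2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 0c:42:a1:ae:2a:4f brd ff:ff:ff:ff:ff:ff
"""

LNETCTL = """\
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib12
      local NI(s):
        - nid: 192.168.11.232@o2ib12
          status: up
          interfaces:
              0: ens1
        - nid: 192.168.11.242@o2ib12
          status: up
          interfaces:
              0: ens2
"""


def link_carrier(ip_link_text):
    """Map interface name -> True if the LOWER_UP flag (carrier) is set."""
    carrier = {}
    for line in ip_link_text.splitlines():
        # Header lines look like "5: ens2: <NO-CARRIER,BROADCAST,...> mtu ..."
        m = re.match(r"\s*\d+:\s+([^:@\s]+)(?:@\S+)?:\s+<([^>]*)>", line)
        if m:
            carrier[m.group(1)] = "LOWER_UP" in m.group(2).split(",")
    return carrier


def lnet_nis(lnetctl_text):
    """Collect nid, status and interface names with a simple line scan
    (assumes the layout shown in this report, not a full YAML parse)."""
    nis = []
    for line in lnetctl_text.splitlines():
        s = line.strip()
        if s.startswith("- nid:"):
            nis.append({"nid": s.split(":", 1)[1].strip(),
                        "status": None, "interfaces": []})
        elif nis and s.startswith("status:"):
            nis[-1]["status"] = s.split(":", 1)[1].strip()
        elif nis and re.match(r"\d+:\s+\S+$", s):
            nis[-1]["interfaces"].append(s.split(":", 1)[1].strip())
    return nis


def stale_nis(ip_link_text, lnetctl_text):
    """Return NIDs that LNET reports as up while an underlying link has no carrier."""
    carrier = link_carrier(ip_link_text)
    return [ni["nid"] for ni in lnet_nis(lnetctl_text)
            if ni["status"] == "up"
            and any(carrier.get(name) is False for name in ni["interfaces"])]


print(stale_nis(IP_LINK, LNETCTL))  # ['192.168.11.242@o2ib12']
```

On a live node the two strings would instead come from running the commands themselves; here they are embedded so the check can be reproduced from the report alone.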
      

            People

              cbordage Cyril Bordage
              sihara Shuichi Ihara