[LU-15138] lnetctl peer add doesn't detect duplicate router peer
Created: 20/Oct/21  Updated: 20/Nov/21  Resolved: 20/Nov/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug
Priority: Minor
Reporter: Chris Horn
Assignee: Chris Horn
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3

 Description   

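It is possible to add a duplicate peer entry for a router. In the session below, lnetctl peer add is run for 485@gni4, which already exists as a gateway peer. The command prints no error, and a second 485@gni4 entry appears at the end of the peer list: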

nid00060:~ # lnetctl peer show
peer:
    - primary nid: 93@gni4
      Multi-Rail: False
      peer ni:
        - nid: 93@gni4
          state: up
    - primary nid: 486@gni4
      Multi-Rail: False
      peer ni:
        - nid: 486@gni4
          state: up
    - primary nid: 485@gni4
      Multi-Rail: True
      peer ni:
        - nid: 485@gni4
          state: up
        - nid: 10.12.0.1@o2ib4000
          state: up
    - primary nid: 476@gni4
      Multi-Rail: False
      peer ni:
        - nid: 476@gni4
          state: NA
    - primary nid: 484@gni4
      Multi-Rail: False
      peer ni:
        - nid: 484@gni4
          state: NA
    - primary nid: 487@gni4
      Multi-Rail: False
      peer ni:
        - nid: 487@gni4
          state: NA
    - primary nid: 94@gni4
      Multi-Rail: False
      peer ni:
        - nid: 94@gni4
          state: up
nid00060:~ # lnetctl peer add --prim 485@gni4
nid00060:~ # lnetctl peer show
peer:
    - primary nid: 93@gni4
      Multi-Rail: False
      peer ni:
        - nid: 93@gni4
          state: up
    - primary nid: 486@gni4
      Multi-Rail: False
      peer ni:
        - nid: 486@gni4
          state: up
    - primary nid: 485@gni4
      Multi-Rail: True
      peer ni:
        - nid: 485@gni4
          state: up
        - nid: 10.12.0.1@o2ib4000
          state: up
    - primary nid: 476@gni4
      Multi-Rail: False
      peer ni:
        - nid: 476@gni4
          state: NA
    - primary nid: 484@gni4
      Multi-Rail: False
      peer ni:
        - nid: 484@gni4
          state: NA
    - primary nid: 487@gni4
      Multi-Rail: False
      peer ni:
        - nid: 487@gni4
          state: NA
    - primary nid: 94@gni4
      Multi-Rail: False
      peer ni:
        - nid: 94@gni4
          state: up
    - primary nid: 485@gni4
      Multi-Rail: True
      peer ni:
        - nid: 485@gni4
          state: up
        - nid: 10.12.0.1@o2ib4000
          state: up
nid00060:~ #
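
The duplicate in the second listing above can be spotted mechanically. The following is a minimal sketch, not part of Lustre: it scans text in the `lnetctl peer show` format shown above and reports any primary NID that appears more than once. The function name `duplicate_primary_nids` and the `SAMPLE` string are hypothetical, introduced only for illustration.

```python
import re
from collections import Counter
from typing import List

# Matches the "- primary nid: <NID>" lines seen in the peer listing above.
# Lines such as "- nid: 485@gni4" (peer NIs) deliberately do not match.
PRIMARY_NID_RE = re.compile(r"^\s*-\s*primary nid:\s*(\S+)\s*$")


def duplicate_primary_nids(peer_show_output: str) -> List[str]:
    """Return the sorted primary NIDs that occur more than once."""
    counts = Counter()
    for line in peer_show_output.splitlines():
        match = PRIMARY_NID_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(nid for nid, seen in counts.items() if seen > 1)


# Abbreviated sample in the same layout as the listing in this ticket.
SAMPLE = """\
peer:
    - primary nid: 93@gni4
      Multi-Rail: False
      peer ni:
        - nid: 93@gni4
          state: up
    - primary nid: 485@gni4
      Multi-Rail: True
      peer ni:
        - nid: 485@gni4
          state: up
        - nid: 10.12.0.1@o2ib4000
          state: up
    - primary nid: 485@gni4
      Multi-Rail: True
      peer ni:
        - nid: 485@gni4
          state: up
        - nid: 10.12.0.1@o2ib4000
          state: up
"""

print(duplicate_primary_nids(SAMPLE))  # ['485@gni4']
```

A line-based regular expression is used instead of a YAML parser so the sketch depends only on the standard library. It assumes the output keeps the "primary nid:" key exactly as shown here.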

LNet debug log for the peer add above:
00000400:00000010:26.0F:1634755755.341631:0:17206:0:(module.c:166:libcfs_ioctl_getdata()) alloc '(*hdr_pp)': 48 at 00000000e13a401b (tot 14747117).
00000400:00000080:26.0:1634755755.341633:0:17206:0:(module.c:211:libcfs_ioctl()) libcfs ioctl cmd 3233310044
00000400:00000200:26.0:1634755755.341637:0:17206:0:(peer.c:460:lnet_peer_del_locked()) peer 485@gni4
00000400:00020000:26.0:1634755755.341638:0:17206:0:(peer.c:394:lnet_peer_ni_del_locked()) Peer NI 485@gni4 is a gateway. Can not delete it
00000400:00020000:26.0:1634755755.375967:0:17206:0:(peer.c:394:lnet_peer_ni_del_locked()) Peer NI 10.12.0.1@o2ib4000 is a gateway. Can not delete it
00000400:00000010:26.0:1634755755.375969:0:17206:0:(peer.c:251:lnet_peer_alloc()) alloc '(lp)': 304 at 00000000dda971cd (tot 14747421).
00000400:00000200:26.0:1634755755.375970:0:17206:0:(peer.c:290:lnet_peer_alloc()) 00000000dda971cd nid 485@gni4
00000400:00000010:26.0:1634755755.375972:0:17206:0:(peer.c:215:lnet_peer_net_alloc()) alloc '(lpn)': 72 at 000000008508af39 (tot 14747493).
00000400:00000200:26.0:1634755755.375973:0:17206:0:(peer.c:224:lnet_peer_net_alloc()) 000000008508af39 net gni4
00000400:00000010:26.0:1634755755.375975:0:17206:0:(peer.c:160:lnet_peer_ni_alloc()) alloc '(lpni)': 328 at 00000000cd43d022 (tot 14747821).
00000400:00000200:26.0:1634755755.375976:0:17206:0:(peer.c:205:lnet_peer_ni_alloc()) 00000000cd43d022 nid 485@gni4
00000400:00000200:26.0:1634755755.375977:0:17206:0:(peer.c:1148:lnet_peer_ni_clr_non_mr_pref_nid()) peer 485@gni4: -2
00000400:00000200:26.0:1634755755.375978:0:17206:0:(peer.c:1609:lnet_peer_attach_peer_ni()) peer 485@gni4 NID 485@gni4 flags 0x9
00000400:00000010:26.0:1634755755.375980:0:17206:0:(module.c:238:libcfs_ioctl()) kfreed 'hdr': 48 at 00000000e13a401b (tot 14747773).
Debug log: 14 lines, 14 kept, 0 dropped, 0 bad.
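
One plausible reading of the debug log above is that the add path first tries to delete the existing 485@gni4 peer, the deletion is refused because the peer is a gateway, and a new peer is then allocated anyway. The toy model below sketches that sequence under this assumption. It is not the kernel code: `ToyPeerTable`, its methods, and the `check_delete_result` switch are all hypothetical names, and the real data structures and error codes are not modeled.

```python
class ToyPeerTable:
    """Hypothetical stand-in for the LNet peer list (illustration only)."""

    def __init__(self):
        self.peers = []  # each entry: {"prim": str, "gateway": bool}

    def find(self, prim):
        """Return the first peer with this primary NID, or None."""
        return next((p for p in self.peers if p["prim"] == prim), None)

    def delete(self, peer):
        """Refuse to delete gateway peers, mirroring the
        'is a gateway. Can not delete it' lines in the log."""
        if peer["gateway"]:
            return False
        self.peers = [p for p in self.peers if p is not peer]
        return True

    def add(self, prim, check_delete_result):
        """Delete any existing peer with this primary NID, then create one.

        With check_delete_result=False the refusal is ignored, which is
        the assumed buggy behavior. With True the add fails instead.
        """
        existing = self.find(prim)
        if existing is not None:
            deleted = self.delete(existing)
            if check_delete_result and not deleted:
                return False  # add fails, table is left unchanged
        self.peers.append({"prim": prim, "gateway": False})
        return True

    def count(self, prim):
        return sum(1 for p in self.peers if p["prim"] == prim)


table = ToyPeerTable()
table.peers.append({"prim": "485@gni4", "gateway": True})

# Ignoring the refused deletion leaves two entries for the same NID.
table.add("485@gni4", check_delete_result=False)
print(table.count("485@gni4"))  # 2
```

Calling `add` with `check_delete_result=True` on a table holding only the gateway entry returns `False` and leaves a single entry. That matches the intent stated in the patch subject, "Fail peer add for existing gw peer", although the actual patch may implement the check differently.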


 Comments   
Comment by Gerrit Updater [ 22/Oct/21 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45337
Subject: LU-15138 lnet: Fail peer add for existing gw peer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7e537d20d1e3fc73560784d3be646373de4a0cb5

Comment by Gerrit Updater [ 20/Nov/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45337/
Subject: LU-15138 lnet: Fail peer add for existing gw peer
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 79a4b69adb1e365b16eb8521c79ef1c6985c6b91

Comment by Peter Jones [ 20/Nov/21 ]

Landed for 2.15
