Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-13883

LNetPrimaryNID assumes lpni for a particular NID will not change through discovery but lnet_peer_add_nid() may allocate a new one for it.

Details

    • 3
    • 9223372036854775807

    Description

      After server reboot (failover/failback test) client connnects to server using non-primary NID 192.168.2.38@tcp2. Server calls LNetPrimaryNID to setup export.

      New peer is setup and queued for discovery:

      00000020:00000040:0.0:1596142526.904195:0:3881:0:(genops.c:1009:class_export_get()) GETting export ffff9efa080c4000 : new refcount 6
      00000100:00000040:0.0:1596142526.904198:0:3881:0:(service.c:1054:ptlrpc_request_change_export()) RPC GETting export ffff9efa080c4000 : new rpc_count 1
      00000400:00000200:0.0:1596142526.904210:0:3881:0:(peer.c:285:lnet_peer_alloc()) ffff9efa0c071c00 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.904215:0:3881:0:(peer.c:220:lnet_peer_net_alloc()) ffff9efa22389a40 net tcp2
      00000400:00000200:0.0:1596142526.904222:0:3881:0:(peer.c:202:lnet_peer_ni_alloc()) ffff9efa0c550000 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.904228:0:3881:0:(peer.c:1329:lnet_peer_attach_peer_ni()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2 flags 0x0
      00000400:00000200:0.0:1596142526.904233:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 8192 state 0x2000
      00000400:00000200:0.0:1596142526.904236:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 16384 state 0x6000
      00000400:00000200:0.0:1596142526.904241:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 64 state 0x6040
      00000400:00000200:0.0:1596142526.904249:0:3881:0:(peer.c:1931:lnet_peer_queue_for_discovery()) Queue peer 192.168.2.38@tcp2: 0
      00000400:00000200:0.0:1596142526.904251:0:3881:0:(peer.c:2244:lnet_discover_peer_locked()) Discovery attempt # 1
      

      Discovery ping is sent:

      00000400:00000200:0.0:1596142526.904351:0:2860:0:(peer.c:3342:lnet_peer_discovery_wait_for_work()) woken: 0
      00000400:00000200:0.0:1596142526.904355:0:2860:0:(peer.c:3457:lnet_peer_discovery()) peer 192.168.2.38@tcp2(ffff9efa0c071c00) state 0x6040
      00000400:00000200:0.0:1596142526.904359:0:2860:0:(peer.c:3057:lnet_peer_send_ping()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.904362:0:2860:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 512 state 0x6240
      00000400:00000200:0.0:1596142526.904365:0:2860:0:(lib-lnet.h:104:lnet_peer_clear_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 8192 state 0x4240
      00000400:00000200:0.0:1596142526.904369:0:2860:0:(peer.c:3008:lnet_peer_select_nid()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.904381:0:2860:0:(lib-move.c:5126:LNetGet()) LNetGet -> 12345-192.168.2.38@tcp2
      

      Ping reply is processed.

      00000400:00000200:0.0:1596142526.905282:0:2860:0:(peer.c:3342:lnet_peer_discovery_wait_for_work()) woken: 0
      00000400:00000200:0.0:1596142526.905287:0:2860:0:(peer.c:3457:lnet_peer_discovery()) peer 192.168.2.38@tcp2(ffff9efa0c071c00) state 0x40c1
      

      Discovery thread looks up the primary nid in the ping buffer and it finds an existing lpni (some other request from the same client arrived from the client's 192.168.2.38@tcp99 NID).

      00000400:00000200:0.0:1596142526.905291:0:2860:0:(peer.c:2851:lnet_peer_data_present()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.905308:0:2860:0:(peer.c:2771:lnet_peer_set_primary_data()) peer 192.168.2.38@tcp99(ffff9ef9de8dce00)
      00000400:00000200:0.0:1596142526.905311:0:2860:0:(peer.c:1931:lnet_peer_queue_for_discovery()) Queue peer 192.168.2.38@tcp99: -114
      

      After reconciling the two peers in lnet_peer_set_primary_data(), we merge the info in pbuf with the peer object for primary NID 192.168.2.38@tcp99.

      00000400:00000200:0.0:1596142526.905323:0:2860:0:(peer.c:2622:lnet_peer_merge_data()) peer 192.168.2.38@tcp99(ffff9ef9de8dce00)
      

      When we call lnet_peer_add_nid() for the 192.168.2.38@tcp2, that function sees the existing lpni for that NID, and sees that the NID is "primary" for that peer object so it deletes the peer, and the peer NI, and creates a new lpni for the 192.168.2.38@tcp2 NID.

      00000400:00000200:0.0:1596142526.905333:0:2860:0:(peer.c:457:lnet_peer_del_locked()) peer 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905337:0:2860:0:(peer.c:377:lnet_peer_detach_peer_ni_locked()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905341:0:2860:0:(peer.c:202:lnet_peer_ni_alloc()) ffff9efa0c550800 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905344:0:2860:0:(peer.c:220:lnet_peer_net_alloc()) ffff9efa0d340d40 net tcp2
      

      But, LNetPrimaryNID still has reference on the original lpni for 192.168.2.38@tcp2 NID, and it is going to use that lpni to lookup the primary NID of the peer when discovery completes.

      Below we can see that after discovery completes, LNetPrimaryNID releases its reference on the lpni which allows the peer NI, peer net and peer objects to be freed. LNetPrimaryNID then returns the wrong nid for this peer.

      ...
      00000400:00000200:0.0:1596142526.905414:0:2860:0:(peer.c:1949:lnet_peer_discovery_complete()) Discovery complete. Dequeue peer 192.168.2.38@tcp2
      ...
      00000400:00000200:0.0:1596142526.905518:0:3881:0:(peer.c:2292:lnet_discover_peer_locked()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2: -113. discovery complete
      00000400:00000200:0.0:1596142526.905522:0:3881:0:(peer.c:1721:lnet_destroy_peer_ni_locked()) ffff9efa0c550000 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905526:0:3881:0:(peer.c:230:lnet_destroy_peer_net_locked()) ffff9efa22389a40 net tcp2
      00000400:00000200:0.0:1596142526.905529:0:3881:0:(peer.c:293:lnet_destroy_peer_locked()) ffff9efa0c071c00 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905535:0:3881:0:(peer.c:1232:LNetPrimaryNID()) NID 192.168.2.38@tcp2 primary NID 192.168.2.38@tcp2 rc -113
      

      Here's that code in LNetPrimaryNID():

                      rc = lnet_discover_peer_locked(lpni, cpt, true);
                      if (rc)
                              goto out_decref;
                      lp = lpni->lpni_peer_net->lpn_peer;
      
                      /* Only try once if discovery is disabled */
                      if (lnet_is_discovery_disabled(lp))
                              break;
              }
              primary_nid = lp->lp_primary_nid;
      out_decref:
              lnet_peer_ni_decref_locked(lpni);
      out_unlock:
              lnet_net_unlock(cpt);
      
              CDEBUG(D_NET, "NID %s primary NID %s rc %d\n", libcfs_nid2str(nid),
                     libcfs_nid2str(primary_nid), rc);
              return primary_nid;
      

      I can think of a couple solutions:
      1. Modify lnet_peer_add_nid() so that it will maintain an existing lpni if it finds one for the NID it is trying to add. This might not be workable because lnet_create_reply_msg() currently derefs the lpni->lpni_peer_net->lpn_peer hierarchy without holding a net lock, and with this proposed change that hierarchy is not safe to read w/o holding the net lock. I don't think we want to introduce net lock into that function.

      2. Modify LNetPrimaryNID() (and probably lnet_discover_peer_locked() [Edit: It looks like all callers of lnet_discover_peer_locked() would also need to be adjusted]) so that they do not assume lpni is persistent across discovery.

      Attachments

        Activity

          [LU-13883] LNetPrimaryNID assumes lpni for a particular NID will not change through discovery but lnet_peer_add_nid() may allocate a new one for it.

          Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39747
          Subject: LU-13883 lnet: Lookup lpni after discovery
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 07f38014d1aebcff351fca6df358825ba045087e

          gerrit Gerrit Updater added a comment - Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39747 Subject: LU-13883 lnet: Lookup lpni after discovery Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 07f38014d1aebcff351fca6df358825ba045087e

          Hi,

          Am I right to assume that this LU is related to LNet's multirail feature?
          Is it OK with everyone if I update the ticket's tags to reflect this?

          eaujames Etienne Aujames added a comment - Hi, Am I right to assume that this LU is related to LNet's multirail feature? Is it OK with everyone if I update the ticket's tags to reflect this?

          Hi Chris, 

          I only got around to this now, so I don't have anything that would be in conflict with your proposal.

          Thanks,

          Serguei.

          ssmirnov Serguei Smirnov added a comment - Hi Chris,  I only got around to this now, so I don't have anything that would be in conflict with your proposal. Thanks, Serguei.

          Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39650
          Subject: LU-13883 lnet: Refactor lnet_discover_peer_locked
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: a9759f16cf72e87c7ae9400ce40c4d13f9c3b5b6

          gerrit Gerrit Updater added a comment - Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39650 Subject: LU-13883 lnet: Refactor lnet_discover_peer_locked Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: a9759f16cf72e87c7ae9400ce40c4d13f9c3b5b6
          hornc Chris Horn added a comment -

          Also note that there are other issues with this particular use-case that can still cause LNetPrimaryNID to return the wrong NID. The patch in this ticket is part of series which addresses those other issues.

          hornc Chris Horn added a comment - Also note that there are other issues with this particular use-case that can still cause LNetPrimaryNID to return the wrong NID. The patch in this ticket is part of series which addresses those other issues.
          hornc Chris Horn added a comment - - edited

          The patch implements the idea in #2 of the description, but only for LNetPrimaryNID() and lnet_discover(). I don't think we need to adjust lnet_initiate_peer_discovery()... but I could be mistaken.

          hornc Chris Horn added a comment - - edited The patch implements the idea in #2 of the description, but only for LNetPrimaryNID() and lnet_discover(). I don't think we need to adjust lnet_initiate_peer_discovery()... but I could be mistaken.

          Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39606
          Subject: LU-13883 lnet: Lookup lpni after discovery
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: ea64792a4e55685fdaa0b9b5260d0fbdcdb43587

          gerrit Gerrit Updater added a comment - Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/39606 Subject: LU-13883 lnet: Lookup lpni after discovery Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: ea64792a4e55685fdaa0b9b5260d0fbdcdb43587
          hornc Chris Horn added a comment -

          ssmirnov I've been working on a series of patches related to this issue. Do you have any code in progress for it already or just started looking at it now?

          hornc Chris Horn added a comment - ssmirnov I've been working on a series of patches related to this issue. Do you have any code in progress for it already or just started looking at it now?

          People

            ssmirnov Serguei Smirnov
            hornc Chris Horn
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: