Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-13883

LNetPrimaryNID assumes lpni for a particular NID will not change through discovery but lnet_peer_add_nid() may allocate a new one for it.

    XMLWordPrintable

Details

    • 3
    • 9223372036854775807

    Description

      After server reboot (failover/failback test) client connnects to server using non-primary NID 192.168.2.38@tcp2. Server calls LNetPrimaryNID to setup export.

      New peer is setup and queued for discovery:

      00000020:00000040:0.0:1596142526.904195:0:3881:0:(genops.c:1009:class_export_get()) GETting export ffff9efa080c4000 : new refcount 6
      00000100:00000040:0.0:1596142526.904198:0:3881:0:(service.c:1054:ptlrpc_request_change_export()) RPC GETting export ffff9efa080c4000 : new rpc_count 1
      00000400:00000200:0.0:1596142526.904210:0:3881:0:(peer.c:285:lnet_peer_alloc()) ffff9efa0c071c00 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.904215:0:3881:0:(peer.c:220:lnet_peer_net_alloc()) ffff9efa22389a40 net tcp2
      00000400:00000200:0.0:1596142526.904222:0:3881:0:(peer.c:202:lnet_peer_ni_alloc()) ffff9efa0c550000 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.904228:0:3881:0:(peer.c:1329:lnet_peer_attach_peer_ni()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2 flags 0x0
      00000400:00000200:0.0:1596142526.904233:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 8192 state 0x2000
      00000400:00000200:0.0:1596142526.904236:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 16384 state 0x6000
      00000400:00000200:0.0:1596142526.904241:0:3881:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 64 state 0x6040
      00000400:00000200:0.0:1596142526.904249:0:3881:0:(peer.c:1931:lnet_peer_queue_for_discovery()) Queue peer 192.168.2.38@tcp2: 0
      00000400:00000200:0.0:1596142526.904251:0:3881:0:(peer.c:2244:lnet_discover_peer_locked()) Discovery attempt # 1
      

      Discovery ping is sent:

      00000400:00000200:0.0:1596142526.904351:0:2860:0:(peer.c:3342:lnet_peer_discovery_wait_for_work()) woken: 0
      00000400:00000200:0.0:1596142526.904355:0:2860:0:(peer.c:3457:lnet_peer_discovery()) peer 192.168.2.38@tcp2(ffff9efa0c071c00) state 0x6040
      00000400:00000200:0.0:1596142526.904359:0:2860:0:(peer.c:3057:lnet_peer_send_ping()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.904362:0:2860:0:(lib-lnet.h:96:lnet_peer_set_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 512 state 0x6240
      00000400:00000200:0.0:1596142526.904365:0:2860:0:(lib-lnet.h:104:lnet_peer_clear_state()) Peer 192.168.2.38@tcp2(ffff9efa0c071c00) 8192 state 0x4240
      00000400:00000200:0.0:1596142526.904369:0:2860:0:(peer.c:3008:lnet_peer_select_nid()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.904381:0:2860:0:(lib-move.c:5126:LNetGet()) LNetGet -> 12345-192.168.2.38@tcp2
      

      Ping reply is processed.

      00000400:00000200:0.0:1596142526.905282:0:2860:0:(peer.c:3342:lnet_peer_discovery_wait_for_work()) woken: 0
      00000400:00000200:0.0:1596142526.905287:0:2860:0:(peer.c:3457:lnet_peer_discovery()) peer 192.168.2.38@tcp2(ffff9efa0c071c00) state 0x40c1
      

      Discovery thread looks up the primary nid in the ping buffer and it finds an existing lpni (some other request from the same client arrived from the client's 192.168.2.38@tcp99 NID).

      00000400:00000200:0.0:1596142526.905291:0:2860:0:(peer.c:2851:lnet_peer_data_present()) peer 192.168.2.38@tcp2(ffff9efa0c071c00)
      00000400:00000200:0.0:1596142526.905308:0:2860:0:(peer.c:2771:lnet_peer_set_primary_data()) peer 192.168.2.38@tcp99(ffff9ef9de8dce00)
      00000400:00000200:0.0:1596142526.905311:0:2860:0:(peer.c:1931:lnet_peer_queue_for_discovery()) Queue peer 192.168.2.38@tcp99: -114
      

      After reconciling the two peers in lnet_peer_set_primary_data(), we merge the info in pbuf with the peer object for primary NID 192.168.2.38@tcp99.

      00000400:00000200:0.0:1596142526.905323:0:2860:0:(peer.c:2622:lnet_peer_merge_data()) peer 192.168.2.38@tcp99(ffff9ef9de8dce00)
      

      When we call lnet_peer_add_nid() for the 192.168.2.38@tcp2, that function sees the existing lpni for that NID, and sees that the NID is "primary" for that peer object so it deletes the peer, and the peer NI, and creates a new lpni for the 192.168.2.38@tcp2 NID.

      00000400:00000200:0.0:1596142526.905333:0:2860:0:(peer.c:457:lnet_peer_del_locked()) peer 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905337:0:2860:0:(peer.c:377:lnet_peer_detach_peer_ni_locked()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905341:0:2860:0:(peer.c:202:lnet_peer_ni_alloc()) ffff9efa0c550800 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905344:0:2860:0:(peer.c:220:lnet_peer_net_alloc()) ffff9efa0d340d40 net tcp2
      

      But, LNetPrimaryNID still has reference on the original lpni for 192.168.2.38@tcp2 NID, and it is going to use that lpni to lookup the primary NID of the peer when discovery completes.

      Below we can see that after discovery completes, LNetPrimaryNID releases its reference on the lpni which allows the peer NI, peer net and peer objects to be freed. LNetPrimaryNID then returns the wrong nid for this peer.

      ...
      00000400:00000200:0.0:1596142526.905414:0:2860:0:(peer.c:1949:lnet_peer_discovery_complete()) Discovery complete. Dequeue peer 192.168.2.38@tcp2
      ...
      00000400:00000200:0.0:1596142526.905518:0:3881:0:(peer.c:2292:lnet_discover_peer_locked()) peer 192.168.2.38@tcp2 NID 192.168.2.38@tcp2: -113. discovery complete
      00000400:00000200:0.0:1596142526.905522:0:3881:0:(peer.c:1721:lnet_destroy_peer_ni_locked()) ffff9efa0c550000 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905526:0:3881:0:(peer.c:230:lnet_destroy_peer_net_locked()) ffff9efa22389a40 net tcp2
      00000400:00000200:0.0:1596142526.905529:0:3881:0:(peer.c:293:lnet_destroy_peer_locked()) ffff9efa0c071c00 nid 192.168.2.38@tcp2
      00000400:00000200:0.0:1596142526.905535:0:3881:0:(peer.c:1232:LNetPrimaryNID()) NID 192.168.2.38@tcp2 primary NID 192.168.2.38@tcp2 rc -113
      

      Here's that code in LNetPrimaryNID():

                      rc = lnet_discover_peer_locked(lpni, cpt, true);
                      if (rc)
                              goto out_decref;
                      lp = lpni->lpni_peer_net->lpn_peer;
      
                      /* Only try once if discovery is disabled */
                      if (lnet_is_discovery_disabled(lp))
                              break;
              }
              primary_nid = lp->lp_primary_nid;
      out_decref:
              lnet_peer_ni_decref_locked(lpni);
      out_unlock:
              lnet_net_unlock(cpt);
      
              CDEBUG(D_NET, "NID %s primary NID %s rc %d\n", libcfs_nid2str(nid),
                     libcfs_nid2str(primary_nid), rc);
              return primary_nid;
      

      I can think of a couple solutions:
      1. Modify lnet_peer_add_nid() so that it will maintain an existing lpni if it finds one for the NID it is trying to add. This might not be workable because lnet_create_reply_msg() currently derefs the lpni->lpni_peer_net->lpn_peer hierarchy without holding a net lock, and with this proposed change that hierarchy is not safe to read w/o holding the net lock. I don't think we want to introduce net lock into that function.

      2. Modify LNetPrimaryNID() (and probably lnet_discover_peer_locked() [Edit: It looks like all callers of lnet_discover_peer_locked() would also need to be adjusted]) so that they do not assume lpni is persistent across discovery.

      Attachments

        Activity

          People

            ssmirnov Serguei Smirnov
            hornc Chris Horn
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: