Details
- Bug
- Resolution: Fixed
- Minor
Description
In lnet_discover_peer_locked(), after the loop, we drop LNET_LOCK_EX and then take the LNet cpt lock:
		lnet_net_lock(LNET_LOCK_EX);
		lnet_peer_decref_locked(lp);
		/* Peer may have changed */
		lp = lpni->lpni_peer_net->lpn_peer;
	}
	finish_wait(&lp->lp_dc_waitq, &wait);

	lnet_net_unlock(LNET_LOCK_EX);
	lnet_net_lock(cpt);

	if (signal_pending(current))
		rc = -EINTR;
	else if (the_lnet.ln_dc_state != LNET_DC_STATE_RUNNING)
		rc = -ESHUTDOWN;
	else if (lp->lp_dc_error)
		rc = lp->lp_dc_error;
	else if (!block)
		CDEBUG(D_NET, "non-blocking discovery\n");
	else if (!lnet_peer_is_uptodate(lp))
		goto again;

	CDEBUG(D_NET, "peer %s NID %s: %d. %s\n",
	       (lp ? libcfs_nid2str(lp->lp_primary_nid) : "(none)"),
	       libcfs_nid2str(lpni->lpni_nid), rc,
	       (!block) ? "pending discovery" : "discovery complete");

	return rc;
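After relocking, lp may no longer be valid, so we need to refresh it from lpni (lp = lpni->lpni_peer_net->lpn_peer), as the loop body already does. Alternatively, move the unlock/lock pair down below the checks that dereference lp and adjust the again label to match. Open question: do we need to hold LNET_LOCK_EX to access lp?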
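The first option (refreshing lp after the relock) can be illustrated with a small user-space model. This is only a sketch, not the LNet code. The types struct peer and struct peer_ni and the functions concurrent_replace() and discover_tail() are hypothetical stand-ins invented for illustration. The locks are represented by comments, and the "other thread" is simulated by a direct call inside the unlocked window.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct lnet_peer; only the error field is modeled. */
struct peer {
	int dc_error;
};

/* Hypothetical stand-in for struct lnet_peer_ni; `peer` models
 * lpni->lpni_peer_net->lpn_peer. */
struct peer_ni {
	struct peer *peer;
};

/* Models what another thread might do in the unlocked window: replace the
 * peer behind the NI and free the old one. Returns 0 on success, or -1 if
 * allocation fails (the NI is then left untouched). */
static int concurrent_replace(struct peer_ni *lpni, int new_error)
{
	struct peer *fresh = malloc(sizeof(*fresh));

	if (fresh == NULL)
		return -1;
	fresh->dc_error = new_error;
	free(lpni->peer);
	lpni->peer = fresh;
	return 0;
}

/* Models the tail of the discovery function after the wait loop. */
static int discover_tail(struct peer_ni *lpni, int new_error)
{
	struct peer *lp;

	/* lnet_net_unlock(LNET_LOCK_EX) would sit here. */
	if (concurrent_replace(lpni, new_error) != 0)
		return -1;
	/* lnet_net_lock(cpt) would sit here. */

	/* Peer may have changed: re-read lp rather than use a stale copy. */
	lp = lpni->peer;
	return lp->dc_error;
}
```

Had discover_tail() cached lp before the window and dereferenced it afterwards, it would have read freed memory. Re-reading the pointer after the relock avoids that in this model. Whether the cpt lock alone is enough to make that re-read safe in LNet is the open question raised above.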
Issue Links
- is related to LU-10281 conf-sanity: test_54a hung at lnet_discover_peer_locked() (Open)