[LU-13277] Potential deadlock in lnet_peer_discovery Created: 20/Feb/20 Updated: 05/Mar/20 Resolved: 05/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Chris Horn | Assignee: | Chris Horn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Description |
|
Potential deadlock when LNet is shutting down. The discovery loop breaks out with LNET_LOCK_EX still held, and the cleanup code that follows then tries to take the same lock again:

static int lnet_peer_discovery(void *arg)
...
	for (;;) {
...
		lnet_net_lock(LNET_LOCK_EX);
		if (the_lnet.ln_dc_state == LNET_DC_STATE_STOPPING)
			break;
...
	}
	CDEBUG(D_NET, "stopping\n");
	/*
	 * Clean up before telling lnet_peer_discovery_stop() that
	 * we're done. Use wake_up() below to somewhat reduce the
	 * size of the thundering herd if there are multiple threads
	 * waiting on discovery of a single peer.
	 */
	/* Queue cleanup 1: stop all pending pings and pushes. */
	lnet_net_lock(LNET_LOCK_EX); <<< Deadlock
...
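The pattern above can be modelled in user space to show why the second acquisition cannot succeed. The following is a minimal sketch, not LNet code: `demo_lock`, `demo_lock_acquire()`, `demo_lock_release()` and `demo_discovery_shutdown()` are hypothetical names, and the lock is assumed to be non-recursive. Releasing the lock before `break` is shown as one plausible way to avoid the problem; this ticket does not state what the merged patch actually changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive exclusive lock. */
struct demo_lock {
	bool held;
};

enum demo_dc_state { DEMO_DC_RUNNING, DEMO_DC_STOPPING };

/* Returns false where a real non-recursive lock would block forever. */
static bool demo_lock_acquire(struct demo_lock *l)
{
	if (l->held)
		return false;	/* self-deadlock in a real thread */
	l->held = true;
	return true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Models the shape of the loop exit from the description.
 * unlock_before_break == false: the lock is still held when cleanup starts.
 * unlock_before_break == true:  the lock is dropped before leaving the loop.
 * Returns true if the cleanup stage was able to take the lock.
 */
static bool demo_discovery_shutdown(enum demo_dc_state state,
				    bool unlock_before_break)
{
	struct demo_lock lock = { .held = false };
	bool ok;

	for (;;) {
		ok = demo_lock_acquire(&lock);
		assert(ok);
		if (state == DEMO_DC_STOPPING) {
			if (unlock_before_break)
				demo_lock_release(&lock);
			break;
		}
		/* ... normal discovery work would happen here ... */
		demo_lock_release(&lock);
		state = DEMO_DC_STOPPING;	/* simulate a stop request */
	}

	/* Cleanup stage: takes the lock again, as in the snippet above. */
	ok = demo_lock_acquire(&lock);
	if (ok)
		demo_lock_release(&lock);
	return ok;
}
```

Calling `demo_discovery_shutdown(DEMO_DC_STOPPING, false)` reports that cleanup could not take the lock, which corresponds to the line marked `<<< Deadlock`; passing `true` lets cleanup proceed.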
|
| Comments |
| Comment by Gerrit Updater [ 21/Feb/20 ] |
|
Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/37675 |
| Comment by Gerrit Updater [ 05/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37675/ |
| Comment by Peter Jones [ 05/Mar/20 ] |
|
Landed for 2.14 |