[LU-13923] LNet: lnetctl "lnet unconfigure" or "net del" hangs if executed on a gateway Created: 25/Aug/20 Updated: 24/Mar/22 Resolved: 24/Mar/22 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Serguei Smirnov | Assignee: | Serguei Smirnov |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||
| Epic/Theme: | lnet | ||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
Steps to reproduce:
1. Configure three nodes: PeerA (tcp), PeerB (tcp1) and GW1 (tcp, tcp1).
2. Configure GW1 to act as a router and add the corresponding routes to PeerA and PeerB.
3. Verify connectivity between PeerA and PeerB by executing "lnetctl ping" in both directions.
4. Execute "lnetctl lnet unconfigure" on GW1.

Result: the command hangs. The issue affects tag 2.13.55. |
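The reproduction steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the reporter's actual setup: the NIDs and interface names (eth0, eth1) are hypothetical, adding the routes on PeerA and PeerB is one reading of the steps, and the lnetctl invocations should be checked against the installed version. The helper only builds the ordered command list; it does not execute anything.

```python
# Hypothetical NIDs for illustration only; substitute the real addresses.
GW_TCP_NID = "10.0.0.1@tcp"
GW_TCP1_NID = "10.0.1.1@tcp1"
PEER_A_NID = "10.0.0.2@tcp"
PEER_B_NID = "10.0.1.2@tcp1"


def repro_steps():
    """Return (node, command) pairs in the order they would be run."""
    return [
        # GW1 sits on both networks and forwards between them.
        ("GW1", "lnetctl lnet configure"),
        ("GW1", "lnetctl net add --net tcp --if eth0"),
        ("GW1", "lnetctl net add --net tcp1 --if eth1"),
        ("GW1", "lnetctl set routing 1"),
        # PeerA is on tcp and reaches tcp1 through GW1's tcp NID.
        ("PeerA", "lnetctl lnet configure"),
        ("PeerA", "lnetctl net add --net tcp --if eth0"),
        ("PeerA", f"lnetctl route add --net tcp1 --gateway {GW_TCP_NID}"),
        # PeerB is on tcp1 and reaches tcp through GW1's tcp1 NID.
        ("PeerB", "lnetctl lnet configure"),
        ("PeerB", "lnetctl net add --net tcp1 --if eth0"),
        ("PeerB", f"lnetctl route add --net tcp --gateway {GW_TCP1_NID}"),
        # Verify connectivity in both directions.
        ("PeerA", f"lnetctl ping {PEER_B_NID}"),
        ("PeerB", f"lnetctl ping {PEER_A_NID}"),
        # The step reported to hang.
        ("GW1", "lnetctl lnet unconfigure"),
    ]


if __name__ == "__main__":
    for node, command in repro_steps():
        print(f"[{node}] {command}")
```

Running the script prints one line per step, prefixed with the node on which the command would be run.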
| Comments |
| Comment by Serguei Smirnov [ 25/Aug/20 ] |
|
The issue was introduced by https://review.whamcloud.com/38798. That change added a call to lnet_nid2peerni_locked, which takes a reference on a peer_ni object; that reference is never released. As a result, LNet cannot clean up properly on an "unconfigure" or "net del" operation, causing lnetctl to hang. |
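The effect of an unbalanced reference can be illustrated with a toy model. This is not LNet code: the class and function names below are hypothetical, and the model only assumes that teardown cannot complete while any peer_ni still has outstanding references.

```python
class PeerNI:
    """Toy stand-in for a peer_ni object with a reference count."""

    def __init__(self, nid):
        self.nid = nid
        self.refcount = 0


class ToyLNet:
    """Hypothetical model: lookups take a reference the caller must drop."""

    def __init__(self):
        self.peer_nis = {}

    def nid2peerni(self, nid):
        # Look up (or create) the entry and take a reference for the caller.
        if nid not in self.peer_nis:
            self.peer_nis[nid] = PeerNI(nid)
        peer_ni = self.peer_nis[nid]
        peer_ni.refcount += 1
        return peer_ni

    def peerni_decref(self, peer_ni):
        # Release a reference previously taken by nid2peerni.
        peer_ni.refcount -= 1

    def can_shut_down(self):
        # Teardown is only possible once every reference has been released.
        return all(p.refcount == 0 for p in self.peer_nis.values())


def leaky_lookup(lnet, nid):
    # Takes a reference and never releases it (the pattern described above).
    peer_ni = lnet.nid2peerni(nid)
    return peer_ni.nid


def balanced_lookup(lnet, nid):
    # Takes a reference and always releases it before returning.
    peer_ni = lnet.nid2peerni(nid)
    try:
        return peer_ni.nid
    finally:
        lnet.peerni_decref(peer_ni)
```

In this model a single call to leaky_lookup leaves can_shut_down() returning False permanently. A teardown path that waits for the count to reach zero would then wait indefinitely, which is consistent with the hang described in the comment.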
| Comment by Gerrit Updater [ 25/Aug/20 ] |
|
Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39731 |
| Comment by Chris Horn [ 24/Mar/22 ] |
|
This ticket is a duplicate of |