[LU-11756] kib_conn leak Created: 11/Dec/18 Updated: 08/Oct/19 Resolved: 30/Apr/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andriy Skulysh | Assignee: | Andriy Skulysh |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Severity: | 3 | ||||
| Description |
crash> kib_conn_t.ibc_peer,ibc_refcount,ibc_list,ibc_state,ibc_comms_error,ibc_tx_queue_nocred 0xffff8808c0612e00
  ibc_peer = 0xffff880f0e2a2780
  ibc_refcount = {
    counter = 1
  }
  ibc_list = {
    next = 0xdead000000100100,
    prev = 0xdead000000200200
  }
  ibc_state = 5
  ibc_comms_error = -12
  ibc_tx_queue_nocred = {
    next = 0xffffc900200de3e0,
    prev = 0xffffc900200de3e0
  }
A tx was queued into ibc_tx_queue_nocred by a race while the connection was being disconnected. As a result, the connection stays in an unused state but cannot be destroyed, because the tx still holds a reference to it.
[18891797.073780] Lustre: 1572:0:(niobuf.c:292:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc ffff880a4eaf1c00 |
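The leak described above can be illustrated with a minimal, single-threaded C sketch. It is not the o2iblnd code: every type and function name here (`struct conn`, `queue_tx_racy`, `queue_tx_checked`, and so on) is hypothetical, the race is modelled simply as "the tx is queued after the disconnect has already flushed the queue", and the state check in `queue_tx_checked` is only one possible guard, not necessarily what patch 33828 does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real o2iblnd definitions. */
enum conn_state { CONN_ESTABLISHED, CONN_DISCONNECTED };

struct tx {
    struct tx *next;
};

struct conn {
    int refcount;               /* models ibc_refcount */
    enum conn_state state;      /* models ibc_state */
    struct tx *tx_queue_nocred; /* models ibc_tx_queue_nocred */
    bool destroyed;             /* set when the last reference is dropped */
};

static void conn_init(struct conn *c)
{
    c->refcount = 1; /* reference held by the connection's owner */
    c->state = CONN_ESTABLISHED;
    c->tx_queue_nocred = NULL;
    c->destroyed = false;
}

static void conn_decref(struct conn *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0)
        c->destroyed = true; /* stands in for freeing the connection */
}

/* Disconnect: mark the connection dead and abort everything queued so far. */
static void conn_disconnect(struct conn *c)
{
    c->state = CONN_DISCONNECTED;
    while (c->tx_queue_nocred != NULL) {
        struct tx *tx = c->tx_queue_nocred;
        c->tx_queue_nocred = tx->next;
        tx->next = NULL;
        conn_decref(c); /* each queued tx held one reference */
    }
}

/* Racy path: queues the tx without looking at the connection state. */
static void queue_tx_racy(struct conn *c, struct tx *tx)
{
    c->refcount++; /* the queued tx pins the connection */
    tx->next = c->tx_queue_nocred;
    c->tx_queue_nocred = tx;
}

/* Guarded path (assumed fix): refuse to queue once the connection is down. */
static bool queue_tx_checked(struct conn *c, struct tx *tx)
{
    if (c->state != CONN_ESTABLISHED)
        return false; /* caller must fail the tx instead of queueing it */
    queue_tx_racy(c, tx);
    return true;
}
```

In this model, calling `conn_init`, `conn_disconnect`, `queue_tx_racy` and then the owner's final `conn_decref` leaves the connection with a refcount of 1, a non-empty `tx_queue_nocred` and `destroyed == false`, which mirrors the dump above. Using `queue_tx_checked` in place of `queue_tx_racy` returns `false`, so the final `conn_decref` drops the count to zero and the connection is destroyed.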
| Comments |
| Comment by Gerrit Updater [ 11/Dec/18 ] |
|
Andriy Skulysh (c17819@cray.com) uploaded a new patch: https://review.whamcloud.com/33828 |
| Comment by Chris Horn [ 04/Apr/19 ] |
|
I'm hitting this pretty regularly with master |
| Comment by James A Simmons [ 26/Apr/19 ] |
|
It's in master-next, so it should be landing soon |
| Comment by Gerrit Updater [ 30/Apr/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33828/ |
| Comment by Peter Jones [ 30/Apr/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 01/Oct/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36347 |
| Comment by Gerrit Updater [ 08/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36347/ |