[LU-11756] kib_conn leak Created: 11/Dec/18  Updated: 08/Oct/19  Resolved: 30/Apr/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0, Lustre 2.12.3

Type: Bug Priority: Minor
Reporter: Andriy Skulysh Assignee: Andriy Skulysh
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
crash> kib_conn_t.ibc_peer,ibc_refcount,ibc_list,ibc_state,ibc_comms_error,ibc_tx_queue_nocred 0xffff8808c0612e00
  ibc_peer = 0xffff880f0e2a2780
  ibc_refcount = {
    counter = 1
  }
  ibc_list = {
    next = 0xdead000000100100, 
    prev = 0xdead000000200200
  }
  ibc_state = 5
  ibc_comms_error = -12
  ibc_tx_queue_nocred = {
    next = 0xffffc900200de3e0, 
    prev = 0xffffc900200de3e0
  }

A race allowed a tx to be queued on ibc_tx_queue_nocred while the connection was being disconnected. The connection then stays in an unused state, but it can't be destroyed because the queued tx still holds a reference to it.
This results in:

[18891797.073780] Lustre: 1572:0:(niobuf.c:292:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc ffff880a4eaf1c00
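
The race described above can be illustrated with a small single-threaded toy model. This is only a sketch: every name in it (toy_conn, toy_queue_tx_unchecked, toy_queue_tx_checked, and so on) is hypothetical and none of it is o2iblnd code. The state check in toy_queue_tx_checked is one generic way to close such a window and is not necessarily what the patch for this ticket does. The "race" is simulated by queueing a tx after teardown has already drained the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the o2iblnd definitions. */
enum toy_state { TOY_ESTABLISHED, TOY_CLOSING, TOY_DISCONNECTED };

struct toy_conn {
	enum toy_state state;
	int refcount;       /* starts at 1: the reference owned by teardown */
	int nocred_queued;  /* txs parked on the queue, one reference each */
};

static void toy_conn_init(struct toy_conn *c)
{
	c->state = TOY_ESTABLISHED;
	c->refcount = 1;
	c->nocred_queued = 0;
}

/* Leak pattern: queue the tx without looking at the connection state. */
static void toy_queue_tx_unchecked(struct toy_conn *c)
{
	c->refcount++;
	c->nocred_queued++;
}

/*
 * Assumed mitigation: refuse to queue once teardown has started.
 * The caller is expected to fail the tx itself when this returns false.
 */
static bool toy_queue_tx_checked(struct toy_conn *c)
{
	if (c->state != TOY_ESTABLISHED)
		return false;
	toy_queue_tx_unchecked(c);
	return true;
}

/* Teardown, step 1: abort every tx queued so far, dropping its reference. */
static void toy_disconnect(struct toy_conn *c)
{
	c->state = TOY_CLOSING;
	c->refcount -= c->nocred_queued;
	c->nocred_queued = 0;
	c->state = TOY_DISCONNECTED;
}

/* Teardown, step 2: drop teardown's own reference; true means destroyable. */
static bool toy_release(struct toy_conn *c)
{
	assert(c->refcount > 0);
	return --c->refcount == 0;
}
```

In this model, calling toy_disconnect(), then toy_queue_tx_unchecked(), then toy_release() leaves refcount at 1 with one tx still queued, so the connection is never destroyed. With toy_queue_tx_checked() the late tx is rejected and toy_release() returns true.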


 Comments   
Comment by Gerrit Updater [ 11/Dec/18 ]

Andriy Skulysh (c17819@cray.com) uploaded a new patch: https://review.whamcloud.com/33828
Subject: LU-11756 o2iblnd: kib_conn leak
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 23da5af6f503bda38e70cbf8d82922676ff9bf0c

Comment by Chris Horn [ 04/Apr/19 ]

I'm hitting this pretty regularly with master

Comment by James A Simmons [ 26/Apr/19 ]

It's in master-next, so it should be landing soon

Comment by Gerrit Updater [ 30/Apr/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33828/
Subject: LU-11756 o2iblnd: kib_conn leak
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a155c3fca38d2a3092f9b5d116ad7877d51d1db1

Comment by Peter Jones [ 30/Apr/19 ]

Landed for 2.13

Comment by Gerrit Updater [ 01/Oct/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36347
Subject: LU-11756 o2iblnd: kib_conn leak
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: d69377151754b5497e332fe3102dc581fd970336

Comment by Gerrit Updater [ 08/Oct/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36347/
Subject: LU-11756 o2iblnd: kib_conn leak
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: bb9644c360f4d54d2a2568f94ba8ae94489a873f
