Align LNet routing with Multi-Rail and LNet health (LU-11297)

[LU-12254] lnet: correct discovery event queue release
Created: 30/Apr/19  Updated: 04/Oct/19  Resolved: 10/Jun/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0, Lustre 2.12.3

Type: Technical task
Priority: Minor
Reporter: Amir Shehata (Inactive)
Assignee: Amir Shehata (Inactive)
Resolution: Fixed
Votes: 0
Labels: None


 Description   

It is possible for events to still hold a reference on the event queue (EQ) when it is released in the discovery thread. On shutdown this produces the following errors:

LNetError: 2770:0:(api-ni.c:1004:lnet_res_container_cleanup()) 1 active elements on exit of EQ container
LNetError: 2807:0:(module.c:655:libcfs_exit()) Portals memory leaked: 304 bytes

The EQ must be freed only after all events have been processed.
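
The ordering requirement can be illustrated with a minimal, self-contained sketch. This is not the LNet code and not the landed patch: `toy_eq` and every `toy_*` function are hypothetical names invented for the example, and the `-EBUSY` return on an early free is an assumption of this toy model. It shows only the general pattern: an event that is still being processed holds a reference on the queue, so the queue can be released only once that count has dropped to zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical toy event queue; NOT the LNet data structure. */
struct toy_eq {
	int pending;	/* events delivered but not yet fully processed */
};

struct toy_eq *toy_eq_alloc(void)
{
	return calloc(1, sizeof(struct toy_eq));
}

/* An in-flight event takes a reference on the queue. */
void toy_eq_event_start(struct toy_eq *eq)
{
	eq->pending++;
}

/* The reference is dropped once the handler has finished with the event. */
void toy_eq_event_done(struct toy_eq *eq)
{
	assert(eq->pending > 0);
	eq->pending--;
}

/* Refuse to free while any event still references the queue. */
int toy_eq_free(struct toy_eq *eq)
{
	if (eq->pending > 0)
		return -EBUSY;	/* assumption of this sketch */
	free(eq);
	return 0;
}

/* Shutdown ordering: drain outstanding events first, then free the queue. */
int toy_shutdown_demo(void)
{
	struct toy_eq *eq = toy_eq_alloc();

	if (eq == NULL)
		return -ENOMEM;

	toy_eq_event_start(eq);

	/* Freeing too early is rejected; ignoring this would leak the queue. */
	if (toy_eq_free(eq) != -EBUSY)
		return -1;

	toy_eq_event_done(eq);

	/* All events processed: the free now succeeds. */
	return toy_eq_free(eq);
}
```

In this model, a shutdown path that ignored the failed early free would exit with the queue still allocated, which is the kind of leftover element and leaked memory the messages above report.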



 Comments   
Comment by Gerrit Updater [ 02/May/19 ]

Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34796
Subject: LU-12254 lnet: correct discovery LNetEQFree()
Project: fs/lustre-release
Branch: multi-rail
Current Patch Set: 1
Commit: ea1ec208b908863bc2fa803df4837aee04a33e13

Comment by Gerrit Updater [ 07/Jun/19 ]

Amir Shehata (ashehata@whamcloud.com) merged in patch https://review.whamcloud.com/34796/
Subject: LU-12254 lnet: correct discovery LNetEQFree()
Project: fs/lustre-release
Branch: multi-rail
Current Patch Set:
Commit: a0879b5985b41f92dede96e7f27623eb72102b15

Comment by Joseph Gmitter (Inactive) [ 10/Jun/19 ]

Work has landed as part of the MR Routing merge commit: https://review.whamcloud.com/#/c/34983/

Comment by Gerrit Updater [ 03/Sep/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36036
Subject: LU-12254 lnet: correct discovery LNetEQFree()
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: f7de523811c1259fb5faa7b6d47a2033cafd3124

Comment by Gerrit Updater [ 04/Oct/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36036/
Subject: LU-12254 lnet: correct discovery LNetEQFree()
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 9c2d72073319ccffe39bdb72f22d79e4a43499ab
