[LU-5734] LNet dynamic control: lnet_dyn_add_ni() can't clean up failed NI in some cases Created: 13/Oct/14  Updated: 07/Jan/15  Resolved: 07/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Blocker
Reporter: Isaac Huang (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-5568 kernel crash when when network initia... Resolved
is related to LU-2456 Dynamic LNet Config Main Development ... Resolved
is related to LU-5849 DLC: parse networks function needs to... Resolved
is related to LU-6002 DLC: startup acceptor dynamically. Resolved
is related to LU-5839 LNetNIInit() should not call lnet_des... Resolved
is related to LU-5850 DLC: lnet_startup_lndnis should clean... Resolved
Severity: 3
Rank (Obsolete): 16097

 Description   

In lnet_dyn_add_ni(), if lnet_startup_lndnis() fails, the NI on the local &net_head list is freed. But lnet_startup_lndnis(&net_head) can fail after list_del(&ni->ni_list), i.e. after the NI has already been removed from &net_head. In that case I could not find where the NI gets cleaned up and freed.
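
The pattern described above can be illustrated with a minimal, self-contained sketch. This is not the Lustre code: struct demo_ni, demo_startup(), demo_dyn_add() and the free_on_failure flag are all hypothetical names made up for the illustration, and a simple singly linked list stands in for the kernel list API. The sketch only shows why an element that is unlinked before the failure point becomes unreachable for a caller that cleans up by walking the list, and one possible way to close the gap (the callee frees what it unlinked). It does not claim to match what the eventual patch does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an NI; not the real struct lnet_ni. */
struct demo_ni {
	struct demo_ni *next;
	int fail_startup;	/* simulate a startup failure for this NI */
};

static int demo_live_nis;	/* NIs allocated and not yet freed */

static struct demo_ni *demo_ni_alloc(int fail_startup)
{
	struct demo_ni *ni = calloc(1, sizeof(*ni));

	if (ni != NULL) {
		ni->fail_startup = fail_startup;
		demo_live_nis++;
	}
	return ni;
}

static void demo_ni_free(struct demo_ni *ni)
{
	free(ni);
	demo_live_nis--;
}

/* Free every NI that is still linked on the given list. */
static void demo_free_list(struct demo_ni **head)
{
	while (*head != NULL) {
		struct demo_ni *ni = *head;

		*head = ni->next;
		demo_ni_free(ni);
	}
}

/*
 * Start every NI on *pending and move it to *started.
 * Each NI is unlinked from *pending BEFORE the step that can fail,
 * which is the situation described in this ticket.
 */
static int demo_startup(struct demo_ni **pending, struct demo_ni **started,
			int free_on_failure)
{
	while (*pending != NULL) {
		struct demo_ni *ni = *pending;

		*pending = ni->next;	/* analogous to list_del(&ni->ni_list) */
		ni->next = NULL;

		if (ni->fail_startup) {
			/* ni is on no list now; the caller cannot reach it */
			if (free_on_failure)
				demo_ni_free(ni);
			return -1;
		}

		ni->next = *started;	/* success: hand over to the live list */
		*started = ni;
	}
	return 0;
}

/* Caller that cleans up only what is still on its local list. */
static int demo_dyn_add(int fail, int free_on_failure,
			struct demo_ni **started)
{
	struct demo_ni *pending = demo_ni_alloc(fail);
	int rc;

	if (pending == NULL)
		return -1;

	rc = demo_startup(&pending, started, free_on_failure);
	if (rc != 0)
		demo_free_list(&pending);	/* misses the unlinked NI */
	return rc;
}
```

With free_on_failure set to 0, a failed demo_dyn_add() leaves demo_live_nis at 1, i.e. the NI is leaked. With it set to 1, the count returns to 0 because the function that unlinked the NI also frees it.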



 Comments   
Comment by Isaac Huang (Inactive) [ 14/Oct/14 ]

It should be fixed together with LU-5568 - they are closely related.

Comment by Wang Shilong (Inactive) [ 15/Oct/14 ]

Hi Isaac Huang,

I will try to fix this issue together with LU-5568.

Thanks very much for your comments and help!
Best regards,
Wang Shilong

Comment by Isaac Huang (Inactive) [ 01/Nov/14 ]

Another related issue, LU-5839, should probably be fixed together with this one.

Comment by Andreas Dilger [ 07/Nov/14 ]

Wang, are you working on a patch for this issue?

Comment by Wang Shilong (Inactive) [ 10/Nov/14 ]

Hi Andreas Dilger,

Previously, the fix was part of my patch. Now Amir is providing a quick fix for LU-5568.
I am going to submit a separate, enhanced patch after Amir's patch, which should include this fix!

Best Regards,
Wang Shilong

Comment by Wang Shilong (Inactive) [ 11/Nov/14 ]

Amir Shehata just sent this patch to fix it:

http://review.whamcloud.com/#/c/12658/

Best Regards,
Wang Shilong

Comment by Gerrit Updater [ 07/Jan/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12658/
Subject: LU-5734 lnet: improve clean up code and API
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c9501b87d0e06c36b180b80c08ca79b672f20c72

Comment by Jodi Levi (Inactive) [ 07/Jan/15 ]

Patch landed to Master.
