[LU-6010] DLC: LNetFini() assert is hit if lustre_rmmod without bringing down NI Created: 09/Dec/14  Updated: 08/Feb/15  Resolved: 08/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Critical
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: HB

Severity: 3
Rank (Obsolete): 16752

 Description   

If LNet is not brought down properly (i.e., with lctl network down or lnetctl lnet unconfigure), then the LASSERT(the_lnet.ln_refcount == 0) in LNetFini() is hit. We need to modify this assert to account for this scenario. This wasn't a problem before because a network was always loaded regardless, and LNetNIFini() was always called to set ln_refcount to 0.
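
The failure can be illustrated with a toy model. This is a minimal sketch in Python, not the LNet source: the class ToyLNet and its method names are hypothetical stand-ins, and only the ln_refcount name and the "must be zero at LNetFini()" condition are taken from the description above.

```python
class ToyLNet:
    """Hypothetical stand-in for the_lnet; only ln_refcount mirrors the ticket."""

    def __init__(self):
        self.ln_refcount = 0

    def ni_init(self):
        # Stand-in for configuring LNet: takes a reference.
        self.ln_refcount += 1

    def ni_fini(self):
        # Stand-in for LNetNIFini(): drops the reference taken at configure time.
        self.ln_refcount -= 1

    def fini(self):
        # Stand-in for LNetFini() and its LASSERT(the_lnet.ln_refcount == 0).
        if self.ln_refcount != 0:
            raise AssertionError("LNetFini: ln_refcount != 0")


def unload_without_unconfigure():
    """Configure, then unload the module without bringing the NI down."""
    lnet = ToyLNet()
    lnet.ni_init()
    try:
        lnet.fini()
    except AssertionError:
        return "assert hit"
    return "clean unload"
```

In this model the assertion fires only when fini() runs while a configure reference is still outstanding, which is the scenario this ticket describes.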



 Comments   
Comment by Amir Shehata (Inactive) [ 17/Dec/14 ]

Here are the use cases for loading/unloading LNet, and which ones correspond to this issue:

Use Case 1
Configure LNet with no mod NI -> initializes NI
Dynamically add NIs
Unconfigure LNet -> uninitializes NI
lustre_rmmod

Use Case 2
Configure LNet with no mod NI
Unconfigure LNet
lustre_rmmod

Use Case 3
Configure LNet with no mod NI
lustre_rmmod
--> LU-6010

Use Case 4
Configure LNet with mod NI
Dynamically add NIs
Unconfigure LNet
lustre_rmmod

Use Case 5
Configure LNet with mod NI
Dynamically del all NIs
Unconfigure LNet
lustre_rmmod

Use Case 6
Configure LNet with mod NI
Dynamically del all NIs
lustre_rmmod
--> LU-6010

MODULE and LNet interaction
-> cannot dynamically add NIs from a different module; it has to be done via IOCTL

Use Case 7
Module X: Configure LNet with mod NI
Module Y: Configure LNet with mod NI
MODULE X: Unconfigure LNet
MODULE Y: Unconfigure LNet
lustre_rmmod

Use Case 8
Module X: Configure LNet with mod NI
Module Y: Configure LNet with mod NI
MODULE Y: Unconfigure LNet
lustre_rmmod
 -> lnet in use error

Use Case 9
Module X: Configure LNet with mod NI
Module Y: Configure LNet with mod NI
lustre_rmmod
 -> lnet in use error
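
The nine use cases above can be replayed against a small state model to check that the listed outcomes are mutually consistent. This is only a sketch under stated assumptions: it assumes that every configure takes an ln_refcount reference, and that a kernel module (X or Y) additionally pins the lnet module while a DLC configure does not. That second assumption is one possible explanation for why use cases 8 and 9 end in "lnet in use error" while 3 and 6 reach the assert; it is not confirmed by this ticket. The names replay, USE_CASES, and the step labels are hypothetical.

```python
def replay(steps):
    """Replay (actor, action) steps and report what lustre_rmmod does in this model."""
    ln_refcount = 0        # taken by configure, dropped by unconfigure
    module_users = set()   # kernel modules pinning lnet (assumption, see lead-in)
    for actor, action in steps:
        if action == "configure":
            ln_refcount += 1
            if actor != "dlc":
                module_users.add(actor)
        elif action == "unconfigure":
            ln_refcount -= 1
            module_users.discard(actor)
        elif action == "rmmod":
            if module_users:
                return "lnet in use error"
            if ln_refcount != 0:
                return "LU-6010 assert"
            return "clean unload"
        # "add_ni" and "del_ni" change the NI list but not either count
    return "still loaded"


# Use case number -> (steps, outcome listed in the comment above)
USE_CASES = {
    1: ([("dlc", "configure"), ("dlc", "add_ni"),
         ("dlc", "unconfigure"), ("dlc", "rmmod")], "clean unload"),
    2: ([("dlc", "configure"), ("dlc", "unconfigure"),
         ("dlc", "rmmod")], "clean unload"),
    3: ([("dlc", "configure"), ("dlc", "rmmod")], "LU-6010 assert"),
    4: ([("dlc", "configure"), ("dlc", "add_ni"),
         ("dlc", "unconfigure"), ("dlc", "rmmod")], "clean unload"),
    5: ([("dlc", "configure"), ("dlc", "del_ni"),
         ("dlc", "unconfigure"), ("dlc", "rmmod")], "clean unload"),
    6: ([("dlc", "configure"), ("dlc", "del_ni"),
         ("dlc", "rmmod")], "LU-6010 assert"),
    7: ([("X", "configure"), ("Y", "configure"), ("X", "unconfigure"),
         ("Y", "unconfigure"), ("dlc", "rmmod")], "clean unload"),
    8: ([("X", "configure"), ("Y", "configure"),
         ("Y", "unconfigure"), ("dlc", "rmmod")], "lnet in use error"),
    9: ([("X", "configure"), ("Y", "configure"),
         ("dlc", "rmmod")], "lnet in use error"),
}

if __name__ == "__main__":
    for number, (steps, expected) in USE_CASES.items():
        print(number, replay(steps), "(listed: %s)" % expected)
```

The model does not distinguish "with mod NI" from "no mod NI", since in the list above that difference does not change the outcome of lustre_rmmod.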
Comment by Gerrit Updater [ 17/Dec/14 ]

Amir Shehata (amir.shehata@intel.com) uploaded a new patch: http://review.whamcloud.com/13110
Subject: LU-6010 lnet: prevent assert on LNet module unload
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1262d8f1188c3b610354908a0049a8effe0af314

Comment by Isaac Huang (Inactive) [ 06/Jan/15 ]

Aren't "Use Case 3/6" illegal uses of DLC? The user should NOT be allowed to remove the lnet module if DLC configure has been called successfully but DLC unconfigure has not been called yet.

I think the correct action for "Use Case 3/6" is to prevent the lnet module from being unloaded, i.e. to enforce the correct DLC usage shown in "Use Case 5". In other words, it should be handled the same way as "Use Case 8": fail with "lnet in use error".
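
The policy proposed here can be sketched as a check that refuses the unload instead of asserting. This is an illustration only, written in Python rather than kernel C: the function name try_unload is hypothetical, and returning -EBUSY is an assumption about how an "lnet in use error" might be reported, not something stated in this ticket or taken from the merged patch.

```python
import errno


def try_unload(ln_refcount):
    """Return 0 if unload may proceed, or a negative errno while LNet is configured.

    Hypothetical policy: an outstanding DLC configure (ln_refcount != 0)
    blocks the unload, so the zero-refcount assertion is never reached.
    """
    if ln_refcount != 0:
        return -errno.EBUSY
    return 0
```

Under this policy, use cases 3 and 6 would end like use case 8: the unload is rejected until DLC unconfigure has been called.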

Comment by Gerrit Updater [ 08/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13110/
Subject: LU-6010 lnet: prevent assert on LNet module unload
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f1a2e6107c124d010d89973cfd716fbd17b689f0

Comment by Peter Jones [ 08/Feb/15 ]

Landed for 2.7
