Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Description
While using an LUTF script that deletes a net dynamically via
lnetctl import --del config.yaml
we ran into a deadlock.
It appears that a local NI is in recovery. A message is about to be sent to this NI. We call lnet_nid2peerni_locked(), which locks the mutex. However, this mutex is already held by the process trying to delete the net, leading to a deadlock.
The code should first clean up the recovery queue, i.e. remove all instances of this NI from the recovery queue before deleting the NI.
Similar processing is done in LNetNIFini().
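The proposed ordering (purge the recovery queue, then delete the NI) can be illustrated with a minimal user-space sketch. This is not LNet code: every type and function name below (sketch_ni, sketch_recovery_queue, sketch_purge_ni, sketch_delete_ni) is hypothetical, and the sketch models only the order of operations, not the mutex or any other locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RECOVERY 16

/* Hypothetical stand-in for a local network interface (NI). */
struct sketch_ni {
    int  id;
    bool deleted;
};

/* Hypothetical stand-in for the recovery queue of local NIs. */
struct sketch_recovery_queue {
    struct sketch_ni *entries[MAX_RECOVERY];
    size_t            count;
};

/* Queue an NI for recovery; returns false if the queue is full. */
static bool sketch_queue_for_recovery(struct sketch_recovery_queue *q,
                                      struct sketch_ni *ni)
{
    if (q->count >= MAX_RECOVERY)
        return false;
    q->entries[q->count++] = ni;
    return true;
}

/* Remove every instance of ni from the queue, keeping the order of the
 * remaining entries. Returns the number of entries removed. */
static size_t sketch_purge_ni(struct sketch_recovery_queue *q,
                              const struct sketch_ni *ni)
{
    size_t kept = 0, removed = 0;

    for (size_t i = 0; i < q->count; i++) {
        if (q->entries[i] == ni)
            removed++;
        else
            q->entries[kept++] = q->entries[i];
    }
    q->count = kept;
    return removed;
}

/* Returns true if ni still appears anywhere in the queue. */
static bool sketch_ni_in_recovery(const struct sketch_recovery_queue *q,
                                  const struct sketch_ni *ni)
{
    for (size_t i = 0; i < q->count; i++)
        if (q->entries[i] == ni)
            return true;
    return false;
}

/* Delete an NI: clean the recovery queue first, then mark the NI deleted,
 * so that nothing can pick it up from the queue afterwards. */
static void sketch_delete_ni(struct sketch_recovery_queue *q,
                             struct sketch_ni *ni)
{
    sketch_purge_ni(q, ni);
    assert(!sketch_ni_in_recovery(q, ni));
    ni->deleted = true;
}
```

In the real code the purge would have to happen under whatever locking the recovery queue requires; the sketch only shows that once the NI is marked deleted, the queue no longer references it.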
Issue Links
- is related to LU-12233 Deadlock on LNet shutdown (Resolved)