Align LNet routing with Multi-Rail and LNet health (LU-11297)

[LU-12249] LNet Health: list corruption Created: 30/Apr/19  Updated: 04/Oct/19  Resolved: 10/Jun/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0, Lustre 2.12.3

Type: Technical task Priority: Minor
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None


 Description   

On shutdown, the resend queues are cleared and freed, and the monitor thread state is set to shutdown. It is still possible for lnet_finalize() to be called after the queues have been freed. The code checks ln_state to determine whether we are shutting down, but in this case it should check ln_mt_state instead. The monitor thread's state is the one that matters here, because the monitor thread is what allocates and frees the resend queues.
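The race described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions, not the code from the patch. Only the field names ln_state and ln_mt_state come from this ticket. The demo_* types, the enum values, and both functions are hypothetical. The real resend queues are kernel lists protected by locks, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified states; not the real LNet definitions. */
enum demo_state { DEMO_STATE_RUNNING, DEMO_STATE_SHUTDOWN };

struct demo_msg {
	struct demo_msg *next;
};

/* Hypothetical stand-in for a resend queue owned by the monitor thread. */
struct demo_queue {
	struct demo_msg *head;
};

struct demo_lnet {
	enum demo_state ln_state;     /* overall state (name from the ticket) */
	enum demo_state ln_mt_state;  /* monitor thread state (name from the ticket) */
	struct demo_queue *resendq;   /* allocated and freed by the monitor thread */
};

/*
 * Monitor thread shutdown: mark the monitor thread state first, then
 * drain and free the resend queue. ln_state is deliberately left
 * untouched to model the window in which the two states disagree.
 */
static void demo_mt_shutdown(struct demo_lnet *ln)
{
	ln->ln_mt_state = DEMO_STATE_SHUTDOWN;

	while (ln->resendq->head != NULL) {
		struct demo_msg *msg = ln->resendq->head;

		ln->resendq->head = msg->next;
		free(msg);
	}
	free(ln->resendq);
	ln->resendq = NULL;
}

/*
 * Finalize-path helper: queue a message for resend only while the
 * monitor thread still owns a live queue. Checking ln_state here
 * instead would let the caller touch a queue that is already freed.
 * Returns true if the message was queued, false if the caller must
 * complete it without queueing.
 */
static bool demo_try_queue_resend(struct demo_lnet *ln, struct demo_msg *msg)
{
	if (ln->ln_mt_state == DEMO_STATE_SHUTDOWN)
		return false;

	msg->next = ln->resendq->head;
	ln->resendq->head = msg;
	return true;
}
```

In this sketch, a call to demo_try_queue_resend() made after demo_mt_shutdown() returns false even though ln_state still reads as running. That is the case the description says the ln_state check misses.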



 Comments   
Comment by Gerrit Updater [ 07/Jun/19 ]

Amir Shehata (ashehata@whamcloud.com) merged in patch https://review.whamcloud.com/34778/
Subject: LU-12249 lnet: fix list corruption
Project: fs/lustre-release
Branch: multi-rail
Current Patch Set:
Commit: d799ac910cd6c980b40c81b76eaefb65b88904d0

Comment by Joseph Gmitter (Inactive) [ 10/Jun/19 ]

Work has landed as part of the MR Routing merge commit: https://review.whamcloud.com/#/c/34983/

Comment by Gerrit Updater [ 03/Sep/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36035
Subject: LU-12249 lnet: fix list corruption
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 202f602500a63a9e8cf9aab2cd9faf6ed2e56c93

Comment by Gerrit Updater [ 04/Oct/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36035/
Subject: LU-12249 lnet: fix list corruption
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 6a301c5ea2a9f9d971fdc1d723aa4ce85474ce90
