[LU-14742] socklnd: detect link down and mark ni as unusable
Created: 07/Jun/21  Updated: 20/Jul/21  Resolved: 21/Jun/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Improvement
Priority: Minor
Reporter: Serguei Smirnov
Assignee: Serguei Smirnov
Resolution: Fixed
Votes: 0
Labels: ksocklnd, lnet, lnet-health

Description

To help avoid selecting, for sending, an LNet NI that corresponds to a downed Ethernet link, add a mechanism for detecting link events in socklnd. On a link down event, find the corresponding NI and set its ni_fatal_error_on flag; on a link up event, clear the flag. This is similar to how o2iblnd does it.
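
The flag handling described above can be sketched in user-space C. This is an illustration only, not the code from the patch: struct sketch_ni, enum sketch_link_event, sketch_handle_link_event() and sketch_select_ni() are hypothetical names invented here, the fatal_error_on field merely models ni_fatal_error_on, and how the link events are obtained from the kernel is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an LNet NI; only the fields this sketch needs. */
struct sketch_ni {
	const char *ifname;   /* name of the backing Ethernet interface */
	bool fatal_error_on;  /* models the ni_fatal_error_on flag */
};

/* Hypothetical link events; the real event source is not modelled here. */
enum sketch_link_event { SKETCH_LINK_DOWN, SKETCH_LINK_UP };

/*
 * Find every NI bound to ifname and update its flag:
 * link down sets the flag, link up clears it.
 * Returns the number of NIs that matched.
 */
static int sketch_handle_link_event(struct sketch_ni *nis, size_t count,
				    const char *ifname,
				    enum sketch_link_event ev)
{
	int updated = 0;

	for (size_t i = 0; i < count; i++) {
		if (strcmp(nis[i].ifname, ifname) != 0)
			continue;
		nis[i].fatal_error_on = (ev == SKETCH_LINK_DOWN);
		updated++;
	}
	return updated;
}

/*
 * Simplified selection: return the first NI whose flag is clear,
 * or NULL if every NI is marked unusable.
 */
static struct sketch_ni *sketch_select_ni(struct sketch_ni *nis, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (!nis[i].fatal_error_on)
			return &nis[i];
	}
	return NULL;
}
```

With two NIs on eth0 and eth1, a down event for eth0 makes sketch_select_ni() skip the first NI and return the one on eth1; a later up event for eth0 makes the first NI eligible again.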



Comments
Comment by Gerrit Updater [ 08/Jun/21 ]

Serguei Smirnov (ssmirnov@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43952
Subject: LU-14742 socklnd: detect link state to set fatal error on ni
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5e443444449c98b58f2199f6f55589126cc3eb37

Comment by Gerrit Updater [ 21/Jun/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43952/
Subject: LU-14742 socklnd: detect link state to set fatal error on ni
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: fc2df80e96dc5db9f3fb710893ccf6f442664471

Comment by Peter Jones [ 21/Jun/21 ]

Landed for 2.15
