[LU-11472] LNet Health: Decrement health value on response timeout Created: 04/Oct/18  Updated: 02/Nov/18  Resolved: 02/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Major
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: lnet-health

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When a response times out, we want to decrement the health value of the immediate next-hop peer NI so that we don't use that interface if others are available.
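
A minimal sketch of the behavior described above, not the code from the patch. Every name here (`peer_ni`, `MAX_HEALTH`, `HEALTH_SENSITIVITY`, `on_response_timeout`, `pick_healthiest`) is hypothetical, and the constants are illustrative assumptions rather than the actual LNet values. It shows how lowering the health of the next-hop interface after a timeout makes a health-based selection prefer a different interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values; the real LNet constants and tunables may differ. */
#define MAX_HEALTH          1000
#define HEALTH_SENSITIVITY  100

/* Hypothetical stand-in for a next-hop peer network interface. */
struct peer_ni {
	const char *name;
	int health;	/* MAX_HEALTH means fully healthy */
};

/* Lower the health of the next-hop peer NI whose response timed out,
 * clamping the value at zero. */
void on_response_timeout(struct peer_ni *next_hop)
{
	next_hop->health -= HEALTH_SENSITIVITY;
	if (next_hop->health < 0)
		next_hop->health = 0;
}

/* Return the interface with the highest health value.
 * Ties go to the earliest entry in the array. */
struct peer_ni *pick_healthiest(struct peer_ni *nis, size_t n)
{
	assert(n > 0);
	struct peer_ni *best = &nis[0];

	for (size_t i = 1; i < n; i++)
		if (nis[i].health > best->health)
			best = &nis[i];
	return best;
}
```

With two equally healthy interfaces, one timeout on the first is enough for `pick_healthiest` to return the second in this sketch. If only one interface exists, it is still returned, which matches the "if others are available" condition.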



 Comments   
Comment by Amir Shehata (Inactive) [ 04/Oct/18 ]

I have a patch which I'll commit shortly. Although this will work for directly connected peers, the behavior might be an issue for routing. If a router is servicing multiple peers through the same route and one of the final destinations has a problem and doesn't respond, the router interface will be dinged. It'll be put on the recovery queue and will recover, but during that period of time the route will be down.
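
The routing concern can be illustrated with a small sketch. This is an assumption-laden model, not LNet code: `gateway_ni`, `on_routed_response_timeout`, `route_is_up`, and `on_recovery_success` are hypothetical names, the constants are illustrative, and recovery is simplified to a single step that restores full health.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values; the real LNet constants and tunables may differ. */
#define MAX_HEALTH          1000
#define HEALTH_SENSITIVITY  100

/* Hypothetical router interface shared by several final destinations. */
struct gateway_ni {
	int health;
	bool on_recovery_queue;
};

/* A response from any final destination behind the gateway timed out.
 * The penalty lands on the gateway, since it is the immediate next hop. */
void on_routed_response_timeout(struct gateway_ni *gw)
{
	assert(gw != NULL);
	gw->health -= HEALTH_SENSITIVITY;
	if (gw->health < 0)
		gw->health = 0;
	gw->on_recovery_queue = true;
}

/* Assumption: the route counts as down while the gateway is recovering. */
bool route_is_up(const struct gateway_ni *gw)
{
	assert(gw != NULL);
	return !gw->on_recovery_queue;
}

/* Simplification: recovery restores full health in one step. */
void on_recovery_success(struct gateway_ni *gw)
{
	assert(gw != NULL);
	gw->health = MAX_HEALTH;
	gw->on_recovery_queue = false;
}
```

In this model a single unresponsive destination makes `route_is_up` return false for every peer that relies on the same gateway until `on_recovery_success` runs.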

Comment by Gerrit Updater [ 05/Oct/18 ]

Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33295
Subject: LU-11472 lnet: Decrement health on timeout
Project: fs/lustre-release
Branch: multi-rail
Current Patch Set: 1
Commit: 9df50755373be42b64f640e664f1e05690f37531

Comment by Gerrit Updater [ 05/Oct/18 ]

Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33308
Subject: LU-11472 lnet: Decrement health on timeout
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 56b5ef7e5c7a1d0c7aca23504acb2b2bc5862199

Comment by Gerrit Updater [ 02/Nov/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33308/
Subject: LU-11472 lnet: Decrement health on timeout
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 139d69141b73d427490f39d3096b2187e979eaea

Comment by Peter Jones [ 02/Nov/18 ]

Landed for 2.12

Generated at Sat Feb 10 02:44:10 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.