Align LNet routing with Multi-Rail and LNet health
(LU-11297)
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Technical task | Priority: | Minor |
| Reporter: | Amir Shehata (Inactive) | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | lnet-health, lnet-router |
| Description |
|
Discovery messages carry a sequence number that is intended to detect stale messages. However, this number can be misleading if the peer reboots, because the peer's sequence number then resets. The node will treat everything the rebooted peer sends as stale, even though the peer's configuration may actually have changed. There is no reliable way to know whether a peer has rebooted, so we will always assume that the messages we receive are valid and process them on a first-come, first-served basis. |
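The behaviour described above can be illustrated with a minimal sketch. This is not the LNet implementation: `PeerState`, `apply_strict`, `apply_in_arrival_order`, `replay`, and the example NIDs are all hypothetical names and values chosen for illustration. It contrasts a sequence-checked policy with the arrival-order policy when a peer reboots and its counter resets.

```python
from dataclasses import dataclass, field


@dataclass
class PeerState:
    """Hypothetical local view of a remote peer (not an LNet structure)."""
    last_seq: int = 0
    nids: list = field(default_factory=list)


def apply_strict(state, seq, nids):
    """Sequence-checked policy: drop anything not newer than last_seq."""
    if seq <= state.last_seq:
        return False  # message is treated as stale and ignored
    state.last_seq = seq
    state.nids = list(nids)
    return True


def apply_in_arrival_order(state, seq, nids):
    """Arrival-order policy: assume every message is valid and apply it."""
    state.last_seq = seq
    state.nids = list(nids)
    return True


def replay(policy, messages):
    """Feed the messages to a fresh PeerState using the given policy."""
    state = PeerState()
    results = [policy(state, seq, nids) for seq, nids in messages]
    return state, results


# The peer advertises one interface at seq 10, reboots (counter resets),
# then advertises a changed configuration at seq 1.
messages = [
    (10, ["10.0.0.1@tcp"]),
    (1, ["10.0.0.1@tcp", "10.0.0.2@tcp"]),
]

strict_state, strict_results = replay(apply_strict, messages)
fcfs_state, fcfs_results = replay(apply_in_arrival_order, messages)
```

In this sketch the sequence-checked policy keeps the pre-reboot configuration, while the arrival-order policy picks up the new one. The trade-off, under the same assumptions, is that a genuinely delayed message could overwrite newer information.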
| Comments |
| Comment by Gerrit Updater [ 05/Oct/18 ] |
|
Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33304 |
| Comment by Gerrit Updater [ 07/Jun/19 ] |
|
Amir Shehata (ashehata@whamcloud.com) merged in patch https://review.whamcloud.com/33304/ |
| Comment by Joseph Gmitter (Inactive) [ 10/Jun/19 ] |
|
Work has landed as part of the MR Routing merge commit: https://review.whamcloud.com/#/c/34983/ |
| Comment by Gerrit Updater [ 03/Sep/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36041 |
| Comment by Gerrit Updater [ 08/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36041/ |