[LU-5570] Better router selection in LNet - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Fixed
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Labels: None

Rank (Obsolete): 15533

Description

LNet chooses routers based on the number of bytes queued on each router. Meanwhile, it normally takes tens of seconds to detect a dead router (we see a failed completion event for an outstanding tx/rx, then close the connection and notify LNet that the peer is dead). This means it is still possible to queue more messages to a potentially dead router if all other live routers have long message queues.

We may need to check the aliveness timestamp as part of router evaluation, and avoid choosing routers that have been inactive for a certain number of seconds as long as there are other active routers. (It takes quite long to mark a router as dead, so we might prefer not to choose it even before it is marked as dead.)
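
A minimal sketch of the selection rule proposed above, not the landed patch. The struct, the field names (`last_alive`, `queued_bytes`, `marked_dead`), and the `inactive_secs` threshold are all hypothetical and only illustrate the idea: a router that has been silent for too long is treated as a last resort, and queued bytes only break ties between routers of the same class.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified router record; not the real LNet peer structure. */
struct router {
    int64_t last_alive;   /* time (seconds) of the last sign of life */
    int64_t queued_bytes; /* bytes currently queued toward this router */
    int     marked_dead;  /* nonzero once the router has been declared dead */
};

/* A router is "inactive" if it has been silent for longer than
 * inactive_secs, even though it has not been marked dead yet. */
static int router_is_inactive(const struct router *r, int64_t now,
                              int64_t inactive_secs)
{
    return now - r->last_alive > inactive_secs;
}

/* Return the index of the preferred router, or -1 if every router is
 * marked dead. Active routers always win over inactive ones; within the
 * same class, the router with fewer queued bytes wins. */
static int select_router(const struct router *rtrs, size_t n, int64_t now,
                         int64_t inactive_secs)
{
    int best = -1;
    int best_inactive = 1;

    assert(rtrs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct router *r = &rtrs[i];
        int inactive;

        if (r->marked_dead)
            continue;

        inactive = router_is_inactive(r, now, inactive_secs);
        if (best < 0 ||
            (best_inactive && !inactive) ||
            (best_inactive == inactive &&
             r->queued_bytes < rtrs[best].queued_bytes)) {
            best = (int)i;
            best_inactive = inactive;
        }
    }
    return best;
}
```

With this rule, an inactive router is still used when no active router exists, so a false "inactive" verdict never blocks traffic completely.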

Issue Links

is related to

LU-7734 LNet Multi-Rail Project

Resolved

Activity

Gerrit Updater added a comment - 09/Jan/15 1:34 AM

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/13302
Subject: Revert "~~LU-5570~~ lnet: check router aliveness timestamp"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2d8b8c9e0149b0fe860983cd2020d9781bd2e548

Jodi Levi (Inactive) added a comment - 08/Jan/15 1:54 PM

Patch landed to Master. If there is more work to be done in this ticket, please reopen the ticket.

Gerrit Updater added a comment - 04/Jan/15 6:33 PM

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/11748/
Subject: ~~LU-5570~~ lnet: check router aliveness timestamp
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 339c7b2b784a528f41c432e9b90285d3445b7536

Alexey Lyashkov added a comment - 11/Sep/14 12:36 PM

It's not true, I have lots of crash dumps with negative credits per destination when a router is dead.

Liang Zhen (Inactive) added a comment - 11/Sep/14 12:34 PM

In current LNet, if any router has positive credits, we will queue the message to it rather than to a router with negative credits.

Alexey Lyashkov added a comment - 11/Sep/14 7:58 AM

I think we shouldn't queue any messages in the case of negative credits. That way we queue only data that can actually be sent, and it is easy to put the other data on different routers.

Liang Zhen (Inactive) added a comment - 11/Sep/14 3:50 AM - edited

I think we should always avoid configuration changes when possible. We already have too many tunables, which is overkill and makes it very hard for users to get them all correct.

I totally agree it's better to fix ~~LU-5485~~ in this patch (sorry I missed your comment there), and RC ping reduction is also a very good idea. I will have a follow-on patch to implement ping reduction, as it may require a few more changes to several different timestamps, so it would be clearer to have a separate patch. Thanks.

Isaac Huang (Inactive) added a comment - 09/Sep/14 4:52 PM

Also, with aliveness for routers, it'd be possible to fix ~~LU-5485~~ as well. Better plan them all together.

Isaac Huang (Inactive) added a comment - 09/Sep/14 4:48 PM

The dead_router_check_interval can be changed at run time, not a big deal really. My point is, the added code complexity seemed to outweigh the benefits of the patch. If you want to go forward, why not also avoid unnecessary pings (e.g. no need to ping if last_alive is very recent)? Then the additional benefit of reduced pings would make it more worthwhile.
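
A minimal sketch of the ping-reduction idea suggested in this comment. The function name and its parameters (`last_ping`, `ping_interval`) are hypothetical and do not correspond to the router checker's real interface. The assumption is that any traffic seen from the router within one ping interval already proves it is alive.

```c
#include <assert.h>
#include <stdint.h>

/* Decide whether the router checker should send a ping now.
 * All times are in seconds. Returns nonzero if a ping is due. */
static int router_needs_ping(int64_t now, int64_t last_alive,
                             int64_t last_ping, int64_t ping_interval)
{
    assert(ping_interval > 0);

    /* Recent sign of life: a ping would add no information. */
    if (now - last_alive < ping_interval)
        return 0;

    /* Otherwise ping at most once per interval. */
    return now - last_ping >= ping_interval;
}
```

For example, with a 30 second interval, a router last heard from 2 seconds ago is skipped, while one silent for 60 seconds is pinged unless a ping already went out within the last 30 seconds.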

Liang Zhen (Inactive) added a comment - 09/Sep/14 4:12 AM

But when we finalise a message and return the credit, we may drop queued messages because the router is down? Hmm... without this patch, this only happens on the router, so you are correct, this is not an issue.
Anyway, I agree that changing the configuration may prevent upper layers from delivering more messages to a potentially dead router, but it may take a while to recover the correct status even if it is a false dead (dead_router_check_interval). Also, users may not notice that they have to change their configuration. If we have aliveness status for routers, users don't have to change anything?

Isaac Huang (Inactive) added a comment - 09/Sep/14 3:46 AM

In case of a false dead router due to low timeout, we don't abandon messages already queued to it, do we? I thought we just stop giving it more messages, and leave those already queued intact.

People

Assignee: Liang Zhen (Inactive)

Reporter: Liang Zhen (Inactive)

Votes: 0

Watchers: 12

Dates

Created: 02/Sep/14 7:13 AM

Updated: 13/Feb/19 7:37 AM

Resolved: 13/Feb/19 7:36 AM