[LU-12288] Preferred flag of route selection policy does not work Created: 13/May/19 Updated: 12/Nov/20 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Tatsushi Takamura | Assignee: | Tatsushi Takamura |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Route selection is determined by several criteria (Preferred, Health Value, Credits, and Seq Number), and Preferred has top priority.
/*
* Look at the peer NIs for the destination peer that connect
* to the chosen net. If a peer_ni is preferred when using the
* best_ni to communicate, we use that one. If there is no
* preferred peer_ni, or there are multiple preferred peer_ni,
* the available transmit credits are used. If the transmit
* credits are equal, we round-robin over the peer_ni.
*/
However, if there are multiple peers and the first peer is preferred, there are cases where the preferred peer (the first one) is not selected, i.e. the preferred flag is ignored.
lnet_select_peer_ni()
    /* pick the healthiest peer ni */
    if (lpni_healthv < best_lpni_healthv) {
        continue;
    } else if (lpni_healthv > best_lpni_healthv) {
        best_lpni_healthv = lpni_healthv; // peer1 (suppose ni_is_pref) is picked here, but the preferred flag is not set
    /* if this is a preferred peer use it */
    } else if (!preferred && ni_is_pref) {
        preferred = true;
    } else if (preferred && !ni_is_pref) {
        continue;
    } else if (lpni->lpni_txcredits < best_lpni_credits) { // peer2 is judged by the other metrics
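The behaviour described above can be reproduced with a small stand-alone model of the quoted branch chain. This is only a sketch, not the LNet code: `struct model_peer_ni` and `model_select()` are hypothetical names, the excerpt is truncated at the credits check, so the `continue` in that branch and the assignment of the best candidate after the chain are assumptions, and the round-robin step on the sequence number is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few peer NI fields the excerpt uses. */
struct model_peer_ni {
    int  healthv;    /* health value, higher is healthier */
    bool is_pref;    /* preferred when sending from the chosen local NI */
    int  txcredits;  /* available transmit credits */
};

/*
 * Model of the quoted selection chain. Returns the index of the
 * selected peer NI, or -1 if the array is empty.
 */
int model_select(const struct model_peer_ni *nis, size_t n)
{
    int  best = -1;
    int  best_healthv = 0;
    int  best_credits = INT_MIN;
    bool preferred = false;

    assert(nis != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (nis[i].healthv < best_healthv) {
            continue;
        } else if (nis[i].healthv > best_healthv) {
            /* First candidate lands here; 'preferred' stays false
             * even when this candidate is preferred. */
            best_healthv = nis[i].healthv;
        } else if (!preferred && nis[i].is_pref) {
            preferred = true;
        } else if (preferred && !nis[i].is_pref) {
            continue;
        } else if (nis[i].txcredits < best_credits) {
            continue; /* assumption: fewer credits loses */
        }
        /* assumption: whatever falls through becomes the new best */
        best = (int)i;
        best_credits = nis[i].txcredits;
    }
    return best;
}
```

In this model, with a preferred first peer and a non-preferred second peer of equal health, the second peer replaces the first whenever it has at least as many credits, because `preferred` was never set when the first peer was taken through the health branch.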
We fixed this so that route selection is evaluated in the following order.
|
| Comments |
| Comment by Amir Shehata (Inactive) [ 16/May/19 ] |
|
The intended design is to always have health take precedence. In this way the healthiest interface is always used. Would there be a scenario where we should use the preferred interface, even though it's not healthy, while another healthier interface can be used? |
| Comment by Tatsushi Takamura [ 30/Aug/19 ] |
|
Amir Shehata, sorry for the late reply. Suppose there are two preferred routes as follows (ni_is_pref is 1 for both, and both have the same healthv):
00000400:00000200:13.0:1539748979.363069:0:14461:0:(lib-move.c:1755:lnet_select_peer_ni()) 192.168.128.202@o2ib[ffff880bf773a400]->192.168.128.201@o2ib[ffff88060a785c00] ni_is_pref = 1, healthv = 1000
00000400:00000200:13.0:1539748979.363072:0:14461:0:(lib-move.c:1755:lnet_select_peer_ni()) 192.168.128.202@o2ib[ffff880bf773a400]->192.168.130.201@o2ib[ffff880c1fa97e00] ni_is_pref = 1, healthv = 1000
/* pick the healthiest peer ni */
if (lpni_healthv < best_lpni_healthv) {
    continue;
} else if (lpni_healthv > best_lpni_healthv) {
    best_lpni_healthv = lpni_healthv; // the first route is selected temporarily, but the preferred flag is not set to true
/* if this is a preferred peer use it */
} else if (!preferred && ni_is_pref) {
    preferred = true; // the preferred flag is set to true and the second route is selected
// So the first route is never selected.
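For illustration, one way to avoid losing the preference of the first candidate is to compare each candidate directly against the current best instead of relying on a sticky `preferred` flag. The sketch below is an assumption about how such an ordering could look and is not the patch uploaded for this ticket: `struct sketch_peer_ni` and `sketch_select()` are hypothetical names, health still takes precedence, preference breaks health ties, credits break preference ties, and the real round-robin on the sequence number is replaced by simply keeping the earlier candidate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few peer NI fields being compared. */
struct sketch_peer_ni {
    int  healthv;    /* health value, higher is healthier */
    bool is_pref;    /* preferred when sending from the chosen local NI */
    int  txcredits;  /* available transmit credits */
};

/*
 * Pick by health first, then preference, then credits.
 * Returns the index of the selected peer NI, or -1 if n is 0.
 */
int sketch_select(const struct sketch_peer_ni *nis, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (best >= 0) {
            const struct sketch_peer_ni *b = &nis[best];

            if (nis[i].healthv != b->healthv) {
                if (nis[i].healthv < b->healthv)
                    continue;   /* less healthy: skip */
            } else if (nis[i].is_pref != b->is_pref) {
                if (!nis[i].is_pref)
                    continue;   /* best is preferred, this one is not */
            } else if (nis[i].txcredits <= b->txcredits) {
                /* Simplification: keep the earlier candidate on a tie;
                 * the real code round-robins on a sequence number. */
                continue;
            }
        }
        best = (int)i;
    }
    return best;
}
```

With the two routes from the log above (both preferred, both with healthv 1000), this ordering lets credits decide between them, and a preferred first route is no longer displaced by a non-preferred route of equal health.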
I'll post a patch soon. Could you review it? |
| Comment by Gerrit Updater [ 30/Aug/19 ] |
|
Tatsushi Takamura (takamr.tatsushi@jp.fujitsu.com) uploaded a new patch: https://review.whamcloud.com/36002 |
| Comment by Gerrit Updater [ 12/Nov/20 ] |
|
Chris Horn (chris.horn@hpe.com) uploaded a new patch: https://review.whamcloud.com/40635 |