[LU-12440] Correct lnet_is_health_check when msg status is 0 Created: 15/Jun/19 Updated: 30/Jul/19 Resolved: 30/Jul/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Chris Horn | Assignee: | Chris Horn |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
I don't think this leads to misbehavior, so this ticket is not filed as a "Bug". However, when we send to the lolnd, neither msg_txpeer nor msg_rxpeer is set. As a result, every send to the lolnd produces a misleading error message in the dklog from lnet_is_health_check():

00000400:00000200:18.0:1560465004.906149:0:28738:0:(lib-move.c:4730:LNetPut()) LNetPut msg ffff881f44e36a00 -> 12345-0@lo
00000400:00000200:18.0:1560465004.923925:0:28738:0:(lib-msg.c:859:lnet_is_health_check()) msg ffff881f44e36a00 failed too early to retry and send

The message comes from this code:

	if ((msg->msg_tx_committed && !msg->msg_txpeer) ||
	    (msg->msg_rx_committed && !msg->msg_rxpeer)) {
		CDEBUG(D_NET, "msg %p failed too early to retry and send\n",
		       msg);
		return false;
	}

We should only print this message if the message actually failed, i.e. when msg->msg_ev.status != 0. |
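The proposed behavior can be sketched in isolation as follows. This is only an illustration and not the patch that landed: struct mock_msg and struct mock_event are hypothetical, stripped-down stand-ins for the LNet message structures, carrying only the fields named in this ticket, the mock_* function names are invented for the example, and fprintf() stands in for CDEBUG(). The sketch assumes the return value stays false when a peer is missing and that only the log line is gated on the status.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins: only the fields mentioned in the ticket. */
struct mock_event {
	int status;			/* 0 means the message did not fail */
};

struct mock_msg {
	bool msg_tx_committed;
	bool msg_rx_committed;
	void *msg_txpeer;		/* NULL when sending to the lolnd */
	void *msg_rxpeer;
	struct mock_event msg_ev;
};

/* True when a committed direction has no peer attached. */
static bool mock_missing_peer(const struct mock_msg *msg)
{
	return (msg->msg_tx_committed && !msg->msg_txpeer) ||
	       (msg->msg_rx_committed && !msg->msg_rxpeer);
}

/* Log only when the peer is missing AND the message actually failed. */
static bool mock_should_log_early_failure(const struct mock_msg *msg)
{
	return mock_missing_peer(msg) && msg->msg_ev.status != 0;
}

/*
 * Simplified health-check gate: still returns false when a peer is
 * missing, but stays quiet for successful messages. Any further checks
 * a real health check would make are omitted here.
 */
static bool mock_is_health_check(const struct mock_msg *msg)
{
	if (mock_missing_peer(msg)) {
		if (mock_should_log_early_failure(msg))
			fprintf(stderr,
				"msg %p failed too early to retry and send\n",
				(const void *)msg);
		return false;
	}
	return true;
}
```

With this shape, a successful send to the lolnd (tx committed, no peers, status 0) still skips the health check but no longer emits the "failed too early" line, while a message with a non-zero status and no peer is logged as before.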
| Comments |
| Comment by Gerrit Updater [ 15/Jun/19 ] |
|
Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35235 |
| Comment by Gerrit Updater [ 30/Jul/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35235/ |
| Comment by Peter Jones [ 30/Jul/19 ] |
|
Landed for 2.13 |