[LU-12444] Remove ambiguous request flag of no_resend Created: 17/Jun/19 Updated: 17/Jun/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor |
| Reporter: | Li Xi | Assignee: | Li Xi |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
The two flags rq_no_delay and rq_no_resend might not both be necessary. We don't have precise definitions or usage conditions that distinguish them. I think rq_no_delay means the request should quit and return whenever there is any possibility of being blocked, whether that is caused by reconnection or by other conditions. And rq_no_resend should always be set when rq_no_delay is set, which is true in a lot of places but not all. It is not clear why rq_no_resend is necessary. Even if there is a case in which rq_no_delay is not suitable, a more precise flag or a better mechanism should be used. |
| Comments |
| Comment by Gerrit Updater [ 17/Jun/19 ] |
|
Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/35244 |
| Comment by Patrick Farrell (Inactive) [ 17/Jun/19 ] |
|
Li Xi, Thank you very much for diving in and trying this... |
| Comment by Andreas Dilger [ 17/Jun/19 ] |
|
IMHO, no_resend has a clear meaning - the RPC may be queued on the client, but it gets one chance to be sent and if it times out there is no reason to resend it. I don't think this is the same as "never" blocking an RPC for no_delay. What constitutes "blocking"? Local memory allocation, queue delay in the network, other? |
| Comment by Li Xi [ 17/Jun/19 ] |
|
OK. Then "blocking" needs a more detailed definition. But I think the use case of no_delay seems clear: quit whenever the request hits a problem/failure when trying to proceed, or sees a high possibility of a problem/failure if it proceeds. So, on a memory allocation failure, yes, a no_delay request would quit. And if the request handler foresees a high possibility of slow memory allocation or of memory allocation failure, a no_delay request would quit too.
This sounds like part of no_delay's functionality. I am wondering whether there is any possibility to re-use no_delay for (most of) the cases when no_resend is used. I know there might be some cases where rq_no_delay is not suitable or not enough, and my intention here is to check what these cases are. "req->rq_no_resend = req->rq_no_delay = 1" is written in a lot of places, so I feel this might simplify the logic in general. At the very least, even if cleaning up rq_no_resend turns out to be too complex to be practical, we still need a patch that adds comments explicitly explaining the difference between these two flags to avoid future confusion. |
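To make the distinction discussed in this thread concrete, here is a minimal toy model of the two flags as they are described in the comments above. It is a sketch under stated assumptions, not Lustre's ptlrpc code: only the names rq_no_delay and rq_no_resend come from the ticket, while struct toy_request, enum toy_action, toy_before_send() and toy_after_timeout() are hypothetical names invented for illustration. It assumes rq_no_delay only matters when the request would otherwise have to wait (here, for an import that is not ready), and rq_no_resend only matters after a send has timed out.

```c
#include <stdbool.h>

/* Toy request: only the two flag names are taken from the ticket. */
struct toy_request {
	unsigned int rq_no_delay:1;  /* assumed: fail rather than wait */
	unsigned int rq_no_resend:1; /* assumed: one send attempt only */
};

/* Hypothetical outcomes for a decision point. */
enum toy_action {
	TOY_SEND,  /* put the request on the wire (or back on it) */
	TOY_QUEUE, /* wait on the client until sending is possible */
	TOY_FAIL   /* give up and return an error to the caller */
};

/*
 * Decision before the first send. If the import is not ready, a
 * no_delay request fails immediately; any other request is queued,
 * including one that only has no_resend set.
 */
enum toy_action toy_before_send(const struct toy_request *req,
				bool import_ready)
{
	if (import_ready)
		return TOY_SEND;
	return req->rq_no_delay ? TOY_FAIL : TOY_QUEUE;
}

/*
 * Decision after a send has timed out. A no_resend request had its
 * one chance and fails; any other request is sent again.
 */
enum toy_action toy_after_timeout(const struct toy_request *req)
{
	return req->rq_no_resend ? TOY_FAIL : TOY_SEND;
}
```

In this model a request with only rq_no_resend set may still be queued before its single send attempt, which matches the reading that no_resend does not mean "never block". A request with only rq_no_delay set would be resent after a timeout, which is the combination the description argues should not occur. The proposal in this ticket would roughly correspond to making toy_after_timeout() also fail when rq_no_delay is set.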