[LU-12532] LNet Health: Resending optimized GET broken Created: 10/Jul/19  Updated: 22/Jul/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

An optimized GET allows data to be RDMAed directly into the buffer with no need for an explicit REPLY message. To achieve this, the LND allocates a REPLY message when it first processes the GET send request. It then calls LNet finalize on both the original GET and the "fake" REPLY message once the RDMA operation is complete.

This, however, presents a problem for resends. When an MD that expects a response is allocated, its threshold is set to 2. The threshold is decremented once when the MD is attached to the GET message, and again when the MD is attached to the REPLY message.

When resending, the MD threshold is already 0, so the LND's attempt to allocate the REPLY message fails.
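
The threshold accounting described above can be illustrated with a small toy model. This is only a sketch and not LNet code: the names toy_md, toy_md_init, toy_msg_attach_get and toy_lnd_send_optimized_get are hypothetical, and the model assumes that a resend reuses the GET message that is already attached to the MD, so only the REPLY attach is attempted again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an LNet memory descriptor (MD). */
struct toy_md {
    int threshold; /* how many more messages may attach to this MD */
};

/* An MD that expects a response starts with threshold 2 (GET + REPLY). */
static void toy_md_init(struct toy_md *md, bool expects_reply)
{
    assert(md != NULL);
    md->threshold = expects_reply ? 2 : 1;
}

/* Attach one message to the MD; refuse if the threshold is exhausted. */
static bool toy_md_attach(struct toy_md *md)
{
    assert(md != NULL);
    if (md->threshold <= 0)
        return false;
    md->threshold--;
    return true;
}

/* Done once, when the GET message is created and bound to the MD. */
static bool toy_msg_attach_get(struct toy_md *md)
{
    return toy_md_attach(md);
}

/*
 * Done on every send attempt in this model: the LND allocates the
 * "fake" REPLY message, which must attach to the same MD.
 * Assumption: a resend calls this again without re-initializing the MD.
 */
static bool toy_lnd_send_optimized_get(struct toy_md *md)
{
    return toy_md_attach(md);
}
```

In this model, after toy_md_init(&md, true) and toy_msg_attach_get(&md), the first call to toy_lnd_send_optimized_get(&md) succeeds and leaves the threshold at 0. A second call, standing in for the resend, returns false, which mirrors the failure reported here.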

We need to redesign the optimized GET processing to work with resends.



 Comments   
Comment by Gerrit Updater [ 22/Jul/19 ]

(Typo in LU number)

Generated at Sat Feb 10 02:53:25 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.