[LU-5652] client eviction if lock enqueue reply is lost Created: 23/Sep/14  Updated: 30/Jan/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Liang Zhen (Inactive) Assignee: Liang Zhen (Inactive)
Resolution: Unresolved Votes: 0
Labels: mq414, patch

Severity: 3
Rank (Obsolete): 15838

 Description   

A client will be evicted when all of the following occur:

  • The LDLM lock is granted while the lock enqueue reply is being sent.
  • Another thread tries to enqueue a conflicting lock, which either sets LDLM_FL_AST_SENT in this reply or sends a blocking AST.
  • The lock enqueue reply is lost.
  • The RPC deadline on the client side is longer than the waiting lock deadline.

If all of these happen, the client will be evicted even with AST resend (LU-5520).
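
The conditions above can be read as a single timing predicate. The C sketch below is only an illustration and is not Lustre code: the struct, its field names, and the function are hypothetical, and it assumes both deadlines are absolute seconds on one idealized shared clock, which a real client and server do not have.

```c
#include <stdbool.h>

/* Hypothetical model of the scenario in this ticket; not taken from Lustre.
 * Times are absolute seconds on one idealized shared clock (an assumption). */
struct enqueue_scenario {
    bool lock_granted;        /* server granted the lock along with the reply */
    bool conflict_pending;    /* LDLM_FL_AST_SENT was set or a blocking AST was sent */
    bool reply_lost;          /* the enqueue reply never reached the client */
    long client_rpc_deadline; /* when the client times out and may resend */
    long lock_wait_deadline;  /* when the server stops waiting on the lock */
};

/* True only when every condition from the list holds at once. */
static bool client_gets_evicted(const struct enqueue_scenario *s)
{
    return s->lock_granted && s->conflict_pending && s->reply_lost &&
           s->client_rpc_deadline > s->lock_wait_deadline;
}
```

In this model, the client is not evicted if any one condition fails, for example if its RPC deadline expires before the waiting lock deadline so that it can resend in time.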

This patch is a workaround: it guarantees that the waiting lock deadline is longer than the server-side RPC deadline, which should be close to the client-side RPC deadline, so the client at least has a chance to resend the RPC.
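
As a rough illustration of the idea, and not the code in the actual patch, the adjustment amounts to raising the waiting lock deadline to at least the server RPC deadline plus some margin. The function name, parameter names, and the margin are all assumptions made for this sketch.

```c
/* Hypothetical helper illustrating the workaround; not the actual patch.
 * All arguments are in seconds. "margin" is an assumed safety allowance that
 * makes the lock deadline strictly longer than the server RPC deadline. */
static long adjust_lock_wait_deadline(long lock_wait_deadline,
                                      long server_rpc_deadline,
                                      long margin)
{
    long min_deadline = server_rpc_deadline + margin;

    /* Never shorten an already sufficient deadline; only extend a short one. */
    return lock_wait_deadline > min_deadline ? lock_wait_deadline
                                             : min_deadline;
}
```

For example, with a server RPC deadline of 100 seconds and a margin of 5 seconds, a 60-second waiting lock deadline would be raised to 105 seconds, while a 200-second deadline would be left unchanged.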

This patch cannot help if multiple messages are lost, for example if the resent RPC is lost again. Also, if there is large network latency, such as router congestion on the path from client to server, the client may still be evicted even with this patch, because the server has no knowledge of the network latency.



 Comments   
Comment by Liang Zhen (Inactive) [ 23/Sep/14 ]

The patch is here: http://review.whamcloud.com/12017
It is not ready for production yet.
