[LU-16452] kfilnd: Transaction deadline should be checked before every RDMA post operation. Created: 06/Jan/23  Updated: 11/Apr/23  Resolved: 19/Jan/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Improvement Priority: Minor
Reporter: Chris Horn Assignee: Chris Horn
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 9223372036854775807

 Description   

Today, kfilnd does not check the transaction deadline before posting an RDMA operation. If kfabric returns -EAGAIN for long periods of time, for whatever reason, the impacted kfilnd transactions are simply queued for replay. Because the transaction deadlines are not checked on replay, these transactions can be posted after their deadline has expired.

kfilnd should check the transaction deadline before posting an operation to kfabric, so that operations that have already timed out are not posted.
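
A minimal userspace sketch of the requested check, to make the intended ordering concrete. It is not the kfilnd code or the kfabric API: struct tn_sketch, post_fn, post_with_deadline(), post_busy() and post_ok() are hypothetical names invented for illustration, and time is passed in as a plain integer instead of being read from a kernel clock. The only behavior taken from the description above is that the deadline is tested before every post attempt, including replays after -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transaction: only the fields this sketch needs. */
struct tn_sketch {
	int64_t deadline;	/* absolute deadline, in seconds */
	int	posts;		/* times the operation actually reached the fabric */
};

/* Hypothetical stand-in for a fabric post call: returns 0 or -EAGAIN. */
typedef int (*post_fn)(struct tn_sketch *tn);

/*
 * Attempt to post one operation, checking the deadline first.
 * Returns 0 on success, -EAGAIN if the caller should queue the
 * transaction for replay, or -ETIMEDOUT if the deadline has already
 * passed (in which case nothing is posted).
 */
static int post_with_deadline(struct tn_sketch *tn, int64_t now, post_fn post)
{
	assert(tn != NULL && post != NULL);

	if (now >= tn->deadline)
		return -ETIMEDOUT;

	return post(tn);
}

/* Fake fabric that is always busy. */
static int post_busy(struct tn_sketch *tn)
{
	(void)tn;
	return -EAGAIN;
}

/* Fake fabric that always accepts the operation. */
static int post_ok(struct tn_sketch *tn)
{
	tn->posts++;
	return 0;
}
```

With a transaction whose deadline is 10, a call at time 5 against post_busy() returns -EAGAIN and the caller would queue it for replay. A replay at time 10 or later returns -ETIMEDOUT without calling the post function at all, which is the case the description says is currently missed.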



 Comments   
Comment by Gerrit Updater [ 10/Jan/23 ]

"Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49593
Subject: LU-16452 kfilnd: Check replay deadline before send
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c1ea995ba607c84c829e01d97dd24abad66b2c5b

Comment by Gerrit Updater [ 19/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49593/
Subject: LU-16452 kfilnd: Check replay deadline before send
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3049ba6ba1241770adeeeffbdfb6fef82bbf0b92

Comment by Peter Jones [ 19/Jan/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 11/Apr/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49681/
Subject: LU-16452 tests: skip interop recovery-small/144a
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 3b28c0d2727e98425fe45a5add4527cc72a39432
