[LU-15250] RPC Replay Signature Created: 18/Nov/21  Updated: 19/Nov/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-3290 disallow ptlrpc RPCs with old client ... Open
is related to LU-5703 Quiesce client mountpoints from the s... Open
Bugzilla ID: 18,657
Rank (Obsolete): 9223372036854775807

 Description   

From https://bugzilla.lustre.org/show_bug.cgi?id=18657

In order to prevent clients from incorrectly replaying saved RPC operations after a server failure, it would be desirable to generate a signature of the RPC request on the MDS before replying to the client. The client would store the server-generated signature in the saved RPC request that is resent to the server during recovery, and the server could verify the signature after restart to ensure that the RPC is still valid for replay (correct XID for ordering, etc.).
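
As a rough illustration of the sign-then-verify flow described above, here is a minimal Python sketch. It assumes an HMAC-SHA256 tag over the XID and an opaque request body; the function names, the choice of HMAC, and the field packing are hypothetical and are not taken from the Lustre code base.

```python
import hashlib
import hmac
import struct


def sign_request(server_key: bytes, xid: int, body: bytes) -> bytes:
    """Server side: tag the request before replying to the client.

    Assumption: the XID is packed as a little-endian 64-bit integer and
    prepended to the body so that the tag also pins the replay ordering.
    """
    header = struct.pack("<Q", xid)
    return hmac.new(server_key, header + body, hashlib.sha256).digest()


def verify_replay(server_key: bytes, xid: int, body: bytes,
                  signature: bytes) -> bool:
    """Server side, after restart: accept the replay only if the tag matches."""
    expected = sign_request(server_key, xid, body)
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(expected, signature)
```

The client would treat the signature as an opaque blob: it stores it alongside the saved request and returns it unchanged on replay.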

Since the client will typically update the RPC request after receiving the server reply (to insert the transno, file layout, etc.), the MDS needs to "preformat" the RPC request in the same way that the client will later send it for replay before generating the signature.
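
The "preformat" step can be sketched as follows. This is only an illustration under stated assumptions: the request is modelled as a Python dict, the field names (`transno`, `layout`) are hypothetical, and canonical JSON stands in for the real wire format. The point is that the server signs the request as it will look after the client has applied the reply, not as it first arrived.

```python
import hashlib
import hmac
import json


def preformat(request: dict, transno: int, layout: str) -> dict:
    """Return a copy of the request as the client would resend it for replay."""
    replay = dict(request)
    replay["transno"] = transno
    replay["layout"] = layout
    return replay


def canonical_bytes(request: dict) -> bytes:
    """Serialize deterministically so both sides hash identical bytes."""
    return json.dumps(request, sort_keys=True, separators=(",", ":")).encode()


def sign(key: bytes, request: dict) -> bytes:
    """HMAC-SHA256 over the canonical form of the (preformatted) request."""
    return hmac.new(key, canonical_bytes(request), hashlib.sha256).digest()


# Server: sign the preformatted request before replying.
key = b"\x01" * 32
original = {"xid": 100, "opcode": "open", "name": "file"}
signature = sign(key, preformat(original, transno=7, layout="raid0"))

# Client: applies the same updates after the reply; the replayed form verifies.
replayed = preformat(original, transno=7, layout="raid0")
is_valid = hmac.compare_digest(sign(key, replayed), signature)
```

Signing the request as originally received would not work here, because the bytes the client later replays differ from the bytes the server first saw.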

Since only the server needs to generate and verify the RPC signature, a public-key signature is not required. It may be enough to periodically generate a local key that is persistently stored on the target. The prior key should be kept for a few minutes (subject to XID aging limitations, LU-3290) to ensure that a server crash shortly after key generation does not prevent slightly older RPCs from being replayed.
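
A minimal sketch of the key rotation with a grace period for the prior key might look like the following. The `KeyRing` class, its method names, and the in-memory storage are hypothetical; a real target would need to store both keys persistently, which is omitted here.

```python
import hashlib
import hmac
import os


class KeyRing:
    """Holds the current key plus the prior key for a limited grace period."""

    def __init__(self, grace_seconds: float):
        self.grace_seconds = grace_seconds
        self.current = os.urandom(32)
        self.previous = None
        self.rotated_at = None

    def rotate(self, now: float) -> None:
        """Generate a new key and keep the old one for the grace period."""
        self.previous = self.current
        self.current = os.urandom(32)
        self.rotated_at = now

    def sign(self, message: bytes) -> bytes:
        """New signatures always use the current key."""
        return hmac.new(self.current, message, hashlib.sha256).digest()

    def verify(self, message: bytes, signature: bytes, now: float) -> bool:
        """Accept the current key, or the prior key if still within grace."""
        keys = [self.current]
        if (self.previous is not None
                and now - self.rotated_at <= self.grace_seconds):
            keys.append(self.previous)
        return any(
            hmac.compare_digest(
                hmac.new(k, message, hashlib.sha256).digest(), signature)
            for k in keys
        )
```

With a grace period of 300 seconds, a signature made just before a rotation still verifies 100 seconds after it, but is rejected once the prior key has aged out.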

This would also require the changes from Simplified Interoperability (LU-5703 has an old presentation on this) to allow clients to reopen files on the MDS by FID instead of by RPC replay. These long-lived open RPCs would then not need to be saved explicitly on the client for every open file handle, which reduces memory usage, and recovery would not depend on the exact RPC format for replay, which may not work with older servers.



 Comments   
Comment by Patrick Farrell [ 18/Nov/21 ]

Sorry, I misunderstood the use case slightly.  Disregard my previous.

Comment by Patrick Farrell [ 18/Nov/21 ]

What information will this signature capture that is not already in the RPC itself when replayed?  How will it aid in making this choice?  (FYI, the bugzilla link doesn't work for me)

Comment by Andreas Dilger [ 19/Nov/21 ]

The bugzilla link no longer works, since bugzilla.lustre.org was taken offline a while ago, so I created this ticket to capture the information that was stored there and referenced by the Lustre Projects page.

Generated at Sat Feb 10 03:16:43 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.