[LU-3443] performance impact of mdc_rpc_lock serialization Created: 06/Jun/13 Updated: 16/Oct/13 Resolved: 16/Oct/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.1.4 |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Ned Bass | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 8584 | ||||||||
| Description |
|
Serialization of in-flight RPCs on mdc_rpc_lock makes non-modifying operations, such as calling open() on a directory, vulnerable to blocking due to a slow backend MDS filesystem. In particular, users may see long delays running ls when LDLM_ENQUEUE requests get blocked behind long-lived metadata-modifying MDS_REINT requests. For example, in To that end, it would be helpful to clarify what the mdc_rpc_lock is protecting. mdc_enqueue() has this to say: 812 /* It is important to obtain rpc_lock first (if applicable), so that 813 * threads that are serialised with rpc_lock are not polluting our 814 * rpcs in flight counter. We do not do flock request limiting, though*/ 815 if (it) { 816 mdc_get_rpc_lock(obddev->u.cli.cl_rpc_lock, it); 817 rc = mdc_enter_request(&obddev->u.cli); but it's not clear to me what is meant by "polluting", and why the counter can't be protected by a separate lock that need no be held across the entire network request. LLNL-bug-id: TOSS-2084 |
| Comments |
| Comment by Alex Zhuravlev [ 07/Jun/13 ] |
|
this semaphore is needed to implement "execute-once" semantics on MDS. MDS remembers the last result/XID for every client in last_rcvd file, so if the client hasn't got a reply, it can just resend the request and MDS will reconstruct the reply being aware the request has been already executed. this is a known issue we're going to address with multislot last_rcvd at some point. it'll likely request small changes to the protocol so that the client can say which specific slot to use like NFSv4 does (if I remember correctly). |
| Comment by Andreas Dilger [ 07/Jun/13 ] |
|
Ned, in It might be possible to detect this based on the handle used for the open, but that would further increase the complexity of the recovery process, and only fix the non-O_CREAT open() case. The generic solution for allowing multiple modifying RPCs in flight is to allow multiple slots on the last_rcvd file for a single client. The exact mechanism for this has not been designed at this point, and is not on any development roadmap. To be honest, given the glacial speed of the OpenSFS RFP process, it is unclear how any feature development projects will be started in the foreseeable future. |
| Comment by Ned Bass [ 07/Jun/13 ] |
|
Alex, Andreas, thank you for filling in the details here. It sounds like there's nothing to do in the short term for this issue. Though if you are amenable, I'll submit a patch adding a comment block above the definition of struct mdc_rpc_lock to capture some of the useful information here. |
| Comment by Ned Bass [ 07/Jun/13 ] |
|
Since it's just a comment, I'll post it for review here first, to avoid wasting time for hudson/maloo. --- a/lustre/include/lustre_mdc.h +++ b/lustre/include/lustre_mdc.h @@ -69,9 +69,27 @@ struct obd_export; struct ptlrpc_request; struct obd_device; +/** + * Serializes in-flight MDT-modifying RPC requests to preserve idempotency. + * + * This mutex is used to implement execute-once semantics on the MDT. + * The MDT stores the last transaction ID and result for every client in + * its last_rcvd file. If the client doesn't get a reply, it can safely + * resend the request and the MDT will reconstruct the reply being aware + * that the request has already been executed. Without this lock, + * execution status of concurrent in-flight requests would be + * overwritten. + * + * This design limits the extent to which we can keep a full pipeline of + * in-flight requests from a single client. This limitation could be + * overcome by allowing multiple slots per client in the last_rcvd file. + */ struct mdc_rpc_lock { + /** Lock protecting in-flight RPC concurrency. */ struct mutex rpcl_mutex; + /** Intent associated with currently executing request. */ struct lookup_intent *rpcl_it; + /** Used for MDS/RPC load testing purposes. */ int rpcl_fakes; }; |
| Comment by Prakash Surya (Inactive) [ 07/Jun/13 ] |
Should I open a new ticket for this? |
| Comment by Ned Bass [ 07/Jun/13 ] |
|
Patch to add above comment: |
| Comment by Jodi Levi (Inactive) [ 16/Oct/13 ] |
|
Patch landed to Master. Please let me know if more work is needed in this ticket and I will reopen. |