[LU-15068] Race between commit callback and reply_out_callback::LNET_EVENT_SEND - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.15.0
Affects Version/s: None
Labels: None
Severity: 3

Description

When LNet is under load, messages can be queued while they wait for a peer TX credit or a network TX credit. While running benchmarks on a large-scale system, we observed clients hitting "slow reply" timeouts for MDS_REINT RPCs. Tracing revealed that the server received the MDS_REINT RPC and sent a reply to the client, but the reply was queued in LNet because no peer credits were available.
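The credit behaviour described above can be sketched with a small toy model. This is only an illustration under simplifying assumptions: `struct toy_peer`, `toy_send()` and `toy_tx_done()` are hypothetical names invented for this example and are not LNet data structures or APIs. The point is that a send started with no credit available is parked, and only proceeds when an earlier send completes and returns its credit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not LNet's real structures. */
struct toy_peer {
	int tx_credits;	/* sends that may start right now */
	int queued;	/* messages parked while waiting for a credit */
	int sent;	/* messages handed to the wire */
};

/*
 * Try to send one message. Consume a credit if one is available,
 * otherwise park the message. Returns true if it was sent immediately.
 */
bool toy_send(struct toy_peer *p)
{
	assert(p != NULL);
	if (p->tx_credits > 0) {
		p->tx_credits--;
		p->sent++;
		return true;
	}
	p->queued++;
	return false;
}

/*
 * An in-flight send completed and gives its credit back. If a message
 * is parked, the credit is reused for it at once; otherwise the credit
 * returns to the pool.
 */
void toy_tx_done(struct toy_peer *p)
{
	assert(p != NULL);
	if (p->queued > 0) {
		p->queued--;
		p->sent++;
	} else {
		p->tx_credits++;
	}
}
```

With a single credit, the first `toy_send()` goes out and the second one is parked until `toy_tx_done()` is called. In this picture, the reply to the MDS_REINT RPC is the parked message.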

Shortly afterwards, the commit callback fired and queued the reply state for handling via ptlrpc_commit_replies() -> rs_batch_add():

void ptlrpc_commit_replies(struct obd_export *exp)
{
...
                if (rs->rs_transno <= exp->exp_last_committed) {
                        list_del_init(&rs->rs_obd_list);
                        rs_batch_add(&batch, rs);
                }

ptlrpc_handle_rs() then unlinked the reply state's MD handle:

static int
ptlrpc_handle_rs(struct ptlrpc_reply_state *rs)
{
...
        if ((!been_handled && rs->rs_on_net) || nlocks > 0) {
                spin_unlock(&rs->rs_lock);

                if (!been_handled && rs->rs_on_net) {
                        LNetMDUnlink(rs->rs_md_h);

But the reply had never left the server; it was still queued in LNet. Because the MD had been unlinked, LNet aborted the send once a credit became available. The client eventually hit "timeout for slow reply", which caused it to reconnect.
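The ordering problem can be illustrated with a minimal toy model. Everything here is an assumption made for illustration: `struct toy_reply` and the `toy_*` functions are hypothetical and greatly simplified stand-ins, with `md_linked` playing the role of a valid rs_md_h and `on_net` the role of rs_on_net. This is not the Lustre code and does not represent the fix.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_outcome { TOY_PENDING, TOY_DELIVERED, TOY_ABORTED };

/* Toy reply state: hypothetical, loosely mirrors the fields quoted above. */
struct toy_reply {
	bool md_linked;		/* stands in for a still-valid rs_md_h */
	bool on_net;		/* stands in for rs_on_net */
	bool handled;		/* stands in for been_handled */
	bool queued;		/* parked in the toy "LNet" waiting for a credit */
	enum toy_outcome outcome;
};

/* Send the reply: goes out at once if a credit exists, otherwise is parked. */
void toy_send_reply(struct toy_reply *r, int *credits)
{
	r->md_linked = true;
	r->on_net = true;
	r->handled = false;
	if (*credits > 0) {
		(*credits)--;
		r->queued = false;
		r->outcome = TOY_DELIVERED;
	} else {
		r->queued = true;
		r->outcome = TOY_PENDING;
	}
}

/* Commit callback path: mimics the unlink in the ptlrpc_handle_rs() excerpt. */
void toy_commit(struct toy_reply *r)
{
	if (!r->handled && r->on_net)
		r->md_linked = false;	/* analogous to LNetMDUnlink(rs->rs_md_h) */
	r->handled = true;
}

/* A credit frees up: a parked reply is sent only if its MD is still linked. */
void toy_credit_available(struct toy_reply *r)
{
	if (!r->queued)
		return;
	r->queued = false;
	r->outcome = r->md_linked ? TOY_DELIVERED : TOY_ABORTED;
}
```

Calling `toy_send_reply()` with zero credits, then `toy_commit()`, then `toy_credit_available()` ends in `TOY_ABORTED`, which corresponds to the sequence described in this report. If the credit arrives before the commit, the same reply ends in `TOY_DELIVERED`.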

I can readily reproduce the issue on a four-node cluster with 1 MDS, 1 OSS and 2 clients:
1. Run mdtest create
2. Start LST in the background - I run simultaneous read and write tests with concurrency 64, where the MDS is in the "to" group and the OSS and 2 clients are in the "from" group
3. Run mdtest delete

LST causes credit starvation during the mdtest delete phase, so replies are more likely to be queued in LNet, as described above.

People

Assignee: Chris Horn

Reporter: Chris Horn

Votes: 0

Watchers: 10

Dates

Created: 06/Oct/21 4:02 PM

Updated: 12/Mar/22 12:20 AM

Resolved: 30/Nov/21 1:44 PM