Lustre / LU-4882

Convert MDS restoring RPC message from D_RPCTRACE to D_WARNING

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.4.0, Lustre 2.5.0, Lustre 2.6.0
    • Fix Version/s: None
    • Labels:
      None
    • Severity:
      3
    • Rank (Obsolete):
      13504

      Description

      Cray recently had an issue where an unusual network problem caused a large number of RPCs from clients to the MDS to be delivered twice. As a result, a very large number of RPCs were restored, which, with a particular job, eventually led to a bug that appears similar to LU-2827.

      While investigating, we did not notice the severe network problem at first, because seeing it required turning on RPC tracing (which creates a huge message volume) and walking through the logs.

      Because restoring an RPC indicates that something has gone wrong, even when it is handled correctly, I suggest changing this message in mdt_req_from_lcd from D_RPCTRACE to D_WARNING so that the sort of issue we saw is more obvious.

              DEBUG_REQ(D_RPCTRACE, req, "restoring transno "LPD64"/status %d",
                        req->rq_transno, req->rq_status);
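
      To illustrate why the debug level matters here, the following is a minimal, standalone sketch of mask-based message filtering. It is not Lustre code: the TOY_* names, the bit values, and the default mask are assumptions made for illustration, and the real DEBUG_REQ macro does considerably more. The sketch only shows that a message tagged with a tracing-style level is dropped unless tracing is explicitly enabled, while a message tagged with a warning-style level is emitted under a default mask that includes warnings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-ins for debug mask bits. Names and values are illustrative
 * assumptions, not taken from the Lustre headers. */
#define TOY_D_WARNING  0x00000400u
#define TOY_D_ERROR    0x00020000u
#define TOY_D_RPCTRACE 0x00100000u

/* Hypothetical default mask: warnings and errors on, RPC tracing off. */
#define TOY_DEFAULT_MASK (TOY_D_WARNING | TOY_D_ERROR)

/* Returns 1 if a message tagged with `level` passes the active mask. */
static int toy_should_log(uint32_t active_mask, uint32_t level)
{
    return (active_mask & level) != 0;
}

/* Prints the "restoring" message only if its level is enabled.
 * Returns 1 if the message was emitted, 0 if it was filtered out. */
static int toy_debug_req(uint32_t active_mask, uint32_t level,
                         long long transno, int status)
{
    if (!toy_should_log(active_mask, level))
        return 0;
    printf("restoring transno %lld/status %d\n", transno, status);
    return 1;
}

#ifdef TOY_DEMO_MAIN
int main(void)
{
    /* Tracing level under the default mask: silently dropped. */
    int traced = toy_debug_req(TOY_DEFAULT_MASK, TOY_D_RPCTRACE, 42, 0);
    /* Warning level under the same mask: emitted. */
    int warned = toy_debug_req(TOY_DEFAULT_MASK, TOY_D_WARNING, 42, 0);

    assert(traced == 0);
    assert(warned == 1);
    return 0;
}
#endif
```

      Compiling with -DTOY_DEMO_MAIN builds the small demo. Under the assumed default mask, only the warning-level call prints, which is the visibility change this ticket asks for.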
      

      Patch will be available in Gerrit shortly.

            People

            • Assignee: WC Triage (wc-triage)
            • Reporter: Patrick Farrell (paf, Inactive)