LU-4882: Convert MDS restoring RPC message from D_RPCTRACE to D_WARNING

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.4.0, Lustre 2.5.0, Lustre 2.6.0
    • 3
    • 13504

    Description

      Cray recently had an issue where an unusual network problem caused a large number of RPCs from clients to the MDS to be delivered twice. As a result, a very large number of RPCs were restored, which, with a particular job, eventually led to a bug that appears similar to LU-2827.

      While investigating this, we did not notice the severe network problem at first, because seeing it required turning on RPC tracing (which creates a huge message volume) and walking through the logs.

      Since restoring an RPC indicates that something has gone wrong, even when it is handled correctly, I suggest changing this message in mdt_req_from_lcd from D_RPCTRACE to D_WARNING so that the sort of issue we saw is more obvious.

              DEBUG_REQ(D_RPCTRACE, req, "restoring transno "LPD64"/status %d",
                        req->rq_transno, req->rq_status);
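
To illustrate why the debug level matters here, below is a minimal standalone sketch of mask-based filtering. It is not Lustre code: debug_mask, debug_msg and log_restore are hypothetical stand-ins for the libcfs debug mask and DEBUG_REQ, the bit values are assumed to mirror libcfs and should be checked against the headers, and the sketch assumes a node whose mask includes warnings but not RPC tracing.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed bit values modelled on the libcfs debug masks; verify against
 * the real headers before relying on them. */
#define D_WARNING  0x00000400u
#define D_ERROR    0x00020000u
#define D_EMERG    0x00040000u
#define D_RPCTRACE 0x00100000u

/* Hypothetical stand-in for the node's active debug mask. Assumption:
 * warnings and errors are enabled, RPC tracing is not. */
static unsigned int debug_mask = D_WARNING | D_ERROR | D_EMERG;

/* Simplified stand-in for DEBUG_REQ: the message is emitted only when
 * its level bit is set in debug_mask. Returns 1 if emitted, 0 if dropped. */
static int debug_msg(unsigned int level, const char *fmt, ...)
{
	va_list ap;

	if (!(debug_mask & level))
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	return 1;
}

/* Hypothetical model of the restore message in mdt_req_from_lcd, with
 * the debug level passed in so both variants can be compared. */
static int log_restore(unsigned int level, long long transno, int status)
{
	return debug_msg(level, "restoring transno %lld/status %d",
			 transno, status);
}
```

Under these assumptions, log_restore(D_RPCTRACE, ...) is dropped until RPC tracing is switched on, while log_restore(D_WARNING, ...) is emitted right away, which is the visibility the proposed change is after.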
      

      Patch will be available in Gerrit shortly.

          People

            wc-triage WC Triage
            paf Patrick Farrell