  1. Lustre
  2. LU-2187

Why are we losing messages?

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.4.0
    • 3
    • 5230

    Description

      According to the analysis in LU-1717, we are frequently losing Lustre messages on Sequoia's IB network. We have no LNet routers, and IB is a reliable network. We are not seeing any timeouts or LNet errors that would indicate IB transmission problems.

      Why are messages being lost in a way that triggers the LU-1717 error messages? I'm worried that we're papering over a larger problem by silencing those errors.

            People

              doug Doug Oucharek (Inactive)
              morrone Christopher Morrone (Inactive)
