Lustre / LU-9072

Upstream lnet-selftest causing node to crash on load

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Upstream
    • Upstream
    • None
    • 3

    Description

      The upstream version of lnet-selftest crashes the node when the module is loaded. A kernel developer changed the definition of kiov in both LNet and lnet-selftest, and the crash may be related to that change. In one run, I saw this log message just before the crash:

      LNet: 16216:0:(framework.c:1712:sfw_startup()) Failed to reserve enough buffers: service debug, 256 needed: -30720

      This may be a hint that we are running out of memory, which would destabilize the kernel. The new kiov handling may be exhausting memory, especially when buffers are allocated per CPT.
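
To make the per-CPT concern concrete, here is a rough back-of-the-envelope sketch, not code from lnet-selftest. The helper name reserve_bytes is hypothetical, and the pages-per-buffer and page-size values are assumptions; only the count of 256 buffers comes from the log line above. It just shows how a per-CPT reservation scales linearly with the number of CPTs.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of LNet): estimate the bytes pinned
 * when every CPT reserves its own set of service buffers.
 *
 *   ncpts         - number of CPU partitions (assumed value)
 *   bufs_per_cpt  - buffers reserved per CPT (256 in the log line)
 *   pages_per_buf - pages backing one buffer (assumed value)
 *   page_size     - page size in bytes (assumed value, e.g. 4096)
 */
static uint64_t reserve_bytes(uint64_t ncpts, uint64_t bufs_per_cpt,
                              uint64_t pages_per_buf, uint64_t page_size)
{
    /* Plain product: each factor multiplies the total reservation. */
    return ncpts * bufs_per_cpt * pages_per_buf * page_size;
}
```

Under these assumptions, reserve_bytes(1, 256, 1, 4096) gives 1 MiB for a single CPT, and the same service on 16 CPTs needs 16 MiB. If a buffer spans more pages than assumed here, the total grows by the same factor.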

          People

            doug Doug Oucharek (Inactive)
            doug Doug Oucharek (Inactive)