  1. Lustre
  2. LU-19517

lctl ping and lnetctl ping both use a fixed default timeout that is too small


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0
    • 3
    • 9223372036854775807

    Description

      Both "lctl ping" and "lnetctl ping" always send a positive timeout value to the kernel when issuing a ping, and the default value is 1 second. This causes several problems:

      • On our large fabrics (e.g. El Capitan), pings time out long before the fabric itself has given up transmitting the ping message. Unless the user specifies the timeout option, they get a "ping failed" error while the ping is in fact still in progress.
      • It prevents the kernel from using its default timeout, which is based on DEFAULT_PEER_TIMEOUT.
      • It differs from the default that lnetctl(8) documents.
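
      The difference between the reported behavior and one possible fix can be sketched as follows. This is an illustrative model only, not the Lustre source: the function names and the LET_KERNEL_DECIDE sentinel are hypothetical, and it assumes that a non-positive timeout is what allows the kernel to fall back to its own default derived from DEFAULT_PEER_TIMEOUT.

```python
from typing import Optional

# Hypothetical sentinel: assumed to mean "no timeout supplied, kernel picks its default".
LET_KERNEL_DECIDE = 0

# The fixed userspace default described in this report (1 second).
USERSPACE_DEFAULT_S = 1


def current_timeout_arg(user_timeout_s: Optional[int]) -> int:
    """Model of the reported behavior: a positive value is always sent."""
    if user_timeout_s is None:
        # No option given, so the tool substitutes its own 1 second default.
        return USERSPACE_DEFAULT_S
    return user_timeout_s


def proposed_timeout_arg(user_timeout_s: Optional[int]) -> int:
    """Model of one possible fix: only send a timeout the user asked for."""
    if user_timeout_s is None:
        # No option given, so defer to the kernel's default.
        return LET_KERNEL_DECIDE
    if user_timeout_s <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return user_timeout_s
```

      With this model, an invocation without a timeout option sends 1 under the current behavior and the sentinel under the proposed one, while an explicit timeout is passed through unchanged in both cases.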


          People

            ofaaland Olaf Faaland
