Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.16.0
-
3
Description
Both "lctl ping" and "lnetctl ping" always send a positive timeout value to the kernel when issuing a ping; if the user does not specify one, they send a default of 1 second. This causes a few problems:
- On our large fabrics (e.g. El Capitan), pings time out long before the fabric itself has given up transmitting the ping message. Unless the user specifies the timeout option, they get a "ping failed" even though the ping is still in progress.
- It prevents the kernel from using its own default timeout, which is based on DEFAULT_PEER_TIMEOUT.
- It differs from the default documented in lnetctl(8).
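The interaction described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the actual lctl, lnetctl, or LNet code. The sentinel PING_TIMEOUT_UNSET, the placeholder SKETCH_KERNEL_DEFAULT_MS, and all three function names are hypothetical. It also assumes that the kernel falls back to its own default whenever the requested timeout is not positive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel meaning "the user gave no timeout".
 * Not a real LNet constant. */
#define PING_TIMEOUT_UNSET 0

/* Placeholder for whatever default the kernel derives on its own
 * (e.g. from DEFAULT_PEER_TIMEOUT). The value is arbitrary. */
#define SKETCH_KERNEL_DEFAULT_MS 180000

/* Behavior described in this ticket: the tool always sends a positive
 * value, substituting 1000 ms when the user specifies nothing. */
static int tool_timeout_current(bool user_set, int user_ms)
{
    return user_set ? user_ms : 1000;
}

/* Sketched alternative: pass "unset" through so the kernel can decide. */
static int tool_timeout_sketch(bool user_set, int user_ms)
{
    return user_set ? user_ms : PING_TIMEOUT_UNSET;
}

/* Assumed kernel-side rule: use the request only if it is positive,
 * otherwise fall back to the kernel default. */
static int kernel_effective_timeout(int requested_ms)
{
    assert(SKETCH_KERNEL_DEFAULT_MS > 0);
    return requested_ms > 0 ? requested_ms : SKETCH_KERNEL_DEFAULT_MS;
}
```

With the current behavior the kernel-side fallback is never reached, because the request is always positive. In the sketched alternative, the fallback is reached only when the user omits the timeout option, and an explicit user value still takes precedence.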