Details
- Bug
- Resolution: Cannot Reproduce
- Minor
- None
- Lustre 2.4.0
- 3
- 5230
Description
According to the analysis of LU-1717, we are frequently losing Lustre messages on Sequoia's IB network. We have no LNet routers, and IB is a reliable network. We are not seeing any timeouts or LNet errors that would suggest IB transmission problems.
Why are messages being lost in a way that can produce the LU-1717 error messages? I'm worried that we're papering over a larger problem by silencing those errors.
Issue Links
- is related to
- LU-1717 mdt_recovery.c:611:mdt_steal_ack_locks()) Resent req xid XXX has mismatched opc: new 101 old 0 (Resolved)