Lustre / LU-17632

o2iblnd: graceful handling of unexpected CM_EVENT_CONNECT_ERROR


    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Lustre 2.16.0

      Field reports from RoCE setups show that RDMA_CM_EVENT_CONNECT_ERROR may be received when the connection is in neither the IBLND_CONN_ACTIVE_CONNECT nor the IBLND_CONN_PASSIVE_WAIT state.

      This causes the assertion in kiblnd_cm_callback() to fail:

       ASSERTION( conn->ibc_state == 1 || conn->ibc_state == 2 )

      It is proposed to handle this more gracefully: report the event as unexpected and allow the flow to continue. If the connection does have real problems, they are expected to surface as transaction errors, and the connection will then be cleaned up without crashing the whole system.
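The proposal can be illustrated with a small userspace model. This is only a sketch of the idea, not the actual o2iblnd patch: every `sketch_` name is hypothetical, the numeric values 1 and 2 are taken from the assertion text above, and the other state values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the connection states. Values 1 and 2 match
 * the assertion text; the remaining values are assumed for this sketch. */
enum sketch_conn_state {
    SKETCH_CONN_INIT = 0,            /* assumed */
    SKETCH_CONN_ACTIVE_CONNECT = 1,  /* stands in for IBLND_CONN_ACTIVE_CONNECT */
    SKETCH_CONN_PASSIVE_WAIT = 2,    /* stands in for IBLND_CONN_PASSIVE_WAIT */
    SKETCH_CONN_ESTABLISHED = 3      /* assumed */
};

/* Hypothetical, heavily reduced connection object. */
struct sketch_conn {
    enum sketch_conn_state state;
    int connreq_failed;  /* set to 1 when the pending connect request is failed */
};

enum sketch_result {
    SKETCH_CONNREQ_FAILED,  /* event arrived in an expected state */
    SKETCH_EVENT_IGNORED    /* event arrived in an unexpected state */
};

/* Model of handling a CONNECT_ERROR event without asserting on the state.
 * In an unexpected state the event is reported and otherwise ignored, so
 * the connection is left for the normal error path to clean up later. */
static enum sketch_result sketch_handle_connect_error(struct sketch_conn *conn)
{
    assert(conn != NULL);

    if (conn->state != SKETCH_CONN_ACTIVE_CONNECT &&
        conn->state != SKETCH_CONN_PASSIVE_WAIT) {
        fprintf(stderr, "unexpected CONNECT_ERROR in state %d, ignoring\n",
                (int)conn->state);
        return SKETCH_EVENT_IGNORED;
    }

    /* Expected case: fail the connection request that is in progress. */
    conn->connreq_failed = 1;
    return SKETCH_CONNREQ_FAILED;
}
```

In this model, calling `sketch_handle_connect_error()` on a connection in the assumed `SKETCH_CONN_ESTABLISHED` state logs a message and leaves the connection untouched, whereas the assertion-based handling would have aborted at that point.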

            ssmirnov Serguei Smirnov