  1. Lustre
  2. LU-17072

LNet dependency on RoCE v1

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0
    • None
    • Rocky Linux 9.2 – 5.14.0-284.25.1.el9_2.x86_64 - Broadcom BCM57414
    • 4

    Description

      During testing with master (2.15.57_130_g40c4041 / 40c404129b8ee51af5da7ec422672cc1eba74cbe) on EL9.2 with a RoCE network, we noticed that LNet appears to depend on RoCE v1 being enabled. If only RoCE v2 is enabled and v1 is not, the IB layer seems to work well (ib_write_bw, ibv_rc_pingpong, etc.), but LNet does not work. Attached is a debug trace of an lctl ping from the node to its own NID (using @o2ib), which does not succeed.

      In our case, enabling RoCE v1 on the hardware (that is, setting disable_roce_v1 to 0) fixes the issue with LNet:

      # bnxtnvm -dev=$ROCEIF setoption=disable_roce_v1#0
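
For anyone trying to reproduce this, it can help to confirm which RoCE versions a port actually exposes before and after changing the NVM option. The following is only a minimal sketch: it assumes the standard Linux sysfs layout under /sys/class/infiniband, where each populated GID table entry reports its type as "IB/RoCE v1" or "RoCE v2". The function name list_gid_types is made up for illustration and is not part of Lustre or the Broadcom tools.

```shell
#!/bin/sh
# Print the GID type of every populated GID table entry, per device and port.
# The optional first argument overrides the sysfs root (an assumption made
# here so the function can be pointed at a copy of the tree).
list_gid_types() {
    root="${1:-/sys/class/infiniband}"
    for f in "$root"/*/ports/*/gid_attrs/types/*; do
        [ -f "$f" ] || continue
        # Unpopulated GID entries fail to read; skip them quietly.
        gid_type=$(cat "$f" 2>/dev/null) || continue
        [ -n "$gid_type" ] || continue
        rel=${f#"$root"/}            # <dev>/ports/<port>/gid_attrs/types/<index>
        dev=${rel%%/*}
        port=${rel#*/ports/}
        port=${port%%/*}
        printf '%s port %s gid %s: %s\n' "$dev" "$port" "${f##*/}" "$gid_type"
    done
}
```

Calling list_gid_types with no argument reads the live sysfs tree. If no line mentions "IB/RoCE v1", the port is presumably in the v2-only configuration described above, which is the state in which the lctl ping failed for us.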
      

          People

            wc-triage WC Triage
            sthiell Stephane Thiell
