Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.9.0

    Description

      MLX5 does not support FMR. With the removal of PMR (LU-6850) we need a way to support MLX5.

      The following solution is proposed. FMR is enabled by setting map-on-demand to a value in the range 0 < value <= 256.
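
      The range rule above can be written as a one-line check. This is only an illustrative sketch: the function name fmr_enabled and the constant MAX_MAP_ON_DEMAND are hypothetical and are not taken from the LNet source.

```python
# Illustrative only: names are hypothetical, not LNet identifiers.
MAX_MAP_ON_DEMAND = 256  # upper bound quoted in the description


def fmr_enabled(map_on_demand: int) -> bool:
    """Return True when the value lies in the range 0 < value <= 256."""
    return 0 < map_on_demand <= MAX_MAP_ON_DEMAND


print(fmr_enabled(0))    # False: 0 leaves FMR disabled
print(fmr_enabled(32))   # True
print(fmr_enabled(256))  # True: the upper bound is inclusive
print(fmr_enabled(257))  # False: outside the stated range
```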

      This is a problem for nodes on which OPA and MLX5 coexist. OPA performance is greatly improved by setting map-on-demand; however, if map-on-demand is set for MLX5, MLX5 will not work.

      We need to support a per-NI map-on-demand value, so that an OPA net can be configured with its optimal map-on-demand value while an MLX5 net can set map-on-demand to 0, disabling it.
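
      A per-NI setting could be modelled roughly as follows. This is a sketch under stated assumptions, not the actual patch: the NiConfig type, the validate helper, the net names o2ib0 and o2ib1, and the OPA value of 32 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NiConfig:
    """Hypothetical per-NI record; not an LNet structure."""
    net: str            # assumed net name, e.g. "o2ib0"
    hardware: str       # "OPA" or "MLX5" in this sketch
    map_on_demand: int  # 0 disables FMR, 1..256 enables it


def validate(ni: NiConfig) -> NiConfig:
    """Reject values outside 0..256 and any non-zero value on MLX5."""
    if not 0 <= ni.map_on_demand <= 256:
        raise ValueError(f"{ni.net}: map-on-demand must be in 0..256")
    if ni.hardware == "MLX5" and ni.map_on_demand != 0:
        raise ValueError(f"{ni.net}: MLX5 does not support FMR")
    return ni


# Each NI carries its own value instead of sharing one node-wide setting.
nis = [
    validate(NiConfig("o2ib0", "OPA", 32)),   # 32 is an illustrative value
    validate(NiConfig("o2ib1", "MLX5", 0)),   # 0 disables map-on-demand
]

for ni in nis:
    state = "enabled" if ni.map_on_demand > 0 else "disabled"
    print(f"{ni.net} ({ni.hardware}): map-on-demand {state}")
```

      With a single node-wide value, one of the two NIs in this example would necessarily be misconfigured, which is the case the per-NI setting is meant to avoid.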

      However, this raises another issue: map-on-demand values can then differ across fabrics, which LNet does not currently support. LU-3322 adds support for this scenario. The problem is relevant both for OPA support and for clusters in which MLX5 and MLX4 nodes are interconnected.

      The proposed solution consists of three patches:
      1. The LU-6850 patch, which removes support for PMR.
      2. The LU-3322 patch, which adds support for different map-on-demand and peertxcredits values across the network.
      3. The patch for this LU, which adds support for per-NI map-on-demand.

      A future patch will add support for setting map-on-demand dynamically, but that feature is not required to address the immediate need.

          Activity

            [LU-7101] Lnet: Support per NI map-on-demand
            mdiep Minh Diep made changes -
            Link New: This issue is related to JFC-20 [ JFC-20 ]
            mdiep Minh Diep made changes -
            Link New: This issue is related to DDN-254 [ DDN-254 ]
            pjones Peter Jones made changes -
            Link New: This issue is duplicated by NEC-31 [ NEC-31 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to LDEV-368 [ LDEV-368 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to LDEV-371 [ LDEV-371 ]
            pjones Peter Jones made changes -
            Link Original: This issue is related to LDEV-341 [ LDEV-341 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to LDEV-342 [ LDEV-342 ]
            ezell Matt Ezell made changes -
            Link New: This issue is related to LU-8022 [ LU-8022 ]
            jgmitter Joseph Gmitter (Inactive) made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            yujian Jian Yu made changes -
            Link New: This issue is related to LDEV-341 [ LDEV-341 ]

            People

              ashehata Amir Shehata (Inactive)
              ashehata Amir Shehata (Inactive)
              Votes:
              0
              Watchers:
              16
