[LU-7101] LNet: Support per NI map-on-demand Created: 04/Sep/15  Updated: 30/Jan/17  Resolved: 11/Apr/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: Amir Shehata (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Related
is related to LU-8022 LNet: BUG: unable to handle kernel NU... Resolved
is related to LU-3322 ko2iblnd support for different map_on... Resolved
Severity: 3

 Description   

MLX5 does not support FMR. With the removal of PMR (LU-6850) we need a way to support MLX5.

The proposed solution is as follows. FMR is enabled by setting map-on-demand to a value in the range 0 < value <= 256.

This is a problem for nodes where OPA and MLX5 coexist. OPA performance is greatly enhanced by setting map-on-demand; however, if map-on-demand is set for MLX5, MLX5 will not work.

We need to support a per-NI map-on-demand value, so that an OPA net can be configured with its optimal map-on-demand value while an MLX5 net can set map-on-demand to 0, disabling it.
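The value rule and the per-NI idea can be sketched as below. This is a minimal illustration in Python, not the ko2iblnd implementation: the names `fmr_enabled`, `MAX_MAP_ON_DEMAND`, and the NI labels are hypothetical, and 256 for the OPA net is only an example value. The one thing taken from the description is the range itself (0 disables map-on-demand, 1 through 256 enables FMR).

```python
MAX_MAP_ON_DEMAND = 256  # upper bound quoted in the description


def fmr_enabled(map_on_demand: int) -> bool:
    """Return True if this map-on-demand value turns FMR on.

    0 disables map-on-demand, 1..256 enables FMR, anything else is rejected.
    """
    if not 0 <= map_on_demand <= MAX_MAP_ON_DEMAND:
        raise ValueError(f"map_on_demand out of range: {map_on_demand}")
    return map_on_demand > 0


# Hypothetical per-NI settings for a node attached to both fabrics.
per_ni_map_on_demand = {
    "o2ib0 (OPA)": 256,  # example value; OPA benefits from map-on-demand
    "o2ib1 (MLX5)": 0,   # MLX5 has no FMR, so map-on-demand stays disabled
}

fmr_by_ni = {ni: fmr_enabled(v) for ni, v in per_ni_map_on_demand.items()}
```

With a single module-wide value, both entries in the table would be forced to the same setting, which is the coexistence problem described above.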

This raises another issue: different map-on-demand values across fabrics. LNet currently doesn't support this, but LU-3322 adds support for this scenario. The problem is relevant both for OPA support and for clusters in which MLX5 and MLX4 hardware are interconnected.

The proposed solution consists of three patches:
1. LU-6850 patch which removed support for PMR
2. LU-3322 support for different map-on-demand and peertxcredits values across the network.
3. The patch for this LU, which adds support for per NI map-on-demand.

A future patch will add support for setting map-on-demand dynamically, but that is not required to address the immediate need.



 Comments   
Comment by Gerrit Updater [ 11/Sep/15 ]

Amir Shehata (amir.shehata@intel.com) uploaded a new patch: http://review.whamcloud.com/16367
Subject: LU-7101 lnet: per NI map-on-demand value
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e77ed3e90197a749ba4d301ae5414731f861b5cd

Comment by James A Simmons [ 11/Sep/15 ]

I just tested this patch set and got it to work. For this set of tests I didn't use map_on_demand at all. For the mlx5 driver map_on_demand will not work. It always gives the following error no matter what value I set map_on_demand to:

LNetError: 8048:0:(o2iblnd.c:2242:kiblnd_net_init_pools()) Can't set fmr pool size (512) < ntx / 4(1280)
[ 3766.868871] LNetError: 8048:0:(o2iblnd.c:3110:kiblnd_startup()) Failed to initialize NI pools: -22

My config string is:

options ko2iblnd timeout=100 credits=2560 ntx=5120 peer_credits=63 concurrent_sends=63

What I did find to work with the mlx5 driver is:

options ko2iblnd timeout=100 credits=2560 ntx=5120 peer_credits=16 concurrent_sends=16

On the server side I'm still using the config string:

options ko2iblnd timeout=100 credits=2560 ntx=5120 peer_credits=63 concurrent_sends=63

I haven't tried map_on_demand on the server side yet. Any suggestions to bump up the peer_credits?
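To make the difference between the two ends explicit, here is a small sketch that parses the client and server option strings quoted in this comment and reports which tunables differ. The helper `parse_options` is hypothetical and assumes every value is an integer, which holds for the strings shown here but not for module options in general.

```python
def parse_options(line: str) -> dict:
    """Parse 'options <module> key=value ...' into {key: int_value}."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("not a modprobe options line")
    # tokens[1] is the module name; the rest are key=value pairs.
    return {k: int(v) for k, v in (t.split("=", 1) for t in tokens[2:])}


client = parse_options(
    "options ko2iblnd timeout=100 credits=2560 ntx=5120 "
    "peer_credits=16 concurrent_sends=16"
)
server = parse_options(
    "options ko2iblnd timeout=100 credits=2560 ntx=5120 "
    "peer_credits=63 concurrent_sends=63"
)

# Tunables set on both ends with different values: {name: (client, server)}
mismatch = {
    k: (client[k], server[k])
    for k in sorted(client.keys() & server.keys())
    if client[k] != server[k]
}
```

Only `peer_credits` and `concurrent_sends` end up in `mismatch`, which is the kind of cross-node difference that LU-3322 is meant to handle.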

Comment by Jeremy Filizetti [ 24/Sep/15 ]

In Lustre's current o2iblnd LND, map_on_demand != 0 is equivalent to enabling FMR. Since mlx5 doesn't support FMR, it will fail. The error you are seeing is because fmr_pool_size defaults to 512. Even if you change it to a larger value, though, it should then fail in kiblnd_create_fmr_pool on the call to ib_create_fmr_pool.
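The size check behind that log message can be sketched as follows. This is an illustration in Python rather than the kernel code: the condition is inferred from the wording of the error ("fmr pool size (512) < ntx / 4(1280)") and from the -22 return code, and the function name `check_fmr_pool_size` and the use of integer division are assumptions.

```python
import errno


def check_fmr_pool_size(fmr_pool_size: int, ntx: int) -> int:
    """Return 0 if the FMR pool is large enough, else -EINVAL."""
    minimum = ntx // 4  # assumed integer division, as in the log message
    if fmr_pool_size < minimum:
        return -errno.EINVAL
    return 0


rc_default = check_fmr_pool_size(512, 5120)   # default pool size with ntx=5120
rc_raised = check_fmr_pool_size(1280, 5120)   # pool size raised to ntx / 4
```

With ntx=5120 the minimum is 1280, so the default of 512 is rejected while 1280 passes. Passing this check only removes the first error; on mlx5 the later FMR pool creation is still expected to fail, as noted above.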

Comment by James A Simmons [ 06/Nov/15 ]

The latest patch update has a new look for lnetctl net show -v:

net:

  • net: lo
    nid: 0@lo
    status: up
    tunables:
    peer_timeout: 0
    peer_credits: 0
    peer_buffer_credits: 0
    credits: 0
    CPT: "[0,0]"
  • net: o2ib1
    nid: 10.37.248.19@o2ib1
    status: up
    interfaces:
    0: ib0
    tunables:
    peer_timeout: 180
    peer_buffer_credits: 0
    peer_credits: 63
    credits: 2560
    CPT: "[0,0]"
    LND tunables:
    use_privileged_port: 1
    require_privileged_port: 0
    dev_failover: 0
    fmr_cache: 1
    fmr_flush_trigger: 1024
    fmr_pool_size: 1280
    map_on_demand: 256
    concurrent_sends: 63
    ib_mtu: 0
    keepalive: 100
    rnr_retry_count: 6
    retry_count: 5
    ipif_name: ib0
    peer_credits_hiw: 31
    ntx: 5120
    nscheds: 0
    timeout: 100
    cksum: 0
    service: 987

Comment by James A Simmons [ 16/Dec/15 ]

Updated the patch to support setting FMR pool parameters as well. The patch is flexible enough to allow different settings on different IB ports on the same node. See the output of lnetctl net show -v below.
This is a big step forward in that we can now support different types of IB hardware in the same node and configure each one independently.

net:

  • net: lo
    nid: 0.0.0.0@lo
    status: up
    tunables:
    peer_timeout: 0
    peer_credits: 0
    peer_buffer_credits: 0
    credits: 0
  • net: o2ib1
    nid: pike11-ib0@o2ib1
    status: up
    interfaces:
    0: ib0
    tunables:
    peer_timeout: 100
    peer_credits: 16
    peer_buffer_credits: 0
    credits: 2560
    CPT: "[0,0]"
    LND tunables:
    peercredits_hiw: 8
    map_on_demand: 256
    concurrent_sends: 32
    fmr_pool_size: 1280
    fmr_flush_trigger: 1024
    fmr_cache: 1
  • net: o2ib2
    nid: pike12-ib1@o2ib2
    status: up
    interfaces:
    0: ib1
    tunables:
    peer_timeout: 180
    peer_credits: 16
    peer_buffer_credits: 0
    credits: 2560
    CPT: "[0,0,0,0]"
    LND tunables:
    peercredits_hiw: 8
    map_on_demand: 0
    concurrent_sends: 16
    fmr_pool_size: 512
    fmr_flush_trigger: 384
    fmr_cache: 1
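
To compare the per-NI values in a listing like the one above, a small sketch such as the following can group the `key: value` lines by net. It is not a YAML parser and does not call lnetctl: it only handles the flattened text as it appears in this ticket, the sample is an abbreviated copy of that listing, and the function name `tunables_by_net` is hypothetical.

```python
LISTING = """\
net:
  • net: o2ib1
    nid: pike11-ib0@o2ib1
    LND tunables:
    map_on_demand: 256
    fmr_pool_size: 1280
  • net: o2ib2
    nid: pike12-ib1@o2ib2
    LND tunables:
    map_on_demand: 0
    fmr_pool_size: 512
"""


def tunables_by_net(text: str) -> dict:
    """Group 'key: value' lines under the most recent 'net: <name>' entry."""
    nets = {}
    current = None
    for raw in text.splitlines():
        # Drop indentation and any leading list marker.
        line = raw.strip().lstrip("•-").strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "net" and value:
            current = nets.setdefault(value, {})
        elif current is not None and value:
            current[key] = value  # section headers have no value and are skipped
    return nets


nets = tunables_by_net(LISTING)
map_on_demand = {name: int(t["map_on_demand"]) for name, t in nets.items()}
```

For the sample, `map_on_demand` maps o2ib1 to 256 and o2ib2 to 0, showing two nets on the same node with independent settings.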

Comment by Gerrit Updater [ 07/Apr/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16367/
Subject: LU-7101 lnet: per NI map-on-demand value
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 94d1ce46562bd36c8959a9458cacabb7f6df681f

Comment by Joseph Gmitter (Inactive) [ 11/Apr/16 ]

Landed to master for 2.9.0
