[LU-9179] Upstream ko2iblnd has poor performance Created: 04/Mar/17  Updated: 03/Nov/18  Resolved: 03/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Doug Oucharek (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: lnet

Issue Links:
Related
is related to LU-9679 Prepare lustre for adoption into the ... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On an MLX QDR system, I get the following performance with current master on RHEL 7.3:

Read: 3.1 GB/s, Write: 3.15 GB/s

With the latest upstream build and LU-9026 fix, I am getting:

Read: 1.25 GB/s, Write: 1.13 GB/s

To see whether the problem is due to LU-9026, I went back to a kernel from before the RDMA API changes that broke ko2iblnd (4.8-rc2) and got:

Read: 0.63 GB/s, Write: 0.62 GB/s
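
For scale, the three measurements above can be expressed as a percentage of the master result. The following is only a minimal Python sketch: the throughput figures are copied from this report, while the dictionary labels and the helper name `percent_of_baseline` are hypothetical and not part of Lustre or lnet-selftest.

```python
# Throughput figures (GB/s) copied from the measurements in this report.
# The labels and the helper name are illustrative only, not Lustre identifiers.
RESULTS = {
    "master": {"read": 3.1, "write": 3.15},
    "upstream + LU-9026": {"read": 1.25, "write": 1.13},
    "upstream 4.8-rc2": {"read": 0.63, "write": 0.62},
}


def percent_of_baseline(results, baseline="master"):
    """Return each build's throughput as a percentage of the baseline build."""
    base = results[baseline]
    return {
        build: {op: round(100.0 * value / base[op], 1) for op, value in ops.items()}
        for build, ops in results.items()
    }


if __name__ == "__main__":
    for build, ops in percent_of_baseline(RESULTS).items():
        print(f"{build}: read {ops['read']}% / write {ops['write']}% of master")
```

With these figures, the upstream build with the LU-9026 fix comes out at roughly 40% (read) and 36% (write) of master, and the 4.8-rc2 build at roughly 20% for both.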

So, I believe we have a serious problem with upstream LNet IB performance. It is also possible that lnet-selftest itself is broken; for 4.8-rc2, that is certainly possible.

I'm still unable to validate LU-9026 on the upstream client. In theory, I get the same effect on master by setting map_on_demand to 256. When I do that, I see only about a 5% drop in performance. So, my suspicion is we have a problem with ko2i



 Comments   
Comment by James A Simmons [ 13/Mar/17 ]

The upstream LNet layer is pretty much in sync with master as it was just before multi-rail landed. The major difference is that Al Viro's biovec patches are missing from master.

Comment by James A Simmons [ 03/Nov/18 ]

This was due to leftovers from LU-7650, which was incorrect. All of that code has been removed upstream and replaced with what landed during the 2.11 development cycle.
