[LU-9500] MOFED 4/mlx5: Aligning non-aligned page addresses trigger dump_cqe Created: 13/May/17  Updated: 01/Sep/20  Resolved: 22/Jul/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.10.1, Lustre 2.11.0

Type: Bug Priority: Critical
Reporter: Doug Oucharek (Inactive) Assignee: Sonia Sharma (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File dump-LU-9500.log    
Issue Links:
Blocker
is blocked by LU-9565 lnet-selftest: Newly added "off" para... Open
Duplicate
is duplicated by LU-9932 LU-9026, LU-9500, and LU-9472 backports Resolved
Related
is related to LU-9472 FastReg (MLX5) support breaks when ma... Resolved
is related to LU-9983 LBUG llog_osd.c:327:llog_osd_declare_... Resolved
is related to LU-9461 lustre client mount fail after update... Resolved
is related to LU-9679 Prepare lustre for adoption into the ... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

In Lustre, we allow the first fragment of an IOV-based message to be non-page-aligned. When we set up the scatter/gather list, we correctly set the address and page_offset to reflect that alignment.

When we assign a remote address for RDMA purposes, the current code masks the address so that it is page aligned. When the page-aligned address does not match the address in the scatter/gather list, the mlx5 driver under MOFED 4 rejects the IB_RDMA_WRITE operation and logs a "dump_cqe" error message.
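The mismatch described above can be modeled in a few lines of C. This is only a sketch under stated assumptions: `struct frag`, `frag_dma_addr()`, `masked_remote_addr()`, and `remote_addr_matches()` are hypothetical names, not o2iblnd or mlx5 code, and a 4 KiB page size is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for illustration only. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for the first scatter/gather entry. */
struct frag {
	uint64_t page_base;   /* page-aligned DMA address of the page */
	uint32_t page_offset; /* offset of the data within that page */
};

/* Address that the scatter/gather entry actually describes. */
static uint64_t frag_dma_addr(const struct frag *f)
{
	return f->page_base + f->page_offset;
}

/* Remote address after rounding down to a page boundary. */
static uint64_t masked_remote_addr(const struct frag *f)
{
	return frag_dma_addr(f) & ~(SKETCH_PAGE_SIZE - 1);
}

/* True when the advertised remote address equals the entry's address. */
static bool remote_addr_matches(const struct frag *f, uint64_t remote_addr)
{
	return remote_addr == frag_dma_addr(f);
}
```

With page_base = 0x10000 and page_offset = 0x200, the entry describes 0x10200 while the masked remote address is 0x10000, so the two disagree. With a zero offset they agree, which would explain why fully aligned transfers never showed the problem.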

That is the main problem to be fixed. In addition, the code that does the masking for page alignment is itself wrong. Here is the line in the routine kiblnd_fmr_map_tx() that does the masking incorrectly:

rd->rd_frags[0].rf_addr &= ~hdev->ibh_page_mask;

The "~" should not be there. With it, we were setting rf_addr to the offset within the page rather than to the page-aligned address. When pages are aligned, rf_addr becomes zero, and that is the remote_addr value we send to the other node. The fact that this works without breaking anything suggests that the MOFED code does not use the remote_addr field of an IB_RDMA_WRITE work request.
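To make the effect of the "~" concrete, here is a small sketch. It assumes the mask has the form ~(page_size - 1), which is what the behavior described above implies, and a 4 KiB page. The names sketch_page_mask, offset_in_page(), and page_base() are illustrative and are not Lustre identifiers.

```c
#include <stdint.h>

/* Assumed page size for illustration only. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Assumed mask shape: all bits set except the in-page offset bits. */
static const uint64_t sketch_page_mask = ~(SKETCH_PAGE_SIZE - 1);

/* What "addr &= ~mask" leaves behind: only the offset within the page. */
static uint64_t offset_in_page(uint64_t addr)
{
	return addr & ~sketch_page_mask;
}

/* What "addr &= mask" leaves behind: the page-aligned base address. */
static uint64_t page_base(uint64_t addr)
{
	return addr & sketch_page_mask;
}
```

For an address of 0x12345678, offset_in_page() yields 0x678 and page_base() yields 0x12345000. For an already aligned address such as 0x12345000, offset_in_page() yields 0, matching the observation that rf_addr becomes zero when pages are aligned.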

In any case, we need to fix this in case some day some code does actually pay attention to this field.

The question to be answered here: should the remote address we generate be page aligned or not? When I stopped page aligning it, the dump_cqe error stopped and everything worked just fine.
 



 Comments   
Comment by Gerrit Updater [ 16/May/17 ]

Doug Oucharek (doug.s.oucharek@intel.com) uploaded a new patch: https://review.whamcloud.com/27149
Subject: LU-9500 lnd: Don't Page Align remote_addr with FastReg
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b0e0556df581300b976536c2c16351fe4ed492b7

Comment by Alexey Lyashkov [ 24/May/17 ]

Doug,

The patch looks fine to me, but it looks like we need the same change for the other memory registration modes.

I would also like to ask Jay to review the CLIO code to avoid using unaligned addresses.
Lustre locks are always page aligned, so there should be only one way to end up with an unaligned address: the direct IO code. That would avoid the problem with many fragments on routers discussed before, and the need for two SGEs per WR.

Comment by James A Simmons [ 05/Jun/17 ]

Hi Doug.

I tested on our RHEL7 system with the default OFED stack using the mlx4 driver, and the latest patch worked. I still need to test it on a few more configurations. I have:

1) SLES11 SP3 with OFED 311 stack using mlx4 hardware, maybe mlx5. Have to ask.

2) Power8 RHEL7.3 with MOFED 3.3 with mlx5 hardware

3) Power8 RHEL7.3 with MOFED 4.X with mlx5 hardware (needs to be set up)

I will let you know the results.

Comment by Gerrit Updater [ 22/Jul/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27149/
Subject: LU-9500 lnd: Don't Page Align remote_addr with FastReg
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6c6341804133ea0a4d4535c621f28f61fe6c29ab

Comment by Minh Diep [ 22/Jul/17 ]

landed in lustre 2.11.0

Comment by Gerrit Updater [ 26/Jul/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28237
Subject: LU-9500 lnd: Don't Page Align remote_addr with FastReg
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 25c54cbd1c4a2b02bab548b0feed96ad635af70f

Comment by Gerrit Updater [ 07/Aug/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28237/
Subject: LU-9500 lnd: Don't Page Align remote_addr with FastReg
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: f87c7c2cee6fc5a0864a757917a414dc605554b3

Comment by Doug Oucharek (Inactive) [ 17/Aug/17 ]

Has this been pushed upstream yet?

Comment by James A Simmons [ 17/Aug/17 ]

Not yet.

Generated at Sat Feb 10 02:26:44 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.