[LU-11105] Seeing "Using FastReg with no GAPS support" that can't be resolved Created: 27/Jun/18  Updated: 08/Aug/18  Resolved: 08/Aug/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0, Lustre 2.12.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: James A Simmons Assignee: Amir Shehata (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

RHEL7.5 with the default IB stack. Both clients and the server back end are running the default IB stack. All are running Lustre 2.11, but this affects 2.12 as well since LNet has not changed between versions.


Issue Links:
Duplicate
duplicates LU-11064 o2iblnd fast reg gaps case is determi... Resolved
Related
is related to LU-10394 IB_MR_TYPE_SG_GAPS mlx5 LNet performa... Resolved
Epic/Theme: lnet
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I found that I was unable to read the files that were 5K in size. This occurs on the f2-util01 host, which has a 2.11 client. The read spews errors and eventually times out. To test whether it was a 2.7 client <-> 2.11 client issue, I created another 5K text file from f2-util01 and found that I was unable to read that one as well. The issue was then identified, and it was pointed out that the logs were showing this:

LNetError: 106211:0:(o2iblnd_cb.c:571:kiblnd_fmr_map_tx()) Using FastReg with no GAPS support, but tx has gaps. Try setting use_fastreg_gaps to 1

LNetError: 106211:0:(o2iblnd_cb.c:571:kiblnd_fmr_map_tx()) Skipped 477 previous similar messages           

LNetError: 106211:0:(o2iblnd_cb.c:1884:kiblnd_recv()) Can't setup PUT sink for 10.10.33.32@o2ib2: -93      

LNetError: 106211:0:(o2iblnd_cb.c:1884:kiblnd_recv()) Skipped 477 previous similar messages
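For readers triaging similar logs, the following is a minimal sketch of pulling the source function, peer-facing message, and return code out of an LNetError line. The helper name `parse_lnet_error` and the small errno table are illustrative assumptions, not part of Lustre; the table uses the Linux numbering, under which 93 is EPROTONOSUPPORT ("Protocol not supported").

```python
import re

# Linux errno numbering (assumption: the log comes from a Linux kernel).
# Only the value seen in this ticket is listed; extend as needed.
LINUX_ERRNO_NAMES = {93: "EPROTONOSUPPORT"}

# Matches lines shaped like:
#   LNetError: <pid>:<n>:(<file>:<line>:<func>()) <message>
LOG_RE = re.compile(
    r"LNetError: (?P<pid>\d+):\d+:\("
    r"(?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)"
)


def parse_lnet_error(line):
    """Hypothetical helper: split one LNetError line into its fields.

    Returns None if the line does not look like an LNetError message.
    """
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["pid"] = int(rec["pid"])
    rec["line"] = int(rec["line"])
    # A trailing ": -NN" is treated as a negative errno return code.
    rc = re.search(r": -(\d+)$", rec["msg"])
    rec["errno"] = int(rc.group(1)) if rc else None
    rec["errno_name"] = LINUX_ERRNO_NAMES.get(rec["errno"], "unknown")
    return rec


if __name__ == "__main__":
    sample = ("LNetError: 106211:0:(o2iblnd_cb.c:1884:kiblnd_recv()) "
              "Can't setup PUT sink for 10.10.33.32@o2ib2: -93")
    print(parse_lnet_error(sample))
```

Lines without a trailing return code, such as the "Using FastReg with no GAPS support" message, are still parsed, with `errno` left as None.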

I then set that option in the ko2iblnd module parameters and brought everything back up but it was still encountering that issue. 

If, however, I dd either of the files to /dev/null first, I am then able to read them normally. This was the case with use_fastreg_gaps set to either 1 or 0.
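When a module option appears to have no effect, it can help to confirm that the loaded module actually picked it up. The sketch below reads the live value from sysfs. It assumes that ko2iblnd exposes use_fastreg_gaps as a readable parameter under /sys/module/ko2iblnd/parameters/ (the usual Linux layout for module parameters); the helper name `read_module_param` and the `sysfs_root` argument are illustrative only.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Hypothetical helper: return the live value of a kernel module
    parameter as a string, or None if it cannot be read (module not
    loaded, parameter not exported, or insufficient permissions)."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    value = read_module_param("ko2iblnd", "use_fastreg_gaps")
    if value is None:
        print("use_fastreg_gaps is not readable; is ko2iblnd loaded?")
    else:
        # Boolean module parameters are normally reported as Y or N.
        print("use_fastreg_gaps =", value)
```

If the reported value does not match what was configured, the module was most likely not reloaded after the option was changed.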



 Comments   
Comment by Amir Shehata (Inactive) [ 27/Jun/18 ]

Can you try the patch here:

https://jira.whamcloud.com/browse/LU-11064

 

Comment by James A Simmons [ 27/Jun/18 ]

Is this needed on the server side as well as the client side?

Comment by Amir Shehata (Inactive) [ 27/Jun/18 ]

It should be applied on all 2.11 nodes.

Comment by James A Simmons [ 27/Jun/18 ]

Yes, that patch appears to have resolved our problems.

Comment by Peter Jones [ 08/Aug/18 ]

Seems to have been a duplicate of LU-11064

Generated at Sat Feb 10 02:40:59 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.