[LU-9932] LU-9026, LU-9500, and LU-9472 backports Created: 30/Aug/17  Updated: 12/Oct/17  Resolved: 12/Oct/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Critical
Reporter: Giuseppe Di Natale (Inactive) Assignee: Sonia Sharma (Inactive)
Resolution: Done Votes: 0
Labels: llnl

Issue Links:
Duplicate
duplicates LU-9472 FastReg (MLX5) support breaks when ma... Resolved
duplicates LU-9500 MOFED 4/mlx5: Aligning non-aligned pa... Resolved
duplicates LU-9026 Adapt to the removal of ib_get_dma_mr() Resolved
Rank (Obsolete): 9223372036854775807

 Description   

We recently experienced the "LNetError: Async QP event type 3" issue described in LU-9461. A series of fixes from tickets LU-9026, LU-9500, and LU-9472 all landed in later versions of Lustre. Could these be backported to Lustre 2.5 and 2.8?



 Comments   
Comment by Giuseppe Di Natale (Inactive) [ 30/Aug/17 ]

Peter, is it possible to have these patches backported by early next week? This is holding up some RHEL 7.4 testing on our end.

Comment by Peter Jones [ 30/Aug/17 ]

Sonia

Could you please assist with this request?

Thanks

Peter

Comment by Ruth Klundt (Inactive) [ 31/Aug/17 ]

FYI, Sandia also needs these patches, since we run the LLNL software stack. Thanks!

Comment by Giuseppe Di Natale (Inactive) [ 31/Aug/17 ]

This is a response to Sonia's comment. For some reason I can't see it in JIRA, but I was emailed about it.

Thank you for porting the patches to the 2.8 branch!

Unfortunately, we will need them ported to the 2.5 branch as well. We are planning to run Lustre 2.5 on RHEL 7.4 and we need those fixes in place.

Comment by Olaf Faaland [ 01/Sep/17 ]

Sonia and Peter,

I see all three patches passed the automated test suite for the b2_8_fe branch.

No reviews yet. Anyone you can poke?

thanks

Comment by Peter Jones [ 01/Sep/17 ]

ofaaland it's on my list for today!

Comment by Giuseppe Di Natale (Inactive) [ 07/Sep/17 ]

Hi Sonia and Peter,

It appears there's a fourth o2ib-related patch that needs a couple of reviews (https://review.whamcloud.com/28842). Is that part of this ticket? Also, the other three patches appear good to go. Can they be merged into b2_8_fe?

Thanks!

Comment by Giuseppe Di Natale (Inactive) [ 11/Sep/17 ]

Peter,

Can we get the four patches merged today? We still need to test them on our end, and this is holding us up.

Comment by Olaf Faaland [ 06/Oct/17 ]

We are encountering a new issue that appears related to the RHEL 7.4 IB changes: LU-10089

Comment by Olaf Faaland [ 12/Oct/17 ]

Resolving since these backports are done. There were other issues related to MLX IB and the RHEL 7.4 kernel, but there is a separate ticket for them.
