[LU-9810] Mellanox OFED 4.1 support Created: 31/Jul/17  Updated: 02/Mar/19  Resolved: 04/May/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0
Fix Version/s: Lustre 2.11.0, Lustre 2.10.7

Type: Improvement Priority: Major
Reporter: Alexey Lyashkov Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-9990 MDS fails to mount due to (client.c:9... Resolved
is related to LU-9983 LBUG llog_osd.c:327:llog_osd_declare_... Resolved
Rank (Obsolete): 9223372036854775807

Comments
Comment by Gerrit Updater [ 31/Jul/17 ]

Alexey Lyashkov (alexey.lyashkov@seagate.com) uploaded a new patch: https://review.whamcloud.com/28277
Subject: LU-9810 lnet: fix build with M-OFED 4.1
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e4a38ee8ac104d1725b6a0b41af8e1b794833e44

Comment by Gerrit Updater [ 31/Jul/17 ]

Alexey Lyashkov (alexey.lyashkov@seagate.com) uploaded a new patch: https://review.whamcloud.com/28278
Subject: LU-9810 lnet: prefer Fast Reg
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5365d3dc05b1898878160d0d639bde912ed71e74

Comment by Gerrit Updater [ 31/Jul/17 ]

Alexey Lyashkov (alexey.lyashkov@seagate.com) uploaded a new patch: https://review.whamcloud.com/28279
Subject: LU-9810 lnet: use less CQ entries for each connection
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a2f4f7e33d224112b1edf5ae8b6f09f91ecd7396

Comment by Gerrit Updater [ 31/Jul/17 ]

Alexey Lyashkov (alexey.lyashkov@seagate.com) uploaded a new patch: https://review.whamcloud.com/28280
Subject: LU-9810 lnet: device MR attribute caching
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 25f020a3dfe0395bee25723012bf937ae7e4c412

Comment by Peter Jones [ 31/Jul/17 ]

Amir

Could you please review these proposed changes?

Thanks

Peter

Comment by Gerrit Updater [ 13/Sep/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28277/
Subject: LU-9810 lnet: fix build with M-OFED 4.1
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 344b6fd6934b30665e7ea172b5793c3f4f5adc57

Comment by Gerrit Updater [ 13/Sep/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28278/
Subject: LU-9810 lnet: prefer Fast Reg
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 8f0d0f052a516a5dd3e588ced6b49c840584855c

Comment by James A Simmons [ 14/Sep/17 ]

With the "prefer Fast Reg" patch I'm seeing the following error on older mlx4 FDR hardware with RHEL7.4.

10000000:01000000:5.0:1505423143.375531:0:6932:0:(mgc_request.c:1205:mgc_target_register()) register lustre-MDT0000
00000800:00000100:1.0:1505423143.375818:0:3469:0:(o2iblnd_cb.c:3464:kiblnd_complete()) FastReg failed: 6
00000800:00000100:1.0:1505423143.375821:0:3469:0:(o2iblnd_cb.c:3475:kiblnd_complete()) RDMA (tx: ffffc90006c03728) failed: 5
00000800:00000100:1.0:1505423143.375834:0:3469:0:(o2iblnd_cb.c:967:kiblnd_tx_complete()) Tx -> 10.37.248.196@o2ib1 cookie 0x1 sending 1 waiting 0: failed 5
00000800:00000100:1.0:1505423143.375838:0:3469:0:(o2iblnd_cb.c:1919:kiblnd_close_conn_locked()) Closing conn to 10.37.248.196@o2ib1: error -5(waiting)
00000100:00000400:5.0:1505423143.375874:0:6932:0:(client.c:2113:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1505423143/real 1505423143] req@ffff881011e5c300 x1578550577594400/t0(0) o253->MGC10.37.248.196@o2ib1@10.37.248.196@o2ib1:26/25 lens 4768/4768 e 0 to 1 dl 1505423150 ref 2 fl Rpc:eX/0/ffffffff rc 0/-1

Comment by James A Simmons [ 22/Sep/17 ]

> What HW do you use for testing? If it is MLX5, these patches do nothing for you. MLX5 uses a FastReg-only model, while MLX4 supports both FastReg and FMR.

The FDR card I have uses the mlx4 driver.
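
The distinction in the quoted question can be modeled roughly as follows. This is a minimal sketch, not the o2iblnd code: `Device`, `RegMode`, and `choose_reg_mode` are hypothetical names, and the capability flags are simply taken from the quote (MLX5 is FastReg only, MLX4 supports both FastReg and FMR).

```python
from dataclasses import dataclass
from enum import Enum


class RegMode(Enum):
    FASTREG = "FastReg"
    FMR = "FMR"


@dataclass(frozen=True)
class Device:
    """Hypothetical summary of an HCA's memory-registration capabilities."""
    name: str
    supports_fastreg: bool
    supports_fmr: bool


def choose_reg_mode(dev: Device, prefer_fastreg: bool = True) -> RegMode:
    """Return the first supported mode, trying the preferred one first."""
    order = [RegMode.FASTREG, RegMode.FMR]
    if not prefer_fastreg:
        order.reverse()
    supported = {
        RegMode.FASTREG: dev.supports_fastreg,
        RegMode.FMR: dev.supports_fmr,
    }
    for mode in order:
        if supported[mode]:
            return mode
    raise ValueError(f"{dev.name}: no supported registration mode")


# Capability flags as described in the quoted comment (assumed, not probed).
mlx5 = Device("mlx5", supports_fastreg=True, supports_fmr=False)
mlx4 = Device("mlx4", supports_fastreg=True, supports_fmr=True)
```

Under these assumptions, changing the preference only alters the result for a device that supports both modes, so the mlx4 card is the case where a "prefer Fast Reg" change is visible.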

Comment by Gerrit Updater [ 05/Oct/17 ]

Sorry, this was meant to be for LU-9990.

Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/29333
Subject: LU-9810 lnet: add backwards compatibility for YAML config
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e72cdc1373dfb930eccbc5d9afba215d8368b331

Comment by Gerrit Updater [ 22/Dec/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28279/
Subject: LU-9810 lnd: use less CQ entries for each connection
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 052f76bf708414b3a127aa9602b4a69415c1cb2f

Comment by Peter Jones [ 04/May/18 ]

I believe that no further work is outstanding on this ticket

Comment by Gerrit Updater [ 07/Jan/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33974
Subject: LU-9810 lnd: use less CQ entries for each connection
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 1045cb96f4e3c36e3003be9f8797a6b80e982bea

Comment by Gerrit Updater [ 19/Jan/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33974/
Subject: LU-9810 lnd: use less CQ entries for each connection
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 31e16f27ccb18d7c2eb5169f33b1ac55823cc90b

Comment by Gerrit Updater [ 26/Feb/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34322
Subject: LU-9810 lnet: fix build with M-OFED 4.1
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: fabd62d9773df6e56b5d0426cb46d1d7e3204d82

Comment by Gerrit Updater [ 02/Mar/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34322/
Subject: LU-9810 lnet: fix build with M-OFED 4.1
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: b4c93d99c633003d90f478d999805c76ccd744f1

Generated at Sat Feb 10 02:29:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.