[LU-10394] IB_MR_TYPE_SG_GAPS mlx5 LNet performance drop Created: 14/Dec/17 Updated: 07/Jan/19 Resolved: 09/Feb/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Ian Ziemba | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS 7.4 |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
mlx5 performance drops by more than 2 GB/s when using IB_MR_TYPE_SG_GAPS compared to IB_MR_TYPE_MEM_REG.
mlx5 with SG GAPS
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  17426.3333333333
Client Write RPC/s: 8713.77777777778
Client Read MiB/s:  8713.46111111111
Client Write MiB/s: 1.33
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  17408.3333333333
Client Write RPC/s: 8705.22222222222
Client Read MiB/s:  8704.06666666667
Client Write MiB/s: 1.33
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  17388.4444444444
Client Write RPC/s: 8697
Client Read MiB/s:  8695.54666666667
Client Write MiB/s: 1.32777777777778
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  17712.1111111111
Client Write RPC/s: 8856.55555555555
Client Read MiB/s:  1.35
Client Write MiB/s: 8855.53111111111
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  17705.7777777778
Client Write RPC/s: 8853.66666666667
Client Read MiB/s:  1.35
Client Write MiB/s: 8853.18555555556
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  17697.3333333333
Client Write RPC/s: 8854.44444444445
Client Read MiB/s:  1.34888888888889
Client Write MiB/s: 8850.95777777778

mlx5 without SG GAPS
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 32 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  22449.5555555556
Client Write RPC/s: 11227
Client Read MiB/s:  11224.5033333333
Client Write MiB/s: 1.71222222222222
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 64 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  22308.6666666667
Client Write RPC/s: 11154.3333333333
Client Read MiB/s:  11155.7288888889
Client Write MiB/s: 1.7
----------------------------------------------------------
Running test: lst add_test --batch rperf --concurrency 128 --distribute 1:1 --from clients --to servers brw read size=1M
Client Read RPC/s:  21549.1
Client Write RPC/s: 10737.4
Client Read MiB/s:  11135.278
Client Write MiB/s: 1.638
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 32 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  22178.3333333333
Client Write RPC/s: 11090.8888888889
Client Read MiB/s:  1.69
Client Write MiB/s: 11088.7822222222
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 64 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  22198.6666666667
Client Write RPC/s: 11099.8888888889
Client Read MiB/s:  1.69111111111111
Client Write MiB/s: 11100.1666666667
----------------------------------------------------------
Running test: lst add_test --batch wperf --concurrency 128 --distribute 1:1 --from clients --to servers brw write size=1M
Client Read RPC/s:  22162.6666666667
Client Write RPC/s: 11085.7777777778
Client Read MiB/s:  1.68777777777778
Client Write MiB/s: 11083.5477777778

o2iblnd parameters:
options ko2iblnd timeout=10
options ko2iblnd peer_timeout=0
options ko2iblnd keepalive=30
options ko2iblnd credits=2048
options ko2iblnd ntx=2048
options ko2iblnd peer_credits=16
options ko2iblnd concurrent_sends=16 |
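For context on the two MR types compared above: IB_MR_TYPE_MEM_REG can only map scatter-gather lists whose interior entries are full, page-aligned pages, while IB_MR_TYPE_SG_GAPS also accepts fragments with arbitrary offsets and gaps. Below is a minimal sketch of a fast-registration mapping path using the standard kernel verbs API (signatures per recent kernels); the pd, frag_sg, and nfrags parameters are illustrative placeholders, not the actual o2iblnd code.

#include <linux/err.h>
#include <linux/scatterlist.h>
#include <rdma/ib_verbs.h>

/* Allocate a fast-registration MR of the given type and map a
 * fragment list into it. With IB_MR_TYPE_MEM_REG the mapping stops
 * at the first discontiguous fragment (fewer than nfrags entries
 * are mapped); IB_MR_TYPE_SG_GAPS accepts such fragments, but at
 * the throughput cost measured in this ticket. */
static struct ib_mr *map_frags(struct ib_pd *pd, enum ib_mr_type type,
			       struct scatterlist *frag_sg, int nfrags)
{
	struct ib_mr *mr;
	int n;

	mr = ib_alloc_mr(pd, type, nfrags);
	if (IS_ERR(mr))
		return mr;

	/* Map the fragment list into the MR at page granularity. */
	n = ib_map_mr_sg(mr, frag_sg, nfrags, NULL, PAGE_SIZE);
	if (n != nfrags) {
		ib_dereg_mr(mr);
		return ERR_PTR(n < 0 ? n : -EINVAL);
	}
	return mr;
}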
| Comments |
| Comment by Peter Jones [ 19/Dec/17 ] |
|
Amir, could you please advise? Peter |
| Comment by Amir Shehata (Inactive) [ 04/Jan/18 ] |
|
Here is my proposed solution: introduce a tunable to enable gap support. It defaults to 0, which selects MEM_REG. If the tunable is set to 1 and the mlx card supports gaps, use gap support and log a warning that performance degradation is expected with this configuration. This is safe because it closes any scenario where we could introduce discontiguous fragments. The tunable can be turned on if, in the future, a layer uses LNet with discontiguous fragments. |
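A minimal sketch of the gating described above, assuming the tunable is exposed as a ko2iblnd module parameter (the later comments confirm the merged patch names it use_fastreg_gaps; the choose_mr_type helper, the capability check, and the warning text are illustrative, not the patch itself):

#include <linux/module.h>
#include <rdma/ib_verbs.h>

/* Default 0: always allocate IB_MR_TYPE_MEM_REG MRs. Set to 1 to
 * allow IB_MR_TYPE_SG_GAPS on HCAs that advertise the capability. */
static int use_fastreg_gaps;
module_param(use_fastreg_gaps, int, 0444);
MODULE_PARM_DESC(use_fastreg_gaps,
		 "Enable discontiguous-fragment support (expect a performance drop)");

static enum ib_mr_type choose_mr_type(struct ib_device *dev)
{
	/* IB_DEVICE_SG_GAPS_REG advertises SG_GAPS support. */
	if (use_fastreg_gaps &&
	    (dev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG)) {
		pr_warn("ko2iblnd: using IB_MR_TYPE_SG_GAPS; performance degradation is expected\n");
		return IB_MR_TYPE_SG_GAPS;
	}
	return IB_MR_TYPE_MEM_REG;
}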
| Comment by Gerrit Updater [ 05/Jan/18 ] |
|
Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/30749 |
| Comment by Ian Ziemba [ 11/Jan/18 ] |
|
Hi Amir, I tried running twice with this patch, and both times my system panicked with the following message.

[ 872.303780] LNetError: 2362:0:(o2iblnd_cb.c:991:kiblnd_check_sends_locked()) ASSERTION( conn->ibc_nsends_posted <= conn->ibc_queue_depth ) failed:
[ 872.304325] LNetError: 2362:0:(o2iblnd_cb.c:991:kiblnd_check_sends_locked()) LBUG
[ 872.304699] Pid: 2362, comm: kiblnd_sd_00_01
[ 872.304704] Call Trace:
[ 872.304805] [<ffffffffc0a1e7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[ 872.304901] [<ffffffffc0a1e83c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[ 872.304967] [<ffffffffc0ab352b>] kiblnd_check_sends_locked+0xd8b/0xd90 [ko2iblnd]
[ 872.305008] [<ffffffffc0524d53>] ? mlx5_ib_post_recv+0x1f3/0x240 [mlx5_ib]
[ 872.305074] [<ffffffffc0ab4e06>] kiblnd_post_rx+0x156/0x4e0 [ko2iblnd]
[ 872.305138] [<ffffffffc0ab536a>] kiblnd_recv+0x1da/0x7b0 [ko2iblnd]
[ 872.305266] [<ffffffffc08db1fc>] ? lnet_mt_match_md+0x8c/0x1b0 [lnet]
[ 872.305392] [<ffffffffc08e3573>] lnet_ni_recv+0xc3/0x320 [lnet]
[ 872.305523] [<ffffffffc08e3cc1>] lnet_recv_put+0x81/0xb0 [lnet]
[ 872.305641] [<ffffffffc08e5ee6>] lnet_parse_local+0x5a6/0xd40 [lnet]
[ 872.305762] [<ffffffffc08e6f4a>] lnet_parse+0x8ca/0xfc0 [lnet]
[ 872.305825] [<ffffffffc0ab3035>] ? kiblnd_check_sends_locked+0x895/0xd90 [ko2iblnd]
[ 872.305860] [<ffffffffc051c608>] ? mlx5_ib_poll_cq+0x418/0xf10 [mlx5_ib]
[ 872.305925] [<ffffffffc0ab5ce3>] kiblnd_handle_rx+0x213/0x6b0 [ko2iblnd]
[ 872.305990] [<ffffffffc0abc90f>] kiblnd_scheduler+0xf0f/0x1150 [ko2iblnd]
[ 872.306004] [<ffffffff810c93f5>] ? sched_clock_cpu+0x85/0xc0
[ 872.306016] [<ffffffff8102954d>] ? __switch_to+0xcd/0x500
[ 872.306031] [<ffffffff810c6440>] ? default_wake_function+0x0/0x20
[ 872.306096] [<ffffffffc0abba00>] ? kiblnd_scheduler+0x0/0x1150 [ko2iblnd]
[ 872.306109] [<ffffffff810b252f>] kthread+0xcf/0xe0
[ 872.306122] [<ffffffff810b2460>] ? kthread+0x0/0xe0
[ 872.306135] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[ 872.306147] [<ffffffff810b2460>] ? kthread+0x0/0xe0

I think this panic is unrelated to your patch. Do you want me to open another ticket? Thanks |
| Comment by Amir Shehata (Inactive) [ 11/Jan/18 ] |
|
This has already been fixed. Please check out |
| Comment by Ian Ziemba [ 11/Jan/18 ] |
|
Thanks. The use_fastreg_gaps parameter works as expected. When set to 0, 1M performance is 11 GB/s. When set to 1, 1M performance is 8.9 GB/s. |
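For reference, the tunable can be set alongside the other o2iblnd parameters listed in the description, e.g. in a modprobe configuration file:

options ko2iblnd use_fastreg_gaps=1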
| Comment by Amir Shehata (Inactive) [ 11/Jan/18 ] |
|
OK, thanks. Good to know that SG_GAPS drops performance. I'll make sure to record that on the LNet Wiki. |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30749/ |
| Comment by Peter Jones [ 09/Feb/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 12/Sep/18 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33149 |