Details
-
Technical task
-
Resolution: Fixed
-
Minor
-
Lustre 2.14.0
-
None
-
One server (1 x IB-EDR) and one client (2 x IB-HDR100), with Multi-Rail (MR) enabled
-
Description
If a server has more than one CPT, peer connections should be distributed across the CPTs for load balancing.
The CPT is chosen by a hash function over the peer NID, but in some cases the hash returns the same value for different peers, and both peers end up on the same CPT.
This causes a serious performance problem because each CPT owns a fixed set of CPU cores: if both peers are handled by a single CPT on the server, half of the CPUs are always busy while the other half sit idle.
Here is an example.
server# cat /sys/kernel/debug/lnet/cpu_partition_table
0 : 0 1 2 3 4 5 6 7 8 9
1 : 10 11 12 13 14 15 16 17 18 19

server# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib10
      local NI(s):
        - nid: 10.0.11.224@o2ib10
          status: up
          interfaces:
              0: ib0

client # cat /sys/kernel/debug/lnet/cpu_partition_table
0 : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 : 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
2 : 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
3 : 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
4 : 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
5 : 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
6 : 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
7 : 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127

client # lnetctl net show -v
    - net type: o2ib10
      local NI(s):
        - nid: 10.0.11.81@o2ib10
          status: up
          interfaces:
              0: ib0
          - snip -
          lnd tunables:
          dev cpt: 0
          tcp bonding: 0
          CPT: "[0,1,2,3]"
        - nid: 10.4.11.71@o2ib10
          status: up
          interfaces:
              0: ib4
          - snip -
          lnd tunables:
          dev cpt: 4
          tcp bonding: 0
          CPT: "[4,5,6,7]"
On the client:
  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
20263 root  20   0     0    0    0 R  98.3  0.0  0:29.85 kiblnd_sd_06_01
20264 root  20   0     0    0    0 R  98.3  0.0  0:29.85 kiblnd_sd_06_02
20265 root  20   0     0    0    0 R  98.3  0.0  0:29.85 kiblnd_sd_06_03
20262 root  20   0     0    0    0 R  98.0  0.0  0:29.84 kiblnd_sd_06_00
20247 root  20   0     0    0    0 R  89.1  0.0  1:19.11 kiblnd_sd_02_01
20248 root  20   0     0    0    0 R  88.7  0.0  1:19.20 kiblnd_sd_02_02
20249 root  20   0     0    0    0 R  88.7  0.0  1:19.15 kiblnd_sd_02_03
20246 root  20   0     0    0    0 R  87.7  0.0  1:19.24 kiblnd_sd_02_00
Two CPTs are busy because of the two interfaces.
On the server:
  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
27651 root  20   0     0    0    0 R  86.0  0.0  2:22.27 kiblnd_sd_00_00
27652 root  20   0     0    0    0 R  86.0  0.0  2:22.30 kiblnd_sd_00_01
27653 root  20   0     0    0    0 R  86.0  0.0  2:22.27 kiblnd_sd_00_02
27654 root  20   0     0    0    0 R  85.4  0.0  2:22.28 kiblnd_sd_00_03
Only one CPT is busy even though two peers are connected to the server.
Amir added a debug patch and confirmed that both peers went to the first CPT.
00000800:00000200:18.0:1591055201.186835:0:20660:0:(o2iblnd.c:795:kiblnd_create_conn()) peer_ni = 10.0.11.81@o2ib10, ni = 10.0.11.224@o2ib10, cpt = 0
00000800:00000200:18.0:1591055201.189343:0:20660:0:(o2iblnd.c:795:kiblnd_create_conn()) peer_ni = 10.4.11.81@o2ib10, ni = 10.0.11.224@o2ib10, cpt = 0
The problem is that the hash function returns the same value even though the client IP addresses differ, as shown below, so both peers end up on the same CPT when the server has only a single interface.
1407418001001297  nid1 of client, 64-bit representation
1407418001263431  nid2 of client, 64-bit representation
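For reference, the two 64-bit values can be reproduced from the client NIDs shown in the lnetctl output. The following is a minimal sketch, not LNet code: it assumes the usual LNet NID layout (LND type in the top 16 bits, network number in the next 16 bits, IPv4 address in the low 32 bits) and an LND type number of 5 for o2ib. The helper name nid_to_u64 is hypothetical.

```python
import ipaddress

# Assumption: o2ib is LND type 5 in LNet's NID encoding.
O2IBLND = 5


def nid_to_u64(nid: str) -> int:
    """Pack an 'a.b.c.d@o2ibN' NID string into a 64-bit integer.

    Assumed layout: (lnd_type << 48) | (net_number << 32) | ipv4_address.
    """
    addr, net = nid.split("@")
    if not net.startswith("o2ib"):
        raise ValueError("this sketch only handles o2ib networks")
    net_num = int(net[len("o2ib"):] or 0)   # "o2ib" alone means network 0
    net_id = (O2IBLND << 16) | net_num      # 32-bit network identifier
    return (net_id << 32) | int(ipaddress.IPv4Address(addr))


nid1 = nid_to_u64("10.0.11.81@o2ib10")
nid2 = nid_to_u64("10.4.11.71@o2ib10")

print(nid1)               # 64-bit value for the first client interface
print(nid2)               # 64-bit value for the second client interface
print(hex(nid1 ^ nid2))   # bits in which the two NIDs differ
```

The XOR on the last line shows that the two NIDs differ only in a few low-order address bits; the network part of the key is identical for both peers.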
Issue Links
- is related to LU-14676 Better hash distribution to different CPTs when LNET router is exist (Resolved)