[LU-10245] lnetctl --cpt does not associate cpt properly Created: 15/Nov/17 Updated: 01/Dec/17 Resolved: 01/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Elena Gryaznova | Assignee: | WC Triage |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Description |
[root@fre805 tests]# cat /sys/devices/system/cpu/online
0-1
[root@fre805 tests]# lnetctl net show --verbose
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 0
              peer_credits: 0
              peer_buffer_credits: 0
              credits: 0
          lnd tunables:
          tcp bonding: 0
          dev cpt: 0
          CPT: "[0]"
    - net type: tcp
      local NI(s):
        - nid: 192.168.108.5@tcp
          status: up
          interfaces:
              0: eth0
          statistics:
              send_count: 276723
              recv_count: 316655
              drop_count: 0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
        - nid: 192.168.118.5@tcp
          status: up
          interfaces:
              0: eth1
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
[root@fre805 tests]#
[root@fre805 tests]# lnetctl net add --net tcp --if eth2 --cpt [0, 1]
[root@fre805 tests]# lnetctl net show --verbose
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 0
              peer_credits: 0
              peer_buffer_credits: 0
              credits: 0
          lnd tunables:
          tcp bonding: 0
          dev cpt: 0
          CPT: "[0]"
    - net type: tcp
      local NI(s):
        - nid: 192.168.108.5@tcp
          status: up
          interfaces:
              0: eth0
          statistics:
              send_count: 276843
              recv_count: 316775
              drop_count: 0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
        - nid: 192.168.118.5@tcp
          status: up
          interfaces:
              0: eth1
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
        - nid: 192.168.128.5@tcp
          status: up
          interfaces:
              0: eth2
          statistics:
              send_count: 0
              recv_count: 0
              drop_count: 0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
[root@fre805 tests]#
The new NID 192.168.128.5@tcp is not associated with CPT 1: it still reports CPT: "[0]" even though --cpt [0, 1] was requested. |
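The check described above (comparing the CPT list reported for each NID against what was requested) can be automated. The following is a minimal sketch, assuming the `lnetctl net show --verbose` output is already available as text and that each `- nid:` line is eventually followed by a `CPT: "[...]"` line, as in the listing in this ticket. The function name `nid_to_cpts` is hypothetical and not part of any Lustre tool.

```python
import re
from typing import Dict, List


def nid_to_cpts(show_output: str) -> Dict[str, List[int]]:
    """Map each NID to the CPT list printed on the CPT line that follows it."""
    result: Dict[str, List[int]] = {}
    current_nid = None
    for raw in show_output.splitlines():
        line = raw.strip()  # indentation is ignored, so flattened pastes work too
        m = re.match(r'-\s*nid:\s*(\S+)', line)
        if m:
            current_nid = m.group(1)
            continue
        # Matches 'CPT: "[0]"' but not 'dev cpt: -1' (which starts with 'dev').
        m = re.match(r'CPT:\s*"\[([^\]]*)\]"', line)
        if m and current_nid is not None:
            result[current_nid] = [int(x) for x in m.group(1).split(",") if x.strip()]
            current_nid = None
    return result


# Excerpt copied from the second listing in this ticket.
sample = """\
- nid: 192.168.118.5@tcp
dev cpt: -1
CPT: "[0]"
- nid: 192.168.128.5@tcp
dev cpt: -1
CPT: "[0]"
"""

mapping = nid_to_cpts(sample)
requested = [0, 1]
not_bound = [c for c in requested if c not in mapping["192.168.128.5@tcp"]]
print(mapping)
print("requested CPTs not bound:", not_bound)
```

The sketch uses line-based matching rather than a YAML parser so that it has no third-party dependency and still works on output whose indentation was lost in copying.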
| Comments |
| Comment by Amir Shehata (Inactive) [ 15/Nov/17 ] |
|
Take a look at your libcfs CPU partitions. Having two CPUs doesn't necessarily mean that you'll end up with two CPTs. Both CPUs can be part of the same NUMA node, and by default the libcfs CPU partition pattern is set to "N" unless you explicitly set it to something else. "N" means use the NUMA architecture, i.e. create one CPT per NUMA node that has at least one CPU attached to it. From the output you shared above, I think that's the issue: when you configure a network without explicitly specifying a cpt option, it gets attached to CPT 0, which just means that you only have one CPT in your system. To get your test to work, you might need to add the following line to your lustre.conf: options libcfs cpu_pattern="0[0], 1[1]"
That'll create two CPTs: CPT 0 will have CPU 0 and CPT 1 will have CPU 1. The lnetctl syntax you used should then work. |
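To illustrate the reasoning above, here is a minimal sketch of how a pattern of the form `cpt[cpus]` defines a CPT table, and why asking for a CPT that the table does not contain cannot succeed. It is a simplified illustration and not the libcfs parser. It accepts only whitespace-separated entries with comma lists and simple ranges, and it does not handle the "N" (NUMA) form. The names `parse_cpu_pattern` and `missing_cpts` are hypothetical.

```python
import re
from typing import Dict, List

# One entry: a CPT id followed by a bracketed CPU list, e.g. "0[0-1,4]".
_ENTRY = re.compile(r"(\d+)\[([0-9,\-]+)\]")


def parse_cpu_pattern(pattern: str) -> Dict[int, List[int]]:
    """Parse whitespace-separated 'cpt[cpus]' entries into {cpt: [cpus]}."""
    table: Dict[int, List[int]] = {}
    for token in pattern.split():
        m = _ENTRY.fullmatch(token)
        if m is None:
            raise ValueError(f"unrecognised entry: {token!r}")
        cpus: List[int] = []
        for part in m.group(2).split(","):
            lo, _, hi = part.partition("-")  # "3" -> ("3", "", ""), "0-3" -> ("0", "-", "3")
            cpus.extend(range(int(lo), int(hi or lo) + 1))
        table[int(m.group(1))] = cpus
    return table


def missing_cpts(requested: List[int], table: Dict[int, List[int]]) -> List[int]:
    """Return the requested CPT ids that are not defined in the table."""
    return [c for c in requested if c not in table]


# Two CPTs, one CPU each.
two_cpts = parse_cpu_pattern("0[0] 1[1]")
print(two_cpts, missing_cpts([0, 1], two_cpts))

# A single CPT holding both CPUs, which is the situation this ticket turned out to be in.
one_cpt = {0: [0, 1]}
print(one_cpt, missing_cpts([0, 1], one_cpt))
```

With a single CPT, a request for CPTs 0 and 1 can only ever be satisfied for CPT 0, which matches the `CPT: "[0]"` shown for the new NID in the description.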
| Comment by Elena Gryaznova [ 01/Dec/17 ] |
|
Amir, the line is options libcfs cpu_pattern="0[0] 1[1]" (no comma between the partitions).
|
| Comment by Elena Gryaznova [ 01/Dec/17 ] |
|
Ticket can be closed. |
| Comment by Peter Jones [ 01/Dec/17 ] |
|
Thanks, Elena. |