[LU-4454] "Lustre: can't support CPU hotplug well now" Created: 08/Jan/14  Updated: 14/Feb/14  Resolved: 21/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.4.2
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Bug Priority: Major
Reporter: Christopher Morrone Assignee: Liang Zhen (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 12210

 Description   

We have some Lustre clients where hyperthreading is enabled and disabled, possibly on a per-job basis. The admins are seeing streams of scary messages on the console from Lustre:

2013-12-02 09:58:29 LNet: 5546:0:(linux-cpu.c:1035:cfs_cpu_notify()) Lustre: can't support CPU hotplug well now, performance and stability could be impacted[CPU 40 notify: 3]
2013-12-02 09:58:29 LNet: 5546:0:(linux-cpu.c:1035:cfs_cpu_notify()) Skipped 30 previous similar messages
2013-12-02 09:58:29 Booting Node 0 Processor 40 APIC 0x1
2013-12-02 09:58:30 microcode: CPU40 sig=0x206f2, pf=0x4, revision=0x37
2013-12-02 09:58:30 platform microcode: firmware: requesting intel-ucode/06-2f-02
2013-12-02 09:58:30 Booting Node 0 Processor 41 APIC 0x3

The above message is not acceptable. Please fix.

Further, when I went to look into how this CPU partition code works, I wound up mighty confused. For instance, on a node with 4 sockets and 10 cores per socket, I see this:

/proc/sys/lnet$ cat cpu_partition_table
0       : 0 1 2 3 4
1       : 5 6 7 8 9
2       : 10 11 12 13 14
3       : 15 16 17 18 19
4       : 20 21 22 23 24
5       : 25 26 27 28 29
6       : 30 31 32 33 34
7       : 35 36 37 38 39

Why are there two partitions per socket? Is this by design, or a bug?

What is going to happen when hyperthreading is enabled, and there are 80 "cpus" suddenly available?



 Comments   
Comment by Peter Jones [ 08/Jan/14 ]

Liang

Could you please advise?

Thanks

Peter

Comment by Liang Zhen (Inactive) [ 08/Jan/14 ]

Hi Chris, it is by design to have multiple partitions per socket. Two separate partitions (and thread pools) per socket should give better performance when there are many cores (HTs) on each socket, and the partition layout can also be set or changed by configuration; see the sketch below.
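
As a minimal sketch of that kind of configuration, assuming the libcfs module parameters cpu_npartitions and cpu_pattern (the values below are purely illustrative for the 4-socket, 40-core node above, and the exact pattern syntax may vary between releases):

  # /etc/modprobe.d/lustre.conf -- illustrative values only
  options libcfs cpu_npartitions=4
  # or spell out the CPU-to-partition mapping explicitly, e.g.:
  # options libcfs cpu_pattern="0[0-9] 1[10-19] 2[20-29] 3[30-39]"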

The CPU partition code does not handle hot-removal of CPUs well. For example, if all CPUs (cores) in a particular CPU partition go offline, the threads bound to that partition simply lose their affinity; there is nothing we can do about that so far. Hot-adding a new CPU is fine, but the newly added CPU will never be used by CPU-affinity threads.

Enabling/disabling HT should be fine, because the HTs of the same core are placed in the same CPU partition:

  • If HT is currently disabled and then gets enabled, Lustre threads with CPU affinity will simply never run on the newly appeared "CPUs"; that is all.
  • If HT is currently enabled and then gets disabled, Lustre threads with CPU affinity will still run on the same cores.

I can work out a patch that only prints a warning when we lose a physical core (i.e. all HTs of a core are gone).
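
For illustration only (this is not the actual patch; see the Gerrit link in the next comment), a minimal sketch of such a check using the pre-4.10 kernel hotplug notifier API, with hypothetical names (demo_cpu_notify, demo_cpu_nb):

  #include <linux/cpu.h>
  #include <linux/cpumask.h>
  #include <linux/kernel.h>
  #include <linux/notifier.h>
  #include <linux/topology.h>

  /* Warn only when the last online hyperthread of a physical core is about
   * to go offline; stay quiet while any sibling of that core remains online. */
  static int demo_cpu_notify(struct notifier_block *nb,
                             unsigned long action, void *hcpu)
  {
          unsigned int cpu = (unsigned long)hcpu;
          unsigned int sibling;

          switch (action & ~CPU_TASKS_FROZEN) {
          case CPU_DOWN_PREPARE:
                  /* The CPU is still online here; check its HT siblings. */
                  for_each_cpu(sibling, topology_thread_cpumask(cpu)) {
                          if (sibling != cpu && cpu_online(sibling))
                                  return NOTIFY_OK; /* core keeps a live thread */
                  }
                  printk(KERN_WARNING "CPU %u: its physical core is going "
                         "offline; threads bound to that CPU partition will "
                         "lose affinity\n", cpu);
                  break;
          default:
                  /* CPU_ONLINE etc.: new CPUs are ignored by already-bound
                   * service threads, so nothing needs to be reported. */
                  break;
          }
          return NOTIFY_OK;
  }

  static struct notifier_block demo_cpu_nb = {
          .notifier_call = demo_cpu_notify,
  };

  /* Registered from module init, e.g. register_hotcpu_notifier(&demo_cpu_nb); */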

Comment by Liang Zhen (Inactive) [ 08/Jan/14 ]

patch is here: http://review.whamcloud.com/#/c/8770/

Comment by Jodi Levi (Inactive) [ 16/Jan/14 ]

What else needs to be completed in this ticket and what is the priority of that work (if any)?

Comment by Peter Jones [ 21/Jan/14 ]

As per LLNL this ticket can be resolved.
