Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.14.0
- None
- 3
- 9223372036854775807
Description
The logs occasionally show that CPU cores are being deactivated on the system:
[ 200.214057] LNet: 5394:0:(libcfs_cpu.c:1133:cfs_cpu_dead()) Lustre: can't support CPU plug-out well now, performance and stability could be impacted [CPU 32]
[ 200.231635] LNet: 5394:0:(libcfs_cpu.c:1133:cfs_cpu_dead()) Lustre: can't support CPU plug-out well now, performance and stability could be impacted [CPU 33]
[ 200.249606] LNet: 5394:0:(libcfs_cpu.c:1133:cfs_cpu_dead()) Lustre: can't support CPU plug-out well now, performance and stability could be impacted [CPU 34]
I suspect this is mainly a client issue, but it could eventually be hit on servers in a cloud environment as well.
It would be good to handle this situation better than just printing an error message. In particular, stop ptlrpcd threads running on those cores if the entire CPT is removed, so they don't continue to burn cycles. Also, recompute the CPT count.
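To make the request concrete, here is a rough userspace sketch of the behaviour described above: when a CPU goes offline it is cleared from its partition, and if that leaves the partition empty, the partition's threads are flagged to stop and the active CPT count is recomputed. This is only a model under stated assumptions. The names `cpt_model`, `cpt_table_model`, `cpt_table_init` and `cpt_cpu_dead` are hypothetical and are not the libcfs structures or functions, CPUs are limited to 64 so a single bitmask suffices, and the "stop threads" step is just a flag rather than anything that touches ptlrpcd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPTS 8

/* Hypothetical model of one CPU partition; not the libcfs structure. */
struct cpt_model {
	uint64_t cpu_mask;     /* bit n set => CPU n is online in this partition */
	bool     threads_run;  /* should this partition's worker threads keep running? */
};

/* Hypothetical model of the partition table. */
struct cpt_table_model {
	struct cpt_model cpts[MAX_CPTS];
	int ncpts;        /* partitions configured at setup */
	int active_cpts;  /* partitions that still have at least one online CPU */
};

/* Count partitions that still have at least one online CPU. */
int cpt_count_active(const struct cpt_table_model *t)
{
	int n = 0;

	for (int i = 0; i < t->ncpts; i++)
		if (t->cpts[i].cpu_mask != 0)
			n++;
	return n;
}

/* Give each partition cpus_per_cpt consecutive CPUs, all online. */
void cpt_table_init(struct cpt_table_model *t, int ncpts, int cpus_per_cpt)
{
	assert(ncpts > 0 && ncpts <= MAX_CPTS);
	assert(cpus_per_cpt > 0 && ncpts * cpus_per_cpt <= 64);

	uint64_t base = (cpus_per_cpt == 64) ? ~UINT64_C(0)
					     : (UINT64_C(1) << cpus_per_cpt) - 1;

	t->ncpts = ncpts;
	for (int i = 0; i < ncpts; i++) {
		t->cpts[i].cpu_mask = base << (i * cpus_per_cpt);
		t->cpts[i].threads_run = true;
	}
	t->active_cpts = cpt_count_active(t);
}

/*
 * Handle a CPU going offline. Returns the index of the partition that
 * became empty as a result, or -1 if no partition was emptied.
 */
int cpt_cpu_dead(struct cpt_table_model *t, int cpu)
{
	assert(cpu >= 0 && cpu < 64);

	uint64_t bit = UINT64_C(1) << cpu;

	for (int i = 0; i < t->ncpts; i++) {
		struct cpt_model *c = &t->cpts[i];

		if (!(c->cpu_mask & bit))
			continue;

		c->cpu_mask &= ~bit;
		if (c->cpu_mask != 0)
			return -1;  /* partition still has online CPUs */

		/* Whole partition is gone: stop its threads, recount. */
		c->threads_run = false;
		t->active_cpts = cpt_count_active(t);
		return i;
	}
	return -1;  /* CPU was not online in any partition */
}
```

With two partitions of two CPUs each, taking CPU 2 offline leaves partition 1 running; taking CPU 3 offline as well empties it, so its threads are flagged to stop and the active count drops from 2 to 1. A real fix would also need to handle CPUs coming back online, which this sketch leaves out.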