Lustre / LU-4454

"Lustre: can't support CPU hotplug well now"

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.5.0, Lustre 2.4.2
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 12210

    Description

      We have some Lustre clients where hyperthreading is enabled and disabled, possibly on a per-job basis. The admins are seeing streams of scary messages from Lustre on the console:

      2013-12-02 09:58:29 LNet: 5546:0:(linux-cpu.c:1035:cfs_cpu_notify()) Lustre: can't support CPU hotplug well now, performance and stability could be impacted[CPU 40 notify: 3]
      2013-12-02 09:58:29 LNet: 5546:0:(linux-cpu.c:1035:cfs_cpu_notify()) Skipped 30 previous similar messages
      2013-12-02 09:58:29 Booting Node 0 Processor 40 APIC 0x1
      2013-12-02 09:58:30 microcode: CPU40 sig=0x206f2, pf=0x4, revision=0x37
      2013-12-02 09:58:30 platform microcode: firmware: requesting intel-ucode/06-2f-02
      2013-12-02 09:58:30 Booting Node 0 Processor 41 APIC 0x3
      

      The above message is not acceptable. Please fix.

      Further, when I went to look into how this CPU partition code works, I wound up mighty confused. For instance, on a node with 4 sockets and 10 cores per socket, I see this:

      /proc/sys/lnet$ cat cpu_partition_table
      0       : 0 1 2 3 4
      1       : 5 6 7 8 9
      2       : 10 11 12 13 14
      3       : 15 16 17 18 19
      4       : 20 21 22 23 24
      5       : 25 26 27 28 29
      6       : 30 31 32 33 34
      7       : 35 36 37 38 39
      

      Why are there two partitions per socket? Is this by design, or a bug?

      What is going to happen when hyperthreading is enabled, and there are 80 "cpus" suddenly available?
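To make the "two partitions per socket" observation concrete, here is a minimal sketch that parses the table shown above and groups partitions by socket. It assumes CPU ids are numbered consecutively within each socket (ids 0-9 on socket 0, 10-19 on socket 1, and so on), which this report does not confirm. The helper names `parse_cpt_table` and `partitions_per_socket` are hypothetical and not part of Lustre.

```python
from collections import defaultdict

# Output of `cat /proc/sys/lnet/cpu_partition_table`, copied from this report.
TABLE = """\
0       : 0 1 2 3 4
1       : 5 6 7 8 9
2       : 10 11 12 13 14
3       : 15 16 17 18 19
4       : 20 21 22 23 24
5       : 25 26 27 28 29
6       : 30 31 32 33 34
7       : 35 36 37 38 39
"""

# Assumption: 10 cores per socket, numbered consecutively within a socket.
CORES_PER_SOCKET = 10


def parse_cpt_table(text):
    """Map partition id -> list of CPU ids, one "id : cpus" entry per line."""
    table = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or unrelated lines
        part, cpus = line.split(":", 1)
        table[int(part)] = [int(cpu) for cpu in cpus.split()]
    return table


def partitions_per_socket(table, cores_per_socket):
    """Map socket id -> sorted partition ids that own a CPU on that socket."""
    sockets = defaultdict(set)
    for part, cpus in table.items():
        for cpu in cpus:
            sockets[cpu // cores_per_socket].add(part)
    return {sock: sorted(parts) for sock, parts in sorted(sockets.items())}


table = parse_cpt_table(TABLE)
layout = partitions_per_socket(table, CORES_PER_SOCKET)
for sock, parts in layout.items():
    print(f"socket {sock}: partitions {parts}")
```

Under that numbering assumption, every socket maps to exactly two partitions of five CPUs each. The same script could be rerun on the table captured after hyperthreading is enabled, with `CORES_PER_SOCKET` adjusted to match however the extra logical CPUs are numbered on the node.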

          People

            Assignee: liang Liang Zhen (Inactive)
            Reporter: morrone Christopher Morrone (Inactive)
            Votes: 0
            Watchers: 7
