Lustre / LU-2799

ldlm_cbd: This service may have more threads (192) than the given soft limit (128)

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.4.0
    • Lustre 2.4.0
    • 3
    • 6776

    Description

      I see the following message on the console of one of our Sequoia login nodes:

      2013-02-12 15:26:17 Lustre: ldlm_cbd: This service may have more threads (192) than the given soft limit (128)
      

      Is this troubling? What's the net effect of this?

        Activity

          utopiabound Nathaniel Clark added a comment:

          Patch has landed.

          utopiabound Nathaniel Clark added a comment:

          Patch to make the message a CDEBUG:
          http://review.whamcloud.com/5447

          adilger Andreas Dilger added a comment:

          I'm more inclined to just quiet the message entirely, i.e. CDEBUG(), since there isn't anything the sysadmin can or should do about it.

          I guess at some point we need to look at whether there should be one set of threads running on each of the cores, or if one set of threads per socket is enough?

          utopiabound Nathaniel Clark added a comment:

          Andreas, Liang,
          Should the message be changed to a CWARN instead? If an upper limit (even a soft one) is being exceeded, I would think it should be logged.

          Prakash, sorry about closing the bug prematurely.

          prakash Prakash Surya (Inactive) added a comment:

          48 on this node:

          $ cat /proc/cpuinfo | grep proc | wc -l
          48

          adilger Andreas Dilger added a comment:

          Liang, any reason this message should be kept? Would it be better to limit the number of threads?

          Prakash, how many sockets/cores are on this login node?

          prakash Prakash Surya (Inactive) added a comment:

          Unless somebody has a convincing reason why the message should stay, I'm reopening this ticket in the hopes that it is removed.

          prakash Prakash Surya (Inactive) added a comment:

          Thanks. It should really be removed in that case. If an administrator can safely ignore the message, it should not make it to the console.

          utopiabound Nathaniel Clark added a comment:

          The message is just informational. It is printed if you specify more than 128 threads via ptlrpc's ldlm_num_threads module parameter, or if the number of cores is large. See lustre/include/lustre_net.h:250, where "example 3" states:

          On 64-core machine with 8 partitions we will need LDLM_NTHRS_BASE(24)
          threads for each partition to keep service healthy, so total threads
          number should be 24 * 8 = 192.

          This is neither harmful nor worrisome.
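The arithmetic in the quoted "example 3" can be sketched as follows. This is only an illustration of the numbers mentioned in this ticket, not the actual ptlrpc logic: apart from LDLM_NTHRS_BASE, the names are hypothetical, and the sketch assumes the total is simply the per-partition base multiplied by the partition count.

```python
# Values quoted in this ticket. Names other than LDLM_NTHRS_BASE are
# hypothetical and do not correspond to Lustre identifiers.
LDLM_NTHRS_BASE = 24   # threads per partition, from "example 3"
SOFT_LIMIT = 128       # soft limit shown in the console message


def total_threads(partitions: int, base: int = LDLM_NTHRS_BASE) -> int:
    """Total threads when every CPU partition runs `base` threads."""
    return base * partitions


def exceeds_soft_limit(partitions: int, limit: int = SOFT_LIMIT) -> bool:
    """True when the total is above the soft limit (the case the message reports)."""
    return total_threads(partitions) > limit


if __name__ == "__main__":
    for parts in (4, 8):
        print(parts, total_threads(parts), exceeds_soft_limit(parts))
```

With 8 partitions the total is 24 * 8 = 192, which is above 128 and matches the figures in the reported message. The ticket does not state how many partitions the 48-core login node actually had, so that correspondence is an assumption.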

          People

            utopiabound Nathaniel Clark
            prakash Prakash Surya (Inactive)