LU-1212: On MDS startup, mdt_xx threads consume all available CPU once clients connect

    Details

    • Severity:
      3
    • Rank (Obsolete):
      4687

      Description

      Found during IR testing at ORNL.

      Shortly after MDS startup, once clients begin connecting, all mdt_xx threads start consuming all available CPU.

      We dumped the task states with sysrq-t, and all of those threads were in grow_rqbd.
      I checked the code: once a thread reaches that state, it runs a loop with no early exit that performs 64 * num_online_cpus() (16 on this machine) = 1024 allocations of 16k each.
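
      As a rough check of the numbers above, here is a small sketch of the arithmetic for one pass through that loop. The constant and function names are illustrative only and are not Lustre identifiers; "16k" is assumed to mean 16 KiB.

```python
# Back-of-the-envelope sizing for a single pass through the grow loop.
# Names are hypothetical; values are the ones quoted in the report.
BUFS_PER_CPU = 64    # multiplier quoted in the report
ONLINE_CPUS = 16     # online CPUs on the affected MDS
BUF_SIZE_KIB = 16    # "16k" per allocation, assumed to be 16 KiB


def allocations_per_pass(bufs_per_cpu: int, online_cpus: int) -> int:
    """Number of buffers one thread allocates in one pass."""
    return bufs_per_cpu * online_cpus


def kib_per_pass(bufs_per_cpu: int, online_cpus: int, buf_size_kib: int) -> int:
    """Memory requested by one thread in one pass, in KiB."""
    return allocations_per_pass(bufs_per_cpu, online_cpus) * buf_size_kib


if __name__ == "__main__":
    allocs = allocations_per_pass(BUFS_PER_CPU, ONLINE_CPUS)
    kib = kib_per_pass(BUFS_PER_CPU, ONLINE_CPUS, BUF_SIZE_KIB)
    print(f"{allocs} allocations, {kib} KiB ({kib // 1024} MiB) per thread")
```

      Under these assumptions a single thread requests 16 MiB per pass, which is modest on its own; the cost comes from how many threads do it at once.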

      The condition for entering that loop is racy: it only checks that the number of posted rqbds is below nbuf_group/2.
      So if 1000 threads pass that check at the same time, we get 1000 threads each doing 1024 of those allocations.
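
      The race described above is a check-then-act pattern. The sketch below is a toy model of it, not Lustre code: every name is hypothetical, the barrier exists only to force the worst case where all threads evaluate the condition before any of them posts buffers, and the locked variant is just one possible way to serialize the check, not necessarily how this issue was or should be fixed.

```python
import threading

NBUF_GROUP = 1024  # buffers added per pass (64 * 16 in the report)


def run(n_threads: int, serialize: bool) -> int:
    """Return how many threads ended up running the grow pass."""
    posted = 0   # model of the "posted rqbds" counter
    growers = 0  # threads that decided to grow
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker() -> None:
        nonlocal posted, growers
        if serialize:
            # Check and update under one lock: only the first thread grows.
            with lock:
                if posted < NBUF_GROUP // 2:
                    posted += NBUF_GROUP
                    growers += 1
            return
        need_grow = posted < NBUF_GROUP // 2  # unlocked check
        barrier.wait()  # worst case: everyone has checked before anyone posts
        if need_grow:
            with lock:  # protects only this model's counters
                posted += NBUF_GROUP
                growers += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return growers


def total_allocations(n_threads: int, per_pass: int = NBUF_GROUP) -> int:
    """Allocations issued if every thread runs one full pass."""
    return n_threads * per_pass


if __name__ == "__main__":
    print("unlocked check:", run(8, serialize=False), "of 8 threads grow")
    print("locked check:  ", run(8, serialize=True), "of 8 threads grow")
    print("1000 threads:  ", total_allocations(1000), "allocations")
```

      With the figures from the report, 1000 threads passing the check together would issue 1,024,000 allocations, or roughly 15.6 GiB if each is 16 KiB.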

      We have a kdump log, but it still needs to be transferred.

              People

              • Assignee:
                Liang Zhen (liang, Inactive)
              • Reporter:
                Ian Colle (ian, Inactive)