Lustre / LU-3897

Hang up in ldlm_pools_shrink under OOM

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • None
    • Affects Version: Lustre 2.1.3
    • None
    • 3
    • 10187

    Description

      Hi,

      Several Bull customers running Lustre 2.1.x have hit a hang in ldlm_pools_shrink() on a Lustre client (login node).
      The system was hung, and a dump was initiated from the BMC by sending an NMI.
      The dump shows no remaining activity on the system: all 12 CPUs are idle (swapper).
      Many processes are in page_fault(), blocked in ldlm_pools_shrink().
      I have attached the output of the "foreach bt" crash command. Let me know if you need the vmcore file.

      Each time, we can see a lot of OOM messages in the syslog of the dump files.

      This issue looks like LU-2468.

      Thanks,
      Sebastien.

          People

            Assignee: bobijam Zhenyu Xu
            Reporter: sebastien.buisson Sebastien Buisson (Inactive)
