Lustre / LU-1151

1.8 client using excessive slab when mounting 2.1 server


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Affects Version: lustre 1.8.5.0-6chaos

    Description

      We have Lustre 1.8 clients mounting 2.1 servers. Slab usage on these clients grows to excessive levels, for instance 20GB of slab on a 24GB node. On a node with 20GB of slab, only about 13GB was accounted for in /proc/sys/lustre/memused.
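
      For reference, those two numbers come from the obvious places; something like this on one of the bad nodes (the 20GB/13GB figures above are from one such node):

        grep Slab /proc/meminfo        # total kernel slab, ~20GB on the worst nodes
        cat /proc/sys/lustre/memused   # memory the Lustre client accounts for itself, ~13GB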

      Upon unmounting the 2.1 filesystem, nearly all of the 20GB of slab was freed very quickly.

      The vast majority of the slab usage is in the generic slabs, mostly the 16k and 8k caches.
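
      The per-cache breakdown is from /proc/slabinfo (or slabtop). On these kernels the generic caches should show up as size-8192 and size-16384, though the exact names depend on the kernel's allocator:

        slabtop -o | head -n 20
        grep -E '^size-(8192|16384) ' /proc/slabinfo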

      I am attaching the log "sierra1150_lustre.log.txt.bz2", which was gathered by enabling full debugging and running something like this:

      lctl debug_daemon enable; umount /p/lscratchd; lctl debug_daemon disable
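
      Spelled out (the enable/disable above is from memory), debug_daemon takes a file and an optional size in MB, so the invocation was roughly the following; the path and size here are just illustrative:

        echo -1 > /proc/sys/lnet/debug                  # turn on full debugging
        lctl debug_daemon start /tmp/lu1151-debug 1024  # binary debug log, up to 1024MB
        umount /p/lscratchd
        lctl debug_daemon stop
        lctl debug_file /tmp/lu1151-debug sierra1150_lustre.log.txt   # convert binary log to text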

      The log isn't quite as large as I would have liked, so I probably need to let the daemon run longer, or maybe try to use a larger in-memory buffer, if that is even possible with slab usage as it is.
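
      If a larger in-memory buffer turns out to be an option, the knob I have in mind is debug_mb (value in MB); whether we can spare the memory with slab in this state is another matter. The 512 below is just an example:

        cat /proc/sys/lnet/debug_mb          # current debug ring buffer size
        echo 512 > /proc/sys/lnet/debug_mb   # bump it to 512MB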

      Looking at the rest of the cluster after clearing some of the worst offenders, slab usage still varies pretty widely, from 1GB to 9GB across the remaining nodes, and I suspect it only increases with time.
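
      A quick way to survey this across nodes is something along these lines (pdsh is just an assumption here, any parallel shell works):

        pdsh -a 'grep Slab: /proc/meminfo' | sort -n -k3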

      This slab is not freed under normal memory pressure, and it is causing applications to OOM, so this is a pretty serious problem for our production systems at the moment.
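
      To be clear about "not freed": this does not look like reclaimable page cache or dentries/inodes. The kind of check that shows this, assuming drop_caches is available on these kernels, is roughly:

        sync
        echo 3 > /proc/sys/vm/drop_caches
        grep Slab /proc/meminfo   # Slab barely moves; only unmounting gives the memory back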

      I don't really need a fix for 1.8, but if the problem is shared by the 2.1 code, we'll definitely need a 2.1 fix.

      Our 1.8 is getting old, so perhaps there is already a fix that I missed, but a few searches didn't turn up anything obvious.

Attachments

    sierra1150_lustre.log.txt.bz2

People

    Assignee: Zhenyu Xu (bobijam)
    Reporter: Christopher Morrone (morrone)
    Votes: 0
    Watchers: 9
