Lustre / LU-1150

1.8 client using excessive slab when mounting 2.1 server

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 1.8.x (1.8.0 - 1.8.5)
    • None
    • Lustre 1.8.5.0-6chaos
    • 3
    • 6443

    Description

      We have found that our 1.8 clients are using excessive amounts of slab now that they are mounting 2.1 server filesystems. The memory is not lost, per se, because a umount of the offending filesystem results in the memory being freed and the node returning to normal operation.

      But we have seen slab usage grow to as much as 20GB on a node with only 24GB of RAM. This is all in the generic slabs (mostly 16k and 8k), not in a Lustre-named slab. Buffers and cache are necessarily nearly nonexistent when the slab usage reaches these numbers.
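      For anyone trying to reproduce the numbers above, here is a minimal sketch of how per-cache slab usage could be totalled from /proc/slabinfo. It is not part of the original report. The helper name slab_top and its two parameters are hypothetical, and the footprint is only approximated as num_objs * objsize (columns 3 and 4 of the slabinfo 2.x layout), which ignores per-slab overhead. On SLAB-allocator kernels of this era the generic caches are assumed to appear under names such as size-16384 and size-8192.

```shell
# slab_top: hypothetical helper, not a Lustre or kernel tool.
# Reads slabinfo-format input and prints "<cache> <MiB>" for the largest caches.
#   $1  slabinfo-format file to read (default: /proc/slabinfo)
#   $2  number of caches to show (default: 10)
slab_top() {
    # Skip the version and column-header lines, then approximate each
    # cache's footprint as num_objs ($3) * objsize ($4), converted to MiB.
    LC_ALL=C awk '!/^(slabinfo|#)/ && NF >= 4 {
        printf "%s %.1f\n", $1, ($3 * $4) / 1048576
    }' "${1:-/proc/slabinfo}" |
        # Largest consumers first, sorted numerically on the MiB column.
        LC_ALL=C sort -k2,2 -n -r |
        head -n "${2:-10}"
}
```

      Running something like "slab_top /proc/slabinfo 5" as root before and after the umount should show whether the generic 16k and 8k caches account for the growth.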

      Our 1.8 is a bit old, so if you know of a fix that is in newer 1.8 versions let me know. I tried searching a bit, but didn't find anything that looked promising. Really, I am not too concerned about fixing 1.8, but I think we need to know enough about the problem to figure out if 2.1 will have the same issue or not.

      I enabled full Lustre debugging and used the debug daemon to collect logs while unmounting. Something like this:

      lctl debug_daemon enable; umount /p/lscratchd; lctl debug_daemon disable

      The log isn't as long as I would have liked, so I probably need to let it run longer before disabling. But

            People

              wc-triage WC Triage
              morrone Christopher Morrone
