  1. Lustre
  2. LU-2613

opening and closing file can generate 'unreclaimable slab' space


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.1.3, Lustre 2.1.4
    • 3
    • 6116

      We have a lot of nodes with a large amount of unreclaimable memory (over 4GB). Whatever we try (manually shrinking the cache, clearing LRU locks, ...), the memory can't be recovered. The only way to get the memory back is to unmount the Lustre filesystem.
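
      For readers who want to track the figure described above, here is a minimal sketch of reading the unreclaimable slab size on a Linux client. It assumes the standard `SUnreclaim` field of /proc/meminfo is the relevant counter; the helper name `meminfo_kb` is hypothetical and is not part of the attached programs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the value (in kB) of a /proc/meminfo field
 * such as "SUnreclaim", or -1 if the file cannot be read or the field
 * is not present. */
long meminfo_kb(const char *field)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    char line[256];
    size_t len = strlen(field);
    long value = -1;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Lines look like "SUnreclaim:      123456 kB". */
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            if (sscanf(line + len + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return value;
}
```

      Calling `meminfo_kb("SUnreclaim")` before and after a test run, and again after attempting to shrink the caches, gives a rough view of whether the slab memory was given back.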

      After some troubleshooting, I was able to write a small reproducer that just calls open(2) then close(2) on files with O_RDWR (the reproducer opens thousands of files to emphasize the issue).

      Two programs are attached:

      • gentree.c (cc -o gentree gentree.c -lpthread) to generate a tree of known files (so there is no need to use readdir in reproducer.c)
      • reproducer.c (cc -o reproducer reproducer.c -lpthread) to reproduce the issue.
        The macro BASE_DIR must be adjusted to match the local cluster configuration (it should name a directory located on a Lustre filesystem).

      The two phases are independent: rebooting the client between gentree and reproducer doesn't avoid the problem. Running gentree alone (which opens as many files as reproducer) doesn't show the issue.
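
      The open(2)/close(2) pattern described above can be sketched roughly as follows. This is not the attached reproducer.c: it is single-threaded, and the function name `open_close_files` and the `f<N>` file naming are assumptions made for illustration only.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch: open each of <dir>/f0 .. <dir>/f<count-1> with
 * O_RDWR and close it immediately, without reading or writing.
 * Files that cannot be opened are skipped.
 * Returns the number of files that were opened successfully. */
int open_close_files(const char *dir, int count)
{
    char path[4096];
    int opened = 0;

    for (int i = 0; i < count; i++) {
        int n = snprintf(path, sizeof(path), "%s/f%d", dir, i);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;               /* path did not fit in the buffer */

        int fd = open(path, O_RDWR);
        if (fd < 0)
            continue;               /* missing or unreadable file */

        opened++;
        close(fd);
    }
    return opened;
}
```

      Pointing `dir` at a directory on a Lustre filesystem that already contains the files, with a count in the thousands, mirrors the scale mentioned in the description.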

        1. gentree.c
          3 kB
        2. logs_01.tar.gz
          7 kB
        3. reproducer.c
          2 kB

            Assignee: Hongchao Zhang
            Reporter: Alexandre Louvet (Inactive)
            Votes: 1
            Watchers: 31
