
opening and closing file can generate 'unreclaimable slab' space

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.1.3, Lustre 2.1.4
    • 3
    • 6116

    Description

      We have many nodes with a large amount of unreclaimable slab memory (over 4 GB each). Whatever we try (manually shrinking the cache, clearing LRU locks, ...), the memory cannot be recovered. The only way to get the memory back is to unmount the Lustre filesystem.
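
To put a number on the symptom above, the unreclaimable slab total can be sampled before and after each recovery attempt. On Linux the kernel reports it as the SUnreclaim line of /proc/meminfo. The following is a minimal sketch, not part of the attached programs; the function names are hypothetical and chosen only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return the SUnreclaim value (in kB) found in /proc/meminfo-style text,
 * or -1 if the field is missing or malformed. */
long sunreclaim_kb_from_text(const char *text)
{
    const char *key = "SUnreclaim:";
    const char *p = text;

    while (p && *p) {
        /* Each field starts at the beginning of a line. */
        if (strncmp(p, key, strlen(key)) == 0) {
            long kb;
            if (sscanf(p + strlen(key), "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        /* Advance to the start of the next line, if any. */
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    return -1;
}

/* Read /proc/meminfo and return SUnreclaim in kB, or -1 on error. */
long read_sunreclaim_kb(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/meminfo", "r");
    size_t n;

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return sunreclaim_kb_from_text(buf);
}
```

Calling read_sunreclaim_kb() before the workload, after it, and again after each recovery attempt gives three values to compare. On an affected client the value would be expected to stay high until the filesystem is unmounted.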

      After some troubleshooting, I was able to write a small reproducer that simply calls open(2) and then close(2) on files with O_RDWR (the reproducer opens thousands of files to amplify the issue).

      Two programs are attached:

      • gentree.c (cc -o gentree gentree.c -lpthread) generates a tree of known files, so reproducer.c does not need to use readdir.
      • reproducer.c (cc -o reproducer reproducer.c -lpthread) reproduces the issue.
        The macro BASE_DIR has to be adjusted to the local cluster configuration (it should name a directory located on a Lustre filesystem).

      There is no link between the two phases: rebooting the client between gentree and reproducer doesn't avoid the problem. Running gentree (which opens as many files as reproducer) doesn't show the issue.
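
The access pattern described above can be sketched as follows. This is not the attached reproducer.c: it is single-threaded (the attachments link against pthread), the file layout <dir>/f0, <dir>/f1, ... is a hypothetical one chosen for illustration, and create_files() is only a stand-in for gentree.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for gentree: create <dir>/f0 .. <dir>/f<nfiles-1>.
 * Returns the number of files created, or -1 if the directory
 * cannot be created. */
int create_files(const char *dir, int nfiles)
{
    char path[4096];
    int created = 0;

    if (mkdir(dir, 0755) != 0 && errno != EEXIST)
        return -1;
    for (int i = 0; i < nfiles; i++) {
        snprintf(path, sizeof(path), "%s/f%d", dir, i);
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0)
            continue;
        if (close(fd) == 0)
            created++;
    }
    return created;
}

/* The pattern from the description: open(2) each existing file with
 * O_RDWR, then close(2) it immediately, with no read or write.
 * Returns the number of successful open/close pairs. */
int open_close_cycle(const char *dir, int nfiles)
{
    char path[4096];
    int ok = 0;

    for (int i = 0; i < nfiles; i++) {
        snprintf(path, sizeof(path), "%s/f%d", dir, i);
        int fd = open(path, O_RDWR);
        if (fd < 0)
            continue;   /* missing or unreadable file: skip it */
        if (close(fd) == 0)
            ok++;
    }
    return ok;
}
```

To mirror the report, dir would point at a directory on a Lustre mount and nfiles would be in the thousands. Whether this simplified single-threaded loop triggers the same slab growth as the attached program is not confirmed here.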

      Attachments

        1. gentree.c
          3 kB
          Alexandre Louvet
        2. logs_01.tar.gz
          7 kB
          Alexandre Louvet
        3. reproducer.c
          2 kB
          Alexandre Louvet

        Issue Links

          Activity

            [LU-2613] opening and closing file can generate 'unreclaimable slab' space
            pjones Peter Jones made changes -
            Labels Original: JL New: JL mn4
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            Labels Original: JL mq114 New: JL
            pjones Peter Jones made changes -
            Labels Original: JL llnl mq114 New: JL mq114
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-4272 [ LU-4272 ]
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.6.0 [ 10595 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-4270 [ LU-4270 ]
            pjones Peter Jones made changes -
            Labels Original: JL llnl mq413 New: JL llnl mq114
            pjones Peter Jones made changes -
            Labels Original: JL llnl New: JL llnl mq413
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.5.1 [ 10608 ]
            Fix Version/s Original: Lustre 2.5.0 [ 10295 ]

            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: louveta Alexandre Louvet (Inactive)
              Votes: 1
              Watchers: 31

              Dates

                Created:
                Updated:
                Resolved: