
WRF runs causing Lustre clients to lose memory

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Minor

    Description

      At our center, we are running a Lustre 2.1.2 file system with Lustre 2.1.2 clients on all of the compute nodes of our Penguin cluster. Recently, a user has been performing WRF runs where he uses a special feature of WRF to offload all of the I/O onto a single node, which improves his I/O performance dramatically, but results in the node losing ~1 GB of memory to "Inactive" after each run. In our epilogue, we have a script checking for available free memory above a specified percentage, and every job that this user runs results in the node being set to offline due to this 1 GB of Inactive memory.
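      For reference, here is a rough sketch of the check our epilogue performs (the threshold and the offline command are site-specific and shown here only as placeholders):

      #!/bin/bash
      # Sketch of the epilogue memory check (illustrative values and commands)
      MIN_FREE_PCT=85                      # placeholder threshold, not our exact value
      echo 3 > /proc/sys/vm/drop_caches    # drop pagecache, dentries, and inodes
      total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
      free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
      free_pct=$(( free_kb * 100 / total_kb ))
      if [ "$free_pct" -lt "$MIN_FREE_PCT" ]; then
          # placeholder: our scheduler's actual offline command differs
          pbsnodes -o -N "low free memory after job" "$(hostname)"
      fi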

      Here is an example of the memory statistics from one of these nodes, before and after the epilogue runs drop_caches:

      Before:
      MemTotal: 15.681 GB
      MemFree: 6.495 GB
      Cached: 6.206 GB
      Active: 1.395 GB
      Inactive: 6.247 GB
      Dirty: 0.000 GB
      Mapped: 0.003 GB
      Slab: 1.391 GB

      After:
      MemTotal: 15.681 GB
      MemFree: 14.003 GB
      Cached: 0.007 GB
      Active: 0.134 GB
      Inactive: 1.309 GB
      Dirty: 0.000 GB
      Mapped: 0.003 GB
      Slab: 0.082 GB

      While looking for possible solutions to this problem, I stumbled upon a recent HPDD-Discuss question titled "Possible file page leak in Lustre 2.1.2" which was very similar to our problem. It was suggested that the issue had already been discovered and resolved in http://jira.whamcloud.com/browse/LU-1576.

      This ticket suggests that the resolution was included as part of Lustre 2.1.3, so we tested this by installing the Lustre 2.1.3 client packages on some of our compute nodes and allowing the WRF job to run on these nodes. However, even after the upgrade to Lustre 2.1.3, we still saw the inactive memory at the end of the job. Do we need to upgrade our Lustre installation on the OSSes and MDS to Lustre 2.1.3 to fix this problem, or do you have any other suggestions?
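      In case it is useful, this is roughly how we confirmed which Lustre version each node is actually running (a quick sketch; the rpm query assumes RPM-based installs):

      # Report the Lustre version of the running modules
      lctl get_param version
      # Cross-check against the installed packages
      rpm -qa | grep -i lustre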

      Any help that you could provide us with would be appreciated!

      Attachments

        Activity

          [LU-2795] WRF runs causing Lustre clients to lose memory

          adilger Andreas Dilger added a comment -

          Closing this old ticket.

          Just because memory is not "Free" doesn't mean that it is "leaked". The kernel will cache pages even if they are unused, until all of the free memory is consumed, and then old data will be freed.

          The main concern would be if the node actually runs out of memory and applications start failing (OOM killer, or -ENOMEM=-12 memory allocation errors).
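          A quick way to check whether that is happening after a run, for example:

          # Look for OOM-killer activity and allocation failures in the kernel log
          dmesg | grep -iE 'out of memory|oom-killer|page allocation failure'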

          green Oleg Drokin added a comment -

          So, just to reconfirm: when you run this app several times on the same client, does it add 1 GB more of inactive data each time, so that eventually the node will die with an OOM?
          If you unmount the Lustre filesystem on this client after the run and then mount it back, instead of rebooting, is the memory reclaimed?

          I would not put too much weight on the leaks you see reported; those readings are not useful unless taken after unmount, since every bit of memory that is allocated but not yet freed (because it is still in use) will show up as leaked.
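          Something along these lines would be enough to check (a rough sketch; the mount point and filesystem name below are placeholders):

          grep -E 'MemFree|Inactive|Slab' /proc/meminfo     # reading before unmount
          umount /mnt/lustre                                # unmount the Lustre client
          grep -E 'MemFree|Inactive|Slab' /proc/meminfo     # reading after unmount
          mount -t lustre mgsnode@o2ib:/fsname /mnt/lustre  # remount instead of rebooting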


          cliffw Cliff White (Inactive) added a comment -

          Thank you, I am engaging further engineering resources now.

          adizon Archie Dizon added a comment -

          NOTE: The debug log is too large to attach to this case. Here is a link instead.

          https://www.dropbox.com/s/vwuklioioytcl7e/lustre_debug

          Thanks

          adizon Archie Dizon added a comment -

          The customer ran their WRF job with Lustre debugging set to gather malloc information, and it does appear that we have found a leak in Lustre. Here are the steps we followed:
          1) sudo lctl set_param debug=+malloc
          2) sudo lctl set_param debug_mb=512
          3) * let the WRF job run *
          Epilogue sets the node offline (1.62 GB of memory set to inactive)
          4) sudo lctl dk /tmp/lustre_debug
          5) perl leak_finder.pl /tmp/lustre_debug 2>&1 | grep "Leak"

          From that last command, here is what we found:

              • Leak: 1080 bytes allocated at ffff8101d5eae140 (super25.c:ll_alloc_inode:56, debug file line 1745506)
              • Leak: 104 bytes allocated at ffff8101cecdbdc0 (dcache.c:ll_set_dd:192, debug file line 1745508)
              • Leak: 1080 bytes allocated at ffff810214eb3ac0 (super25.c:ll_alloc_inode:56, debug file line 1745551)
              • Leak: 104 bytes allocated at ffff8101d7523840 (dcache.c:ll_set_dd:192, debug file line 1745553)

          The Lustre documentation states that this is a circular log, so I would imagine that if a small leak shows up here, small amounts of memory could have been lost throughout the run, adding up to our overall large leak.
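          To reduce the chance of earlier allocations being overwritten in the circular buffer, we may try streaming the debug log to a file for the entire run next time; a sketch using the lctl debug_daemon facility (the path and size below are arbitrary):

          sudo lctl set_param debug=+malloc
          sudo lctl debug_daemon start /tmp/lustre_debug_stream 1024   # stream the debug log to a file, ~1 GB cap
          # ... let the WRF job run ...
          sudo lctl debug_daemon stop
          perl leak_finder.pl /tmp/lustre_debug_stream 2>&1 | grep "Leak"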

          We will attach the lustre_debug log to this case for you to analyze as well. It does look as though we may be narrowing in on the problem now, though.

          Additionally, I am going to attach the /proc/slabinfo for the end of the
          WRF run as you had previously requested, along with the /proc/meminfo
          before, during, and after the WRF run.
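          For completeness, the snapshots below were captured by hand, roughly like this (run at each phase of the job):

          # Run before the job, during the run, and again after the epilogue
          date >> /tmp/wrf_mem_snapshots
          cat /proc/meminfo >> /tmp/wrf_mem_snapshots
          cat /proc/slabinfo >> /tmp/wrf_mem_snapshots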

          cat /proc/slabinfo
          slabinfo - version: 2.1

          # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
            ll_qunit_cache 0 0 112 34 1 : tunables 120 60 8
            : slabdata 0 0 0
            lmv_objects 0 0 96 40 1 : tunables 120 60 8
            : slabdata 0 0 0
            ccc_req_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            ccc_session_kmem 105 132 176 22 1 : tunables 120 60 8
            : slabdata 6 6 0
            ccc_thread_kmem 112 121 336 11 1 : tunables 54 27 8
            : slabdata 11 11 0
            ccc_object_kmem 0 0 256 15 1 : tunables 120 60 8
            : slabdata 0 0 0
            ccc_lock_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            vvp_session_kmem 105 148 104 37 1 : tunables 120 60 8
            : slabdata 4 4 0
            vvp_thread_kmem 112 126 440 9 1 : tunables 54 27 8
            : slabdata 14 14 0
            vvp_page_kmem 0 0 80 48 1 : tunables 120 60 8
            : slabdata 0 0 0
            ll_rmtperm_hash_cache 0 0 256 15 1 : tunables 120 60
            8 : slabdata 0 0 0
            ll_remote_perm_cache 0 0 40 92 1 : tunables 120 60
            8 : slabdata 0 0 0
            ll_file_data 0 0 192 20 1 : tunables 120 60 8
            : slabdata 0 0 0
            lustre_inode_cache 3 21 1088 7 2 : tunables 24 12 8
            : slabdata 3 3 0
            lov_oinfo 0 0 320 12 1 : tunables 54 27 8
            : slabdata 0 0 0
            lov_lock_link_kmem 0 0 32 112 1 : tunables 120 60 8
            : slabdata 0 0 0
            lovsub_req_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            lovsub_object_kmem 0 0 240 16 1 : tunables 120 60 8
            : slabdata 0 0 0
            lovsub_lock_kmem 0 0 64 59 1 : tunables 120 60 8
            : slabdata 0 0 0
            lovsub_page_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            lov_req_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            lov_session_kmem 105 120 384 10 1 : tunables 54 27 8
            : slabdata 12 12 0
            lov_thread_kmem 112 121 336 11 1 : tunables 54 27 8
            : slabdata 11 11 0
            lov_object_kmem 0 0 200 19 1 : tunables 120 60 8
            : slabdata 0 0 0
            lov_lock_kmem 0 0 104 37 1 : tunables 120 60 8
            : slabdata 0 0 0
            lov_page_kmem 0 0 48 77 1 : tunables 120 60 8
            : slabdata 0 0 0
            osc_req_kmem 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            osc_session_kmem 105 130 296 13 1 : tunables 54 27 8
            : slabdata 10 10 0
            osc_thread_kmem 112 126 216 18 1 : tunables 120 60 8
            : slabdata 7 7 0
            osc_object_kmem 0 0 136 28 1 : tunables 120 60 8
            : slabdata 0 0 0
            osc_lock_kmem 0 0 184 21 1 : tunables 120 60 8
            : slabdata 0 0 0
            osc_page_kmem 0 0 264 15 1 : tunables 54 27 8
            : slabdata 0 0 0
            llcd_cache 0 0 3952 1 1 : tunables 24 12 8
            : slabdata 0 0 0
            interval_node 22 90 128 30 1 : tunables 120 60 8
            : slabdata 3 3 0
            ldlm_locks 43 63 576 7 1 : tunables 54 27 8
            : slabdata 9 9 0
            ldlm_resources 41 72 320 12 1 : tunables 54 27 8
            : slabdata 6 6 0
            cl_page_kmem 0 0 184 21 1 : tunables 120 60 8
            : slabdata 0 0 0
            cl_lock_kmem 0 0 216 18 1 : tunables 120 60 8
            : slabdata 0 0 0
            cl_env_kmem 105 132 176 22 1 : tunables 120 60 8
            : slabdata 6 6 0
            capa_cache 0 0 184 21 1 : tunables 120 60 8
            : slabdata 0 0 0
            ll_import_cache 0 0 1424 5 2 : tunables 24 12 8
            : slabdata 0 0 0
            ll_obdo_cache 0 0 208 19 1 : tunables 120 60 8
            : slabdata 0 0 0
            ll_obd_dev_cache 17 17 7048 1 2 : tunables 8 4 0
            : slabdata 17 17 0
            SDP 0 0 1792 2 1 : tunables 24 12 8
            : slabdata 0 0 0
            fib6_nodes 7 118 64 59 1 : tunables 120 60 8
            : slabdata 2 2 0
            ip6_dst_cache 7 36 320 12 1 : tunables 54 27 8
            : slabdata 3 3 0
            ndisc_cache 1 15 256 15 1 : tunables 120 60 8
            : slabdata 1 1 0
            RAWv6 11 12 960 4 1 : tunables 54 27 8
            : slabdata 3 3 0
            UDPv6 0 0 896 4 1 : tunables 54 27 8
            : slabdata 0 0 0
            tw_sock_TCPv6 0 0 192 20 1 : tunables 120 60 8
            : slabdata 0 0 0
            request_sock_TCPv6 0 0 192 20 1 : tunables 120 60 8
            : slabdata 0 0 0
            TCPv6 0 0 1728 4 2 : tunables 24 12 8
            : slabdata 0 0 0
            nfs_direct_cache 0 0 136 28 1 : tunables 120 60 8
            : slabdata 0 0 0
            nfs_write_data 36 36 832 9 2 : tunables 54 27 8
            : slabdata 4 4 0
            nfs_read_data 32 36 832 9 2 : tunables 54 27 8
            : slabdata 4 4 0
            nfs_inode_cache 123 195 1032 3 1 : tunables 24 12 8
            : slabdata 65 65 0
            nfs_page 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            rpc_buffers 8 8 2048 2 1 : tunables 24 12 8
            : slabdata 4 4 0
            rpc_tasks 20 20 384 10 1 : tunables 54 27 8
            : slabdata 2 2 0
            rpc_inode_cache 30 30 768 5 1 : tunables 54 27 8
            : slabdata 6 6 0
            scsi_cmd_cache 5 10 384 10 1 : tunables 54 27 8
            : slabdata 1 1 2
            sgpool-128 32 32 4096 1 1 : tunables 24 12 8
            : slabdata 32 32 0
            sgpool-64 32 32 2048 2 1 : tunables 24 12 8
            : slabdata 16 16 0
            sgpool-32 32 32 1024 4 1 : tunables 54 27 8
            : slabdata 8 8 0
            sgpool-16 32 32 512 8 1 : tunables 54 27 8
            : slabdata 4 4 0
            sgpool-8 32 60 256 15 1 : tunables 120 60 8
            : slabdata 3 4 0
            scsi_io_context 0 0 112 34 1 : tunables 120 60 8
            : slabdata 0 0 0
            ib_mad 2048 2296 448 8 1 : tunables 54 27 8
            : slabdata 287 287 0
            ip_fib_alias 14 59 64 59 1 : tunables 120 60 8
            : slabdata 1 1 0
            ip_fib_hash 14 59 64 59 1 : tunables 120 60 8
            : slabdata 1 1 0
            UNIX 9 33 704 11 2 : tunables 54 27 8
            : slabdata 3 3 0
            flow_cache 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            msi_cache 9 59 64 59 1 : tunables 120 60 8
            : slabdata 1 1 0
            cfq_ioc_pool 13 60 128 30 1 : tunables 120 60 8
            : slabdata 2 2 0
            cfq_pool 11 54 216 18 1 : tunables 120 60 8
            : slabdata 3 3 0
            crq_pool 4 96 80 48 1 : tunables 120 60 8
            : slabdata 1 2 0
            deadline_drq 0 0 80 48 1 : tunables 120 60 8
            : slabdata 0 0 0
            as_arq 0 0 96 40 1 : tunables 120 60 8
            : slabdata 0 0 0
            mqueue_inode_cache 1 4 896 4 1 : tunables 54 27 8
            : slabdata 1 1 0
            isofs_inode_cache 0 0 608 6 1 : tunables 54 27 8
            : slabdata 0 0 0
            hugetlbfs_inode_cache 1 7 576 7 1 : tunables 54 27
            8 : slabdata 1 1 0
            ext2_inode_cache 91 145 720 5 1 : tunables 54 27 8
            : slabdata 29 29 0
            ext2_xattr 0 0 88 44 1 : tunables 120 60 8
            : slabdata 0 0 0
            dnotify_cache 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            dquot 0 0 256 15 1 : tunables 120 60 8
            : slabdata 0 0 0
            eventpoll_pwq 5 106 72 53 1 : tunables 120 60 8
            : slabdata 2 2 0
            eventpoll_epi 5 40 192 20 1 : tunables 120 60 8
            : slabdata 2 2 0
            inotify_event_cache 0 0 40 92 1 : tunables 120 60
            8 : slabdata 0 0 0
            inotify_watch_cache 0 0 72 53 1 : tunables 120 60
            8 : slabdata 0 0 0
            kioctx 0 0 320 12 1 : tunables 54 27 8
            : slabdata 0 0 0
            kiocb 0 0 256 15 1 : tunables 120 60 8
            : slabdata 0 0 0
            fasync_cache 0 0 24 144 1 : tunables 120 60 8
            : slabdata 0 0 0
            shmem_inode_cache 1360 1370 768 5 1 : tunables 54 27 8
            : slabdata 274 274 0
            posix_timers_cache 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            uid_cache 2 30 128 30 1 : tunables 120 60 8
            : slabdata 1 1 0
            ip_mrt_cache 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            tcp_bind_bucket 28 448 32 112 1 : tunables 120 60 8
            : slabdata 4 4 0
            inet_peer_cache 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            secpath_cache 0 0 64 59 1 : tunables 120 60 8
            : slabdata 0 0 0
            xfrm_dst_cache 0 0 384 10 1 : tunables 54 27 8
            : slabdata 0 0 0
            ip_dst_cache 107 180 384 10 1 : tunables 54 27 8
            : slabdata 18 18 0
            arp_cache 53 75 256 15 1 : tunables 120 60 8
            : slabdata 5 5 0
            RAW 9 10 768 5 1 : tunables 54 27 8
            : slabdata 2 2 0
            UDP 10 15 768 5 1 : tunables 54 27 8
            : slabdata 3 3 0
            tw_sock_TCP 20 40 192 20 1 : tunables 120 60 8
            : slabdata 1 2 0
            request_sock_TCP 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            TCP 32 35 1600 5 2 : tunables 24 12 8
            : slabdata 7 7 0
            blkdev_ioc 13 118 64 59 1 : tunables 120 60 8
            : slabdata 2 2 0
            blkdev_queue 17 20 1576 5 2 : tunables 24 12 8
            : slabdata 4 4 0
            blkdev_requests 7 14 272 14 1 : tunables 54 27 8
            : slabdata 1 1 2
            biovec-256 7 7 4096 1 1 : tunables 24 12 8
            : slabdata 7 7 0
            biovec-128 7 8 2048 2 1 : tunables 24 12 8
            : slabdata 4 4 0
            biovec-64 7 8 1024 4 1 : tunables 54 27 8
            : slabdata 2 2 0
            biovec-16 7 30 256 15 1 : tunables 120 60 8
            : slabdata 2 2 0
            biovec-4 7 118 64 59 1 : tunables 120 60 8
            : slabdata 2 2 0
            biovec-1 7 404 16 202 1 : tunables 120 60 8
            : slabdata 2 2 0
            bio 262 300 128 30 1 : tunables 120 60 8
            : slabdata 10 10 2
            utrace_engine_cache 0 0 64 59 1 : tunables 120 60
            8 : slabdata 0 0 0
            utrace_cache 0 0 64 59 1 : tunables 120 60 8
            : slabdata 0 0 0
            sock_inode_cache 90 108 640 6 1 : tunables 54 27 8
            : slabdata 18 18 0
            skbuff_fclone_cache 14 14 512 7 1 : tunables 54 27
            8 : slabdata 2 2 0
            skbuff_head_cache 2847 3060 256 15 1 : tunables 120 60 8
            : slabdata 204 204 0
            file_lock_cache 1 22 176 22 1 : tunables 120 60 8
            : slabdata 1 1 0
            Acpi-Operand 1848 2360 64 59 1 : tunables 120 60 8
            : slabdata 40 40 0
            Acpi-ParseExt 0 0 64 59 1 : tunables 120 60 8
            : slabdata 0 0 0
            Acpi-Parse 0 0 40 92 1 : tunables 120 60 8
            : slabdata 0 0 0
            Acpi-State 0 0 80 48 1 : tunables 120 60 8
            : slabdata 0 0 0
            Acpi-Namespace 839 896 32 112 1 : tunables 120 60 8
            : slabdata 8 8 0
            delayacct_cache 379 531 64 59 1 : tunables 120 60 8
            : slabdata 9 9 0
            taskstats_cache 19 53 72 53 1 : tunables 120 60 8
            : slabdata 1 1 0
            proc_inode_cache 146 180 592 6 1 : tunables 54 27 8
            : slabdata 30 30 0
            sigqueue 53 96 160 24 1 : tunables 120 60 8
            : slabdata 4 4 0
            radix_tree_node 9320 15316 536 7 1 : tunables 54 27 8
            : slabdata 2188 2188 0
            bdev_cache 6 12 832 4 1 : tunables 54 27 8
            : slabdata 3 3 0
            sysfs_dir_cache 5366 5412 88 44 1 : tunables 120 60 8
            : slabdata 123 123 0
            mnt_cache 42 60 256 15 1 : tunables 120 60 8
            : slabdata 4 4 0
            inode_cache 1231 1274 560 7 1 : tunables 54 27 8
            : slabdata 182 182 0
            dentry_cache 3139 4140 216 18 1 : tunables 120 60 8
            : slabdata 230 230 0
            filp 200 570 256 15 1 : tunables 120 60 8
            : slabdata 38 38 0
            names_cache 9 9 4096 1 1 : tunables 24 12 8
            : slabdata 9 9 0
            avc_node 30 106 72 53 1 : tunables 120 60 8
            : slabdata 2 2 0
            selinux_inode_security 3124 4032 80 48 1 : tunables 120 60
            8 : slabdata 84 84 0
            key_jar 4 20 192 20 1 : tunables 120 60 8
            : slabdata 1 1 0
            idr_layer_cache 199 238 528 7 1 : tunables 54 27 8
            : slabdata 34 34 0
            buffer_head 148 320 96 40 1 : tunables 120 60 8
            : slabdata 8 8 0
            mm_struct 24 32 896 4 1 : tunables 54 27 8
            : slabdata 8 8 0
            vm_area_struct 428 1430 176 22 1 : tunables 120 60 8
            : slabdata 65 65 1
            fs_cache 50 177 64 59 1 : tunables 120 60 8
            : slabdata 3 3 0
            files_cache 35 60 768 5 1 : tunables 54 27 8
            : slabdata 12 12 0
            signal_cache 367 378 832 9 2 : tunables 54 27 8
            : slabdata 42 42 0
            sighand_cache 357 360 2112 3 2 : tunables 24 12 8
            : slabdata 120 120 0
            task_struct 368 370 1920 2 1 : tunables 24 12 8
            : slabdata 185 185 0
            anon_vma 294 1008 24 144 1 : tunables 120 60 8
            : slabdata 7 7 0
            pid 393 531 64 59 1 : tunables 120 60 8
            : slabdata 9 9 0
            shared_policy_node 0 0 48 77 1 : tunables 120 60 8
            : slabdata 0 0 0
            numa_policy 72 432 24 144 1 : tunables 120 60 8
            : slabdata 3 3 0
            size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0
            : slabdata 0 0 0
            size-131072 2 2 131072 1 32 : tunables 8 4 0
            : slabdata 2 2 0
            size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0
            : slabdata 0 0 0
            size-65536 6 6 65536 1 16 : tunables 8 4 0
            : slabdata 6 6 0
            size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0
            : slabdata 0 0 0
            size-32768 7 7 32768 1 8 : tunables 8 4 0
            : slabdata 7 7 0
            size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0
            : slabdata 0 0 0
            size-16384 2070 2070 16384 1 4 : tunables 8 4 0
            : slabdata 2070 2070 0
            size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0
            : slabdata 0 0 0
            size-8192 2026 2026 8192 1 2 : tunables 8 4 0
            : slabdata 2026 2026 0
            size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8
            : slabdata 0 0 0
            size-4096 911 911 4096 1 1 : tunables 24 12 8
            : slabdata 911 911 0
            size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8
            : slabdata 0 0 0
            size-2048 1080 1120 2048 2 1 : tunables 24 12 8
            : slabdata 560 560 83
            size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 8
            : slabdata 0 0 0
            size-1024 1429 1756 1024 4 1 : tunables 54 27 8
            : slabdata 439 439 83
            size-512(DMA) 0 0 512 8 1 : tunables 54 27 8
            : slabdata 0 0 0
            size-512 1607 2024 512 8 1 : tunables 54 27 8
            : slabdata 253 253 2
            size-256(DMA) 0 0 256 15 1 : tunables 120 60 8
            : slabdata 0 0 0
            size-256 3144 3495 256 15 1 : tunables 120 60 8
            : slabdata 233 233 0
            size-128(DMA) 0 0 128 30 1 : tunables 120 60 8
            : slabdata 0 0 0
            size-64(DMA) 0 0 64 59 1 : tunables 120 60 8
            : slabdata 0 0 0
            size-64 8683 22243 64 59 1 : tunables 120 60 8
            : slabdata 377 377 0
            size-32(DMA) 0 0 32 112 1 : tunables 120 60 8
            : slabdata 0 0 0
            size-128 3423 7410 128 30 1 : tunables 120 60 8
            : slabdata 247 247 1
            size-32 54883 59024 32 112 1 : tunables 120 60 8
            : slabdata 527 527 0
            kmem_cache 182 182 2688 1 1 : tunables 24 12 8
            : slabdata 182 182 0

          cat /proc/meminfo <before job begins>

          MemTotal: 16442916 kB
          MemFree: 15650204 kB
          Buffers: 200 kB
          Cached: 303428 kB
          SwapCached: 0 kB
          Active: 331180 kB
          Inactive: 206008 kB
          HighTotal: 0 kB
          HighFree: 0 kB
          LowTotal: 16442916 kB
          LowFree: 15650204 kB
          SwapTotal: 4225084 kB
          SwapFree: 4225084 kB
          Dirty: 0 kB
          Writeback: 0 kB
          AnonPages: 235168 kB
          Mapped: 8452 kB
          Slab: 77148 kB
          PageTables: 2740 kB
          NFS_Unstable: 0 kB
          Bounce: 0 kB
          CommitLimit: 12446540 kB
          Committed_AS: 857620 kB
          VmallocTotal: 34359738367 kB
          VmallocUsed: 80412 kB
          VmallocChunk: 34359657895 kB
          HugePages_Total: 0
          HugePages_Free: 0
          HugePages_Rsvd: 0
          Hugepagesize: 2048 kB

          cat /proc/meminfo <during the WRF run>
          MemTotal: 16442916 kB
          MemFree: 360168 kB
          Buffers: 160 kB
          Cached: 5678292 kB
          SwapCached: 2230028 kB
          Active: 8199640 kB
          Inactive: 6118380 kB
          HighTotal: 0 kB
          HighFree: 0 kB
          LowTotal: 16442916 kB
          LowFree: 360168 kB
          SwapTotal: 4225084 kB
          SwapFree: 1557472 kB
          Dirty: 9940 kB
          Writeback: 7072 kB
          AnonPages: 6545728 kB
          Mapped: 9612 kB
          Slab: 1349160 kB
          PageTables: 19704 kB
          NFS_Unstable: 0 kB
          Bounce: 0 kB
          CommitLimit: 12446540 kB
          Committed_AS: 9478760 kB
          VmallocTotal: 34359738367 kB
          VmallocUsed: 80412 kB
          VmallocChunk: 34359657895 kB
          HugePages_Total: 0
          HugePages_Free: 0
          HugePages_Rsvd: 0
          Hugepagesize: 2048 kB

          cat /proc/meminfo <after the WRF run>
          MemTotal: 16442916 kB
          MemFree: 14180204 kB
          Buffers: 208 kB
          Cached: 14928 kB
          SwapCached: 1788636 kB
          Active: 122928 kB
          Inactive: 1682064 kB
          HighTotal: 0 kB
          HighFree: 0 kB
          LowTotal: 16442916 kB
          LowFree: 14180204 kB
          SwapTotal: 4225084 kB
          SwapFree: 2213112 kB
          Dirty: 68 kB
          Writeback: 0 kB
          AnonPages: 26628 kB
          Mapped: 2668 kB
          Slab: 86572 kB
          PageTables: 672 kB
          NFS_Unstable: 0 kB
          Bounce: 0 kB
          CommitLimit: 12446540 kB
          Committed_AS: 279580 kB
          VmallocTotal: 34359738367 kB
          VmallocUsed: 80412 kB
          VmallocChunk: 34359657895 kB
          HugePages_Total: 0
          HugePages_Free: 0
          HugePages_Rsvd: 0
          Hugepagesize: 2048 kB


          cliffw Cliff White (Inactive) added a comment -

          Thanks, let us know how it goes.

          adizon Archie Dizon added a comment -

          Yes, we had tested installing 2.1.3 on a couple of our client systems to
          see if that would fix the problem, but we were still seeing the issue on
          those nodes with the Lustre 2.1.3 client installed. Thanks for
          clarifying that; this code does not appear to perform a great deal of
          readdirs, so it is probably not the same memory leak.

          Correct, dropping cache does not free the 1 GB of memory. Our epilogue
          script attempts to drop cache twice, and after the second time it compares
          the amount of free memory before determining if it can return the compute
          node to service.

          We are going to run the WRF job with Lustre at a higher logging level and
          using the leak_finder.pl script provided by WhamCloud. We will send
          whatever we find along to you.


          cliffw Cliff White (Inactive) added a comment -

          You indicated that you had installed 2.1.3, which contains the fix for LU-1576; that was our main indication. The LU-1576 fix mostly deals with readdir pages, so unless your workload includes a lot of readdirs, you likely have a different problem.

          Are you saying that dropping caches does not free the 1 GB of memory?
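          If it helps, one quick way to gauge how much readdir activity the workload actually generates on the client is to look at the llite stats around a run (a sketch; the exact counter names and clear syntax can vary between releases):

          # Clear the per-mount client stats, run the job, then look for readdir counts
          lctl set_param llite.*.stats=clear
          # ... run the WRF job ...
          lctl get_param llite.*.stats | grep -i readdir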

          adizon Archie Dizon added a comment -

          In regard to the question of waiting for a few minutes, the answer is no.
          Even if we wait for hours, the inactive memory is never given back to the
          system; we are forced to reboot these nodes to restore their full
          memory. However, as you can see from the output in my last message,
          we start off with > 6 GB of inactive memory at the beginning of the
          epilogue and ~1 GB of inactive memory after the epilogue has waited
          approximately 30 seconds. No matter how long we wait, that 1 GB
          of memory is never returned to the system.

          We had planned to set up a run of WRF to test the memory usage on our test
          cluster, but this has gotten delayed as all of us were busy during the
          week. We will have to wait until next week to get you some data on memory
          usage.

          Having talked with someone much more familiar with WRF and its dependencies
          than I am, it sounds like running the WRF software the way it is being
          run here may be a fairly big hassle. In other words, getting it
          running for you locally may be fairly difficult. We will have to see whether
          going down that road is necessary once we give you some more data.

          In the meantime, I'm curious as to how WhamCloud has determined that our
          problem does not match up with http://jira.whamcloud.com/browse/LU-1576.
          The symptoms are identical, and it was suggested in the HPDD discussion
          list that this was an occurrence in Lustre 2.1.2 for some irregular I/O
          patterns. What do they see as different between our problem and the one
          described by LLNL on the list? For my future reference, I would be
          interested to know how they determined that so I could use their methods
          for better diagnosing Lustre problems in the future.

          I'll have more to share with you next week.

          Thanks


          cliffw Cliff White (Inactive) added a comment -

          Can you update us on your status?


          People

            green Oleg Drokin
            adizon Archie Dizon
            Votes: 0
            Watchers: 9
