  Lustre / LU-1282

Lustre 2.1 client memory usage at mount is excessive


    Description

      The memory usage at mount time for Lustre 2.1 appears to be significantly worse than under 1.8. In particular, slab-8192 usage has grown considerably.

      On 1.8 clients, Lustre uses roughly 1GB of memory to mount four of our filesystems.

      On 2.1 clients, that has jumped to 5GB of memory to mount the same four filesystems.

      There are 3144 OSCs at this time.

      The memory clearly increases with each filesystem mounted, and drops again at each umount, so I suspect some new per-OSC memory usage or something along those lines; otherwise there would be more fallout.

      This is a significant loss of memory, and it means that our applications are now OOMing on the 2.1 clients. Many of the applications are tuned very specifically in their memory usage, and the loss of 4GB of memory per node is quite a problem.
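
      If the extra 4GB is attributed entirely to the 3144 OSCs, that works out to roughly 1.3MB per OSC, on the order of 165 slab-8192 objects each. A back-of-the-envelope sketch using only the figures above (an estimate, not measured data):

          /* Back-of-the-envelope estimate from the figures in this report. */
          #include <stdio.h>

          int main(void)
          {
                  double old_gb = 1.0;   /* approx. usage on 1.8 clients, four filesystems */
                  double new_gb = 5.0;   /* approx. usage on 2.1 clients, same filesystems */
                  int    oscs   = 3144;  /* OSC devices reported at mount time */

                  double extra_bytes = (new_gb - old_gb) * 1024 * 1024 * 1024;
                  double per_osc_kb  = extra_bytes / oscs / 1024;

                  printf("extra memory per OSC: ~%.0f KB (~%.0f slab-8192 objects)\n",
                         per_osc_kb, per_osc_kb / 8);
                  return 0;
          }
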

          Activity


            morrone Christopher Morrone (Inactive) added a comment -

            http://review.whamcloud.com/3026

            LU-1282 lprocfs: free stat memory as it is cleared

            Write "clear" into proc stats entries to free the stat memory
            occupation.

            Sorry, I missed this one. Can you explain to me how that is multi-thread safe? It looks to me like anyone using those memory allocations while this one thread is freeing them will cause a kernel crash.
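
            The concern above can be made concrete with a minimal userspace sketch. This is not the lprocfs code; the types, names, and locking are hypothetical, and the point is only that freeing a stats buffer on "clear" has to be synchronized (or deferred, e.g. RCU-style) against threads still updating it.

                /* Sketch of the race: an updater thread bumps a stats buffer while a
                 * "clear" handler frees it.  With both sides taking the same lock the
                 * free is safe; without it, the updater can dereference freed memory. */
                #include <pthread.h>
                #include <stdio.h>
                #include <stdlib.h>

                struct stats {
                        unsigned long count;
                };

                static struct stats *g_stats;
                static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

                /* Updater: what an I/O path would do on every request. */
                static void *updater(void *arg)
                {
                        (void)arg;
                        for (int i = 0; i < 100000; i++) {
                                pthread_mutex_lock(&g_lock);
                                if (g_stats)            /* may already be freed by "clear" */
                                        g_stats->count++;
                                pthread_mutex_unlock(&g_lock);
                        }
                        return NULL;
                }

                /* "clear" handler: frees the buffer under the same lock. */
                static void clear_stats(void)
                {
                        pthread_mutex_lock(&g_lock);
                        free(g_stats);
                        g_stats = NULL;
                        pthread_mutex_unlock(&g_lock);
                }

                int main(void)
                {
                        pthread_t t;

                        g_stats = calloc(1, sizeof(*g_stats));
                        pthread_create(&t, NULL, updater, NULL);
                        clear_stats();
                        pthread_join(t, NULL);
                        printf("done\n");
                        return 0;
                }
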
            bobijam Zhenyu Xu added a comment -

            b2_1 patch port tracking at http://review.whamcloud.com/3208

            pjones Peter Jones added a comment -

            ok I will reopen again if need be

            pjones Peter Jones added a comment -

            This latest patch has now been landed to master. Can this ticket now be closed?

            bobijam Zhenyu Xu added a comment -

            http://review.whamcloud.com/3026

            LU-1282 lprocfs: free stat memory as it is cleared

            Write "clear" into proc stats entries to free the stat memory
            occupation.

            pjones Peter Jones added a comment -

            ok Chris


            morrone Christopher Morrone (Inactive) added a comment -

            I think we need to reopen this. We have found that the work-around we are using, disabling the per-cpu stats structures on clients, limits client performance quite badly (down to 500MB/s or so). That is still probably better than using gigabytes of RAM at mount time for most of our users, but a better fix is needed.

            The fix that Whamcloud made reduces the static memory usage to closer to 1.8 levels on systems that over-report "possible cpus", but even those levels are quite excessive.

            I think we really do need to dynamically allocate and free all stats structures, freeing them when stats are zeroed (which we do after every job is run). Either that, or we need to switch to collecting only aggregate stats in a scalable fashion rather than trying to do it on a per-OSC basis.
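
            A purely illustrative sketch of that "allocate on first use, free when stats are zeroed" proposal, in plain C with made-up names and with the locking question raised earlier in this thread left out: per-CPU stat buffers start out NULL, are allocated only when first bumped, and are freed again when the stats are cleared, so idle OSCs cost almost nothing.

                /* Illustrative only (not Lustre code).  NR_SLOTS stands in for
                 * num_possible_cpus(); no synchronization is shown. */
                #include <stdio.h>
                #include <stdlib.h>

                #define NR_SLOTS 4

                struct percpu_stats {
                        unsigned long counters[16];
                };

                struct osc_stats {
                        struct percpu_stats *slot[NR_SLOTS];   /* NULL until first use */
                };

                static void stats_inc(struct osc_stats *s, int cpu, int idx)
                {
                        if (!s->slot[cpu])
                                s->slot[cpu] = calloc(1, sizeof(*s->slot[cpu]));  /* lazy */
                        if (s->slot[cpu])
                                s->slot[cpu]->counters[idx]++;
                }

                /* Zeroing the stats (done after every job, per the comment above)
                 * frees the per-CPU buffers instead of just memset()ing them. */
                static void stats_clear(struct osc_stats *s)
                {
                        for (int cpu = 0; cpu < NR_SLOTS; cpu++) {
                                free(s->slot[cpu]);
                                s->slot[cpu] = NULL;
                        }
                }

                int main(void)
                {
                        struct osc_stats s = { { NULL } };

                        stats_inc(&s, 0, 3);
                        stats_inc(&s, 0, 3);
                        printf("cpu0 counter 3 = %lu\n", s.slot[0]->counters[3]);
                        stats_clear(&s);
                        printf("cpu0 buffer after clear: %p\n", (void *)s.slot[0]);
                        return 0;
                }
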
            pjones Peter Jones added a comment -

            Landed for 2.1.2 and 2.3


            morrone Christopher Morrone (Inactive) added a comment -

            Yes, either http://review.whamcloud.com/#change,2451 or http://review.whamcloud.com/#change,2578.

            It would be nice to finally finish change 2451 and close this bug.
            pjones Peter Jones added a comment -

            By "this" do you mean this - http://review.whamcloud.com/#change,2451?


            morrone Christopher Morrone (Inactive) added a comment (edited) -

            Either this or Ned's patch should really land soon, since that other LU-1282 patch already landed. Anyone using that new tunable (e.g. us on Sequoia...) is going to hit client deadlocks until the fix is landed.

            People

              bobijam Zhenyu Xu
              morrone Christopher Morrone (Inactive)

              Votes: 0
              Watchers: 13