Lustre / LU-1282

Lustre 2.1 client memory usage at mount is excessive


    Description

      The memory usage at mount time for Lustre 2.1 appears to be significantly worse than under 1.8. In particular, it looks like slab-8192 usage has grown significantly.

      On 1.8 clients, Lustre uses roughly 1GB of memory to mount four of our filesystems.

      On 2.1 clients, that has jumped to 5GB of memory to mount the same four filesystems.

      It looks like there are 3144 OSCs at this time.

      The memory clearly increases with each filesystem mounted, and drops again at each umount. I suspect we have some bad new per-OSC memory usage or something along those lines; otherwise there would be more fallout.

      But this is a pretty significant loss of memory, and it means that our applications are now OOMing on the 2.1 clients. Many of the applications are tuned very specifically in their memory usage, and the loss of 4GB of memory per node is quite a problem.
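
      As a rough sanity check of the scale involved (illustrative numbers only, not measurements from these nodes): the usage should scale roughly as devices x possible CPUs x per-CPU block size, and with the slab-8192 blocks noted above that already reaches gigabytes. A minimal sketch, where the CPU count is a hypothetical placeholder:

          #include <stdio.h>

          /* Rough, illustrative estimate only: the ticket reports ~3144 OSC
           * devices and growth in the slab-8192 cache, so each per-CPU stats
           * block is assumed here to round up to 8 KiB. */
          int main(void)
          {
                  long long num_devices   = 3144;  /* OSC devices seen on this client    */
                  long long possible_cpus = 64;    /* hypothetical; may be over-reported */
                  long long block_size    = 8192;  /* one slab-8192 object per CPU/device */

                  long long total_bytes = num_devices * possible_cpus * block_size;
                  printf("approx. per-client stats memory: %lld MiB\n",
                         total_bytes / (1024 * 1024));
                  return 0;
          }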

        Activity

            bobijam Zhenyu Xu added a comment - - edited

            http://review.whamcloud.com/3240

            LU-1282 lprocfs: disable some client percpu stats data

            • Clients collect OPC stats through their mgc/mdc/osc obd devices,
              so it is unnecessary to use percpu stats on the client side.
            • Revert the clear-stats patch committed in 8c831cb8; it is not
              multi-thread safe.
            • Protect changes to lprocfs_stats::ls_biggest_alloc_num.

            The major percpu data allocation is in ptlrpc_lprocfs_register(), which asks for (EXTRA_MAX_OPCODES + LUSTRE_MAX_OPCODES) items for each per-cpu block, i.e. 5 (PTLRPC ops) + 20 (OST ops) + 21 (MDS ops) + 6 (LDLM ops) + 6 (MGS ops) + 3 (OBD ops) + 9 (LLOG ops) + 3 (SEC ops) + 1 (SEQ ops) + 1 (FLD ops) = 75.

            This patch stops these stats from being allocated for each cpu on the client side, although that could affect client performance.

            I think we could also shrink lprocfs_counter by moving its common data into a separate lprocfs_counter_header structure, since that data is essentially the same across all cpus for each type of counter.
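
            A minimal sketch of that split, using approximate field names for illustration rather than the exact Lustre definitions:

                #include <stdint.h>

                /* Illustrative only: the read-mostly configuration would live in
                 * one header per counter type, shared by every CPU, while the
                 * per-CPU copies keep only the mutable totals. */
                struct lprocfs_counter_header {
                        unsigned int     lch_config;    /* counter type/flags   */
                        const char      *lch_name;      /* e.g. the opcode name */
                        const char      *lch_units;     /* e.g. "reqs", "usec"  */
                };

                struct lprocfs_counter {
                        uint64_t         lc_count;      /* number of events     */
                        uint64_t         lc_min;
                        uint64_t         lc_max;
                        uint64_t         lc_sum;
                        uint64_t         lc_sumsquare;
                };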

            bobijam Zhenyu Xu added a comment -

            Christopher,

            you are right, it's not multi-thread safe to free stat memory this way.


            morrone Christopher Morrone (Inactive) added a comment -

            b2_1 patch port tracking at http://review.whamcloud.com/3208

            Looks like the patch for b2_1 was already in progress here:

            http://review.whamcloud.com/2578


            morrone Christopher Morrone (Inactive) added a comment -

            http://review.whamcloud.com/3026

            LU-1282 lprocfs: free stat memory as it is cleared

            Write "clear" into proc stats entries to free the stat memory
            occupation.

            Sorry, I missed this one. Can you explain to me how that is multi-thread safe? It looks to me like anyone using those memory allocations while this one thread is freeing them will cause a kernel crash.
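
            For illustration only (hypothetical types, not Lustre code): zeroing the existing allocation in place keeps any pointer another thread holds valid, whereas freeing from the proc write handler leaves a concurrent updater touching freed memory:

                #include <stdint.h>
                #include <stdlib.h>
                #include <string.h>

                /* Hypothetical per-CPU counter block, sized after the 75 opcode
                 * slots mentioned elsewhere in this ticket. */
                struct stats_block {
                        uint64_t sb_counters[75];
                };

                /* Safe clear: zero in place; the allocation's lifetime never
                 * changes, so an updater that already holds the pointer still
                 * writes to valid memory. */
                static void stats_clear_in_place(struct stats_block *blk)
                {
                        memset(blk->sb_counters, 0, sizeof(blk->sb_counters));
                }

                /* Unsafe clear: freeing here races with any thread that loaded
                 * the pointer before the free and is about to bump a counter,
                 * which is the use-after-free / crash described above. */
                static void stats_clear_by_free(struct stats_block **blkp)
                {
                        free(*blkp);
                        *blkp = NULL;
                }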

            bobijam Zhenyu Xu added a comment -

            b2_1 patch port tracking at http://review.whamcloud.com/3208

            pjones Peter Jones added a comment -

            ok I will reopen again if need be

            pjones Peter Jones added a comment -

            This latest patch has now been landed to master. Can this ticket now be closed?

            bobijam Zhenyu Xu added a comment -

            http://review.whamcloud.com/3026

            LU-1282 lprocfs: free stat memory as it is cleared

            Write "clear" into proc stats entries to free the stat memory
            occupation.

            pjones Peter Jones added a comment -

            ok Chris


            morrone Christopher Morrone (Inactive) added a comment -

            I think we need to reopen this. The work-around that we are using to disable per-cpu stats structures on clients is, we have found, limiting client performance quite badly (down to 500MB/s or so). That is still probably better than using gigabytes of RAM at mount time for most of our users, but a better fix is needed.

            The fix that Whamcloud made brings the static memory usage closer to 1.8 levels on systems that over-report "possible cpus", but even those levels are quite excessive.

            I think we really do need to dynamically allocate and free (when stats are zeroed, which we do after every job is run) all stats structures. Either that or we need to switch to only collecting aggregate stats in a scalable fashion rather than trying to do it on a per-OSC basis.
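
            One way to read that, sketched with hypothetical names and without the locking or RCU a real in-kernel version would need: allocate a CPU's counter block only on first use and release it when the stats are cleared after a job:

                #include <stdint.h>
                #include <stdlib.h>

                #define NUM_COUNTERS 75         /* per-CPU opcode slots (see above) */

                /* Hypothetical lazily allocated per-CPU stats. */
                struct lazy_stats {
                        int        ls_num_cpus;
                        uint64_t **ls_percpu;   /* ls_percpu[cpu] stays NULL until used */
                };

                static void lazy_stats_add(struct lazy_stats *st, int cpu, int idx,
                                           uint64_t val)
                {
                        if (st->ls_percpu[cpu] == NULL)  /* allocate on first use only */
                                st->ls_percpu[cpu] = calloc(NUM_COUNTERS,
                                                            sizeof(uint64_t));
                        if (st->ls_percpu[cpu] != NULL)
                                st->ls_percpu[cpu][idx] += val;
                }

                /* Clearing (e.g. after every job) releases the memory rather than
                 * just zeroing it, so an idle mount holds almost nothing. */
                static void lazy_stats_clear(struct lazy_stats *st)
                {
                        for (int cpu = 0; cpu < st->ls_num_cpus; cpu++) {
                                free(st->ls_percpu[cpu]);
                                st->ls_percpu[cpu] = NULL;
                        }
                }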

            pjones Peter Jones added a comment -

            Landed for 2.1.2 and 2.3


            People

              bobijam Zhenyu Xu
              morrone Christopher Morrone (Inactive)