Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.10.0
    • Lustre 2.8.0
    • 3

    Description

      nrs_tbf_*_startup should free the hash table if further processing fails.
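
The cleanup this ticket asks for can be sketched in plain C. This is a minimal userspace illustration, not the Lustre patch: every name here (demo_head, demo_hash_create, demo_hash_destroy, demo_rule_start, demo_startup) is hypothetical, and the real code uses the libcfs hash API rather than malloc. It only shows the shape of the fix: once the hash table exists, every later failure path has to release it before returning.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a policy head and its hash table. */
struct demo_hash {
	size_t nbuckets;
	void **buckets;
};

struct demo_head {
	struct demo_hash *hash;
};

/* Number of hash tables currently allocated (for illustration only). */
static int demo_live_allocs;

static struct demo_hash *demo_hash_create(size_t nbuckets)
{
	struct demo_hash *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->buckets = calloc(nbuckets, sizeof(*h->buckets));
	if (h->buckets == NULL) {
		free(h);
		return NULL;
	}
	h->nbuckets = nbuckets;
	demo_live_allocs++;
	return h;
}

static void demo_hash_destroy(struct demo_hash *h)
{
	free(h->buckets);
	free(h);
	demo_live_allocs--;
}

/* Stand-in for the later startup step; fails when fail_rule is non-zero. */
static int demo_rule_start(struct demo_head *head, int fail_rule)
{
	(void)head;
	return fail_rule ? -EINVAL : 0;
}

/* Startup that releases the hash table when a later step fails. */
static int demo_startup(struct demo_head *head, int fail_rule)
{
	int rc;

	head->hash = demo_hash_create(16);
	if (head->hash == NULL)
		return -ENOMEM;

	rc = demo_rule_start(head, fail_rule);
	if (rc != 0)
		goto out_free_hash;

	return 0;

out_free_hash:
	/* Without this, the table allocated above would be leaked. */
	demo_hash_destroy(head->hash);
	head->hash = NULL;
	return rc;
}
```

With a forced failure, demo_startup(&head, 1) returns -EINVAL, sets head.hash back to NULL, and leaves demo_live_allocs at 0. On success the caller owns the table and is expected to destroy it when the policy stops.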

          Activity

            [LU-7441] Memory leak in nrs_tbf_*_startup

            jgmitter Joseph Gmitter (Inactive) added a comment -

            The issue here is resolved and landed for 2.10.0. The code cleanup patch has been moved to a separate ticket: LU-9750.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/17224/
            Subject: LU-7441 nrs: Free hash table if failed to start a nrs policy
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: cd362fa9186a3e4de34c7c68908e6d3d429bb087

            gerrit Gerrit Updater added a comment -

            Emoly Liu (emoly.liu@intel.com) uploaded a new patch: https://review.whamcloud.com/25319
            Subject: LU-7441 nrs: some code cleanup in NRS policies
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 740df75fe6952bada2b505a7bd751aff09a07e94

            lixi Li Xi (Inactive) added a comment -

            I don't think this is an NRS state-setting problem. If every function called through op_policy_start() frees all the memory it allocated when a failure happens, no memory will be leaked. I checked all the nrs_*_start functions and didn't find any problem.

            emoly.liu Emoly Liu added a comment -

            Lixi, thanks for your reproducer!

            As we discussed, this problem exists in the code of all NRS policies, not only TBF, so I think it's related to NRS state setting. Your patch does work, but we need to find the root cause and then fix it.

            lixi Li Xi (Inactive) added a comment -

            This problem is easy to reproduce with the following patch, which skips nrs_tbf_rule_start() and returns -EINVAL instead:

            memset(&start, 0, sizeof(start));
            start.tc_jobids_str = "*";

            start.tc_rpc_rate = tbf_rate;
            start.tc_rule_flags = NTRS_DEFAULT;
            start.tc_name = NRS_TBF_DEFAULT_RULE;
            INIT_LIST_HEAD(&start.tc_jobids);
            //rc = nrs_tbf_rule_start(policy, head, &start);
            rc = -EINVAL;
            return rc;

            [root@QYJ tests]# lctl set_param ost.OSS.ost_io.nrs_policies="tbf jobid"
            ost.OSS.ost_io.nrs_policies=tbf jobid
            error: set_param: setting /proc/fs/lustre/ost/OSS/ost_io/nrs_policies=tbf jobid: Invalid argument
            [root@QYJ tests]# sh llmountcleanup.sh
            Stopping clients: QYJ /mnt/lustre (opts:-f)
            Stopping client QYJ /mnt/lustre opts:-f
            Stopping clients: QYJ /mnt/lustre2 (opts:-f)
            Stopping /mnt/mds1 (opts:-f) on QYJ
            Stopping /mnt/ost1 (opts:-f) on QYJ
            Stopping /mnt/ost2 (opts:-f) on QYJ

            LNetError: 19587:0:(module.c:412:exit_libcfs_module()) Portals memory leaked: 131600 bytes
            Memory leaks detected

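The "memory leaked" line in the log above is the kind of report that allocation accounting makes possible: if every allocation and free updates a byte counter, a non-zero counter at shutdown means something was never freed. The following is a userspace sketch of that idea, combined with a forced startup failure like the one in the reproducer. The names (demo_alloc, demo_free, demo_exit_check, demo_faulty_startup) are hypothetical and are not libcfs functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Bytes handed out by demo_alloc() and not yet returned via demo_free(). */
static size_t demo_bytes_in_use;

static void *demo_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p != NULL)
		demo_bytes_in_use += size;
	return p;
}

static void demo_free(void *p, size_t size)
{
	if (p == NULL)
		return;
	free(p);
	demo_bytes_in_use -= size;
}

/* Shutdown check: report and return the number of bytes still accounted. */
static size_t demo_exit_check(void)
{
	if (demo_bytes_in_use != 0)
		fprintf(stderr, "memory leaked: %zu bytes\n",
			demo_bytes_in_use);
	return demo_bytes_in_use;
}

/*
 * Startup with an injected failure after the table has been allocated.
 * free_on_error selects the fixed behaviour (non-zero) or the leaky
 * behaviour (zero).
 */
static int demo_faulty_startup(size_t table_size, int free_on_error)
{
	void *table = demo_alloc(table_size);
	int rc;

	if (table == NULL)
		return -ENOMEM;

	rc = -EINVAL;	/* injected failure, in place of the real next step */
	if (free_on_error)
		demo_free(table, table_size);
	/* In the leaky case the pointer is dropped here on purpose. */
	return rc;
}
```

Calling demo_faulty_startup(4096, 1) followed by demo_exit_check() reports nothing, while demo_faulty_startup(4096, 0) leaves 4096 bytes accounted, which demo_exit_check() then reports.
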
            pjones Peter Jones added a comment -

            Emoly

            Could you please take care of this patch?

            Thanks

            Peter

            gerrit Gerrit Updater added a comment -

            Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/17224
            Subject: LU-7441 nrs: fix memory leak in nrs_tbf_*_startup
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: f1b173201d137a8acd924084d03a76d2bdcab0a1

            People

              emoly.liu Emoly Liu
              lixi Li Xi (Inactive)
              Votes: 0
              Watchers: 7