
Add option to lctl set_param for setting parameters in parallel

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • None

    Description

      lctl set_param should have an option to set a parameter across multiple matched files in parallel. For instance, if you execute this lctl set_param command:

      lctl set_param [parallel-option] ldlm.namespaces.*osc*.lru_size=clear
      

      it should write "clear" to the files matching the given parameter pattern in parallel.

      This enhancement is required to speed up clearing of Lustre caches. When there are many OSTs, executing

      lctl set_param ldlm.namespaces.*.lru_size=clear
      

      takes a long time, and there is no reason that the lru_size files can't be written in parallel, so that the work proceeds on every OST at the same time.

      For example, with 16 OSTs, it takes 5.4 seconds to clear caches across all namespaces. This could be sped up by parallelizing the write to lru_size across the namespaces.
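A minimal bash sketch of the idea, not the lctl implementation: write one value to a list of files concurrently and collect the exit status of every writer. The function name write_all_parallel is hypothetical, and the files are whatever the caller passes in; nothing here assumes where the lru_size files live.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write one value to many files concurrently.
# Usage: write_all_parallel VALUE FILE...
write_all_parallel() {
    local value=$1
    shift
    local pids=() pid rc=0 file

    for file in "$@"; do
        # Each write runs in its own background subshell.
        ( printf '%s\n' "$value" > "$file" ) &
        pids+=("$!")
    done

    # Wait for each writer individually so that no failure is lost.
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=$?
    done
    return "$rc"
}
```

The function returns the status of the last writer that failed, or 0 if every write succeeded.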

      If this enhancement is added, then LU-3970 can also be resolved.

      Attachments

        1. test.sh
          6 kB
        2. test-output.txt
          7 kB


          Activity

            [LU-5134] Add option to lctl set_param for setting parameters in parallel
            pjones Peter Jones added a comment -

            Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/10555/
            Subject: LU-5134 utils: Add parallel option to lctl set_param
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 345a2497d08f6b9afd74ed0188a70489f7a43e5d

            simmonsja James A Simmons added a comment -

            Perhaps you should expose the number of threads as an option for the user, -p 32 or something along those lines.

            haasken Ryan Haasken added a comment -

            The performance of lctl set_param -p seems to improve when I increase LCTL_THREADS_PER_CPU from 8 to 32. The timings are not very consistent between test runs for any of the methods, though.
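For illustration only, here is how a per-CPU multiplier could translate into a worker limit. This assumes the limit is the multiplier times the number of online CPUs, which matches the "threads per core" wording in this thread but is not taken from the patch itself. The helper name worker_limit is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a worker limit from a per-CPU multiplier.
# Usage: worker_limit [PER_CPU]   (PER_CPU defaults to 8)
worker_limit() {
    local per_cpu=${1:-8}
    local cpus

    # Ask the system for the number of online CPUs; fall back to 1.
    cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cpus=1
    [[ $cpus =~ ^[0-9]+$ ]] || cpus=1

    echo $(( per_cpu * cpus ))
}
```

Under that assumption, going from 8 to 32 per CPU quadruples the limit on any machine.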

            haasken Ryan Haasken added a comment -

            Per John's comment on the Gerrit change, I wrote a bash function to accomplish a set_param in parallel:

            function bash_cancel_locks() {
                local -a pids=()
                local pid
                local rc=0
                local wait_rc=0
                local namespaces
                local namespace

                # Quote the pattern so that lctl, not the shell, expands it.
                namespaces=$(lctl list_param 'ldlm.namespaces.*.lru_size')

                # Start one background lctl per namespace and record its PID.
                for namespace in $namespaces; do
                    lctl set_param "$namespace=clear" &
                    pids+=($!)
                done

                # Wait on each PID in turn; report the last non-zero status.
                for pid in "${pids[@]}"; do
                    wait "$pid"
                    wait_rc=$?
                    if [[ $wait_rc -ne 0 ]]; then
                        rc=$wait_rc
                    fi
                done
                return $rc
            }

            In my testing on a VM with many mounts, this function achieves about 50-100% of the performance of lctl set_param -p, but performance varies a lot. This is pretty good, but we would still like a single interface for setting general Lustre parameters (including lru_size) in parallel. lctl seems like the most appropriate place for this, although the C implementation is more complicated than the bash implementation above.

            I think the performance is so close because the bash implementation has no artificial limit on the number of subprocesses it spawns, while the lctl set_param -p implementation limits itself to 8 threads per core. I'm going to bump that up to 32 threads per core and see how it performs. I'd also like to do a performance comparison on real hardware with a file system that has many more OSTs.
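The bash function above starts one subprocess per namespace with no upper bound. As a rough sketch of a bounded variant, the helper below runs work in batches of at most MAX_JOBS. The name run_bounded and the batching strategy are assumptions made for illustration; this is not how lctl set_param -p is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "CMD ARG" once per ARG, at most MAX_JOBS at a time.
# Usage: run_bounded MAX_JOBS CMD ARG...
run_bounded() {
    local max_jobs=$1 cmd=$2
    shift 2
    local pids=() pid rc=0 arg

    for arg in "$@"; do
        "$cmd" "$arg" &
        pids+=("$!")
        # When the batch is full, drain it before starting more work.
        if (( ${#pids[@]} >= max_jobs )); then
            for pid in "${pids[@]}"; do
                wait "$pid" || rc=$?
            done
            pids=()
        fi
    done

    # Drain the final, possibly partial, batch.
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=$?
    done
    return "$rc"
}

# Possible use with the commands from this thread (clear_lru is a hypothetical wrapper):
#   clear_lru() { lctl set_param "$1=clear"; }
#   run_bounded 32 clear_lru $(lctl list_param 'ldlm.namespaces.*.lru_size')
```

Batching is simpler than keeping exactly MAX_JOBS running at all times, at the cost of idling while the slowest job in each batch finishes.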

            haasken Ryan Haasken added a comment -

            I've attached a quick test script which demonstrates the performance difference between parallel and serial set_param when canceling unused locks across many namespaces by writing to lru_size. It also includes other functional tests of lctl set_param -p that I used as I was developing. I'm hoping that sanity.sh and the other test scripts already contain enough tests to verify the {set,get,list}_param functionality.

            I've also attached sample output from the test script showing the results of running it on a VM. In that sample output, a serial set_param took 4.175 seconds, while a parallel set_param took 0.401 seconds, roughly a tenfold speedup.
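As a quick check on the two timings quoted from the sample output, the ratio can be computed directly. The speedup helper below is hypothetical and only does the division.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print SERIAL / PARALLEL to one decimal place.
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.1f\n", s / p }'
}

# Timings from the sample output: serial 4.175 s, parallel 0.401 s.
speedup 4.175 0.401   # prints 10.4
```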

            simmonsja James A Simmons added a comment -

            Yes, much work is left to be done. I spent yesterday updating the patch for LU-5030. I think I know what you want (get_param and set_param), so I'm going to work that in.

            Test 35 in sanityn.sh is another loop through the proc file system to gather import data, much like sanity test 900. Thinking about it, a really nice feature would be get_param with filters, so that you only get results back for a specific value.
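Until something like that exists, filtering by value can be approximated outside lctl. The sketch below assumes the input is one name=value pair per line and does not handle multi-line values. The function filter_by_value is hypothetical and is not an lctl feature.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read "name=value" lines on stdin and print the
# names whose value is exactly equal to the first argument.
filter_by_value() {
    awk -v want="$1" '{
        eq = index($0, "=")                      # position of the first "="
        if (eq > 0 && substr($0, eq + 1) == want)
            print substr($0, 1, eq - 1)          # everything before the "="
    }'
}

# Example with made-up parameter names:
printf '%s\n' 'ns1.lru_size=400' 'ns2.lru_size=0' 'ns3.lru_size=400' |
    filter_by_value 400
```

Splitting on the first "=" keeps values that themselves contain "=" intact.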

            haasken Ryan Haasken added a comment -

            I've figured out the answer to the question in my previous comment. That does appear to be the case, based on Andreas' comment on LU-5030:

            "Ideally, this would eventually result in usable llapi_get_param() and llapi_set_param() functions which can be used by lctl, lfs, and other applications that hide the details of the interface and the location of the files in /proc or /sys or /debugfs or whatever"

            It looks like there is still a lot of work to be done there, and I'm not sure how to approach it yet.

            James, can you please explain the problem in sanityn.sh test_35? I don't see the problem there or how it relates to this enhancement.
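To make "hiding the location of the files" concrete, here is a small bash sketch that resolves a dotted parameter name against a list of candidate root directories. It is only an illustration of the idea and not the llapi interface. The function find_param_path is hypothetical, the roots are supplied by the caller, and the dot-to-slash mapping is deliberately naive.

```shell
#!/usr/bin/env bash
# Hypothetical lookup: print the first existing path for a dotted
# parameter name under the given roots. Usage: find_param_path NAME ROOT...
find_param_path() {
    local name=$1 root rel
    shift
    rel=${name//.//}    # naive mapping: every "." becomes "/"

    for root in "$@"; do
        if [[ -e $root/$rel ]]; then
            printf '%s\n' "$root/$rel"
            return 0
        fi
    done
    return 1            # not found under any root
}
```

A caller then only needs the parameter name; which root holds the file stays an internal detail of the lookup.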

            haasken Ryan Haasken added a comment -

            James, how does http://review.whamcloud.com/#/c/10300 relate to my change? Those changes are in the get_param functions in liblustreapi.c, which, as far as I can tell, are not used by lctl. Is the idea to change lctl to use the liblustreapi functions for getting and setting parameters?

            haasken Ryan Haasken added a comment -

            James, I didn't see your first comment before posting my patch. I'll take a look.

            People

              haasken Ryan Haasken
              haasken Ryan Haasken
              Votes: 0
              Watchers: 8
