Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5134

Add option to lctl set_param for setting parameters in parallel

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • None

    Description

      lctl set_param should have an option to set a parameter across multiple matched files in parallel. For instance, if you execute this lctl set_param command:

      lctl set_param [parallel-option] ldlm.namespaces.*osc*.lru_size=clear
      

      it should write "clear" to the files matching the given parameter pattern in parallel.

      This enhancement is required to speed up clearing of Lustre caches. When there are many OSTs, executing

      lctl set_param ldlm.namespaces.*.lru_size=clear
      

      takes a long time, and there is no reason that the lru_size files can't be written to in parallel. Then the work can be done on each OST in parallel.

      For example, with 16 OSTs, it takes 5.4 seconds to clear caches across all namespaces. This could be sped up by parallelizing the write to lru_size across the namespaces.

      If this enhancement is added, then LU-3970 can also be resolved.

      Attachments

        1. test.sh
          6 kB
        2. test-output.txt
          7 kB

        Issue Links

          Activity

            [LU-5134] Add option to lctl set_param for setting parameters in parallel

            I've attached a quick test script which demonstrates the performance difference between parallel and serial set_param when canceling unused locks across many namespaces by writing to lru_size. It also includes other functional tests of lctl set_param -p that I used as I was developing. I'm hoping there already exist enough tests in sanity.sh and others that will verify the {set,get,list}_param functionality.

            I've also attached sample output from the test script showing the results of running it on a VM. In that sample output, a serial set_param took 4.175 seconds, while a parallel set_param took 0.401 seconds.

            haasken Ryan Haasken added a comment - I've attached a quick test script which demonstrates the performance difference between parallel and serial set_param when canceling unused locks across many namespaces by writing to lru_size. It also includes other functional tests of lctl set_param -p that I used as I was developing. I'm hoping there already exist enough tests in sanity.sh and others that will verify the {set,get,list}_param functionality. I've also attached sample output from the test script showing the results of running it on a VM. In that sample output, a serial set_param took 4.175 seconds, while a parallel set_param took 0.401 seconds.

            Yes much work is left to be done. I spent yesterday updating the patch for LU-5030. I think I know what you want (get_param and set_param) so I'm going to work that in.

            Test sanityn.sh 35 is another loop through the proc file system to gather import data much like the sanity 900 test. Thinking about it a really nice feature would to get_param with filters. So if you only get results back for a specific value.

            simmonsja James A Simmons added a comment - Yes much work is left to be done. I spent yesterday updating the patch for LU-5030 . I think I know what you want (get_param and set_param) so I'm going to work that in. Test sanityn.sh 35 is another loop through the proc file system to gather import data much like the sanity 900 test. Thinking about it a really nice feature would to get_param with filters. So if you only get results back for a specific value.

            I've figured out the answer to the question in my previous comment. That does appear to be the case based on Andreas' comment on LU-5030:

            Ideally, this would eventually result in usable llapi_get_param() and llapi_set_param() functions which can be used by lctl, lfs, and other applications that hide the details of the interface and the location of the files in /proc or /sys or /debugfs or whatever

            It looks like there is still a lot of work to be done there, and I'm not sure how to approach it yet.

            James, can you please explain the problem in sanityn.sh test_35? I don't see the problem there and how it relates to this enhancement.

            haasken Ryan Haasken added a comment - I've figured out the answer to the question in my previous comment. That does appear to be the case based on Andreas' comment on LU-5030 : Ideally, this would eventually result in usable llapi_get_param() and llapi_set_param() functions which can be used by lctl, lfs, and other applications that hide the details of the interface and the location of the files in /proc or /sys or /debugfs or whatever It looks like there is still a lot of work to be done there, and I'm not sure how to approach it yet. James, can you please explain the problem in sanityn.sh test_35? I don't see the problem there and how it relates to this enhancement.

            James, how does http://review.whamcloud.com/#/c/10300 relate to my change? Those changes are in get_param functions in liblustreapi.c, which, as far as I can tell, are not used by lctl. Is the idea to change lctl to start using the liblustreapi functions to do getting and setting of parameters?

            haasken Ryan Haasken added a comment - James, how does http://review.whamcloud.com/#/c/10300 relate to my change? Those changes are in get_param functions in liblustreapi.c, which, as far as I can tell, are not used by lctl. Is the idea to change lctl to start using the liblustreapi functions to do getting and setting of parameters?
            haasken Ryan Haasken added a comment -

            James, I didn't see your first comment before posting my patch. I'll take a look.

            haasken Ryan Haasken added a comment - James, I didn't see your first comment before posting my patch. I'll take a look.
            haasken Ryan Haasken added a comment -

            Here is a potential implementation of a parallel lctl set_param: http://review.whamcloud.com/#/c/10555/

            I am looking for feedback on that patch. Specifically, I am wondering if it's really necessary to check for HAVE_LIBPTHREAD everywhere since it makes the code a little messy. I tried to minimize the number of places where I checked "#ifdef HAVE_LIBPTHREAD", but it's still a little ugly. Is portability to platforms without libpthread still a concern?

            Second of all, I am wondering if it is really necessary to have an option to enable the parallel set_param, or it should just be parallel by default. Does anybody have any input?

            haasken Ryan Haasken added a comment - Here is a potential implementation of a parallel lctl set_param: http://review.whamcloud.com/#/c/10555/ I am looking for feedback on that patch. Specifically, I am wondering if it's really necessary to check for HAVE_LIBPTHREAD everywhere since it makes the code a little messy. I tried to minimize the number of places where I checked "#ifdef HAVE_LIBPTHREAD", but it's still a little ugly. Is portability to platforms without libpthread still a concern? Second of all, I am wondering if it is really necessary to have an option to enable the parallel set_param, or it should just be parallel by default. Does anybody have any input?
            simmonsja James A Simmons added a comment - - edited

            I was just discussing this issue on LU-5030. Besides lru_size we have the same issue with reading all imports. See sanityn test 35 for a example. This problem is blocking us from moving the test suite completely to using lctl [g/s]et_param. This would be a most useful piece of work. I recommend that you would work off of patch http://review.whamcloud.com/#/c/10300. I still need to fix the patch up but it at a good start.

            simmonsja James A Simmons added a comment - - edited I was just discussing this issue on LU-5030 . Besides lru_size we have the same issue with reading all imports. See sanityn test 35 for a example. This problem is blocking us from moving the test suite completely to using lctl [g/s] et_param. This would be a most useful piece of work. I recommend that you would work off of patch http://review.whamcloud.com/#/c/10300 . I still need to fix the patch up but it at a good start.

            People

              haasken Ryan Haasken
              haasken Ryan Haasken
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: