<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:48:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5134] Add option to lctl set_param for setting parameters in parallel</title>
                <link>https://jira.whamcloud.com/browse/LU-5134</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;lctl set_param should have an option to set a parameter across multiple matched files in parallel.  For instance, if you execute this lctl set_param command:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param [parallel-option] ldlm.namespaces.*osc*.lru_size=clear
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;it should write &quot;clear&quot; to the files matching the given parameter pattern in parallel.&lt;/p&gt;

&lt;p&gt;This enhancement is required to speed up clearing of Lustre caches.  When there are many OSTs, executing&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param ldlm.namespaces.*.lru_size=clear
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;takes a long time, and there is no reason that the lru_size files can&apos;t be written to in parallel.  Then the work can be done on each OST in parallel.&lt;/p&gt;

&lt;p&gt;For example, with 16 OSTs, it takes 5.4 seconds to clear caches across all namespaces.  This could be sped up by parallelizing the write to lru_size across the namespaces.&lt;/p&gt;

&lt;p&gt;If this enhancement is added, then &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3970&quot; title=&quot;Add procfs interface for clearing lustre caches in parallel&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3970&quot;&gt;&lt;del&gt;LU-3970&lt;/del&gt;&lt;/a&gt; can also be resolved.&lt;/p&gt;</description>
                <environment></environment>
        <key id="24991">LU-5134</key>
            <summary>Add option to lctl set_param for setting parameters in parallel</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="haasken">Ryan Haasken</assignee>
                                    <reporter username="haasken">Ryan Haasken</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Mon, 2 Jun 2014 16:15:33 +0000</created>
                <updated>Wed, 25 Oct 2023 19:50:16 +0000</updated>
                            <resolved>Wed, 25 Oct 2023 19:50:16 +0000</resolved>
                                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="85487" author="simmonsja" created="Mon, 2 Jun 2014 18:00:44 +0000"  >&lt;p&gt;I was just discussing this issue on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5030&quot; title=&quot;&amp;quot;lctl {get,set}_param&amp;quot; should also check in /sys/fs/{lnet,lustre}&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5030&quot;&gt;&lt;del&gt;LU-5030&lt;/del&gt;&lt;/a&gt;.  Besides lru_size we have the same issue with reading all imports. See sanityn test 35 for an example. This problem is blocking us from moving the test suite completely to using lctl &amp;#91;g/s&amp;#93;et_param. This would be a most useful piece of work. I recommend that you work off of patch &lt;a href=&quot;http://review.whamcloud.com/#/c/10300&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10300&lt;/a&gt;. I still need to fix the patch up, but it is a good start.&lt;/p&gt;</comment>
                            <comment id="85498" author="haasken" created="Mon, 2 Jun 2014 18:53:23 +0000"  >&lt;p&gt;Here is a potential implementation of a parallel lctl set_param:  &lt;a href=&quot;http://review.whamcloud.com/#/c/10555/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10555/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I am looking for feedback on that patch.  Specifically, I am wondering if it&apos;s really necessary to check for HAVE_LIBPTHREAD everywhere since it makes the code a little messy.  I tried to minimize the number of places where I checked &quot;#ifdef HAVE_LIBPTHREAD&quot;, but it&apos;s still a little ugly.  Is portability to platforms without libpthread still a concern?&lt;/p&gt;

&lt;p&gt;Second of all, I am wondering if it is really necessary to have an option to enable the parallel set_param, or it should just be parallel by default.  Does anybody have any input?&lt;/p&gt;</comment>
                            <comment id="85499" author="haasken" created="Mon, 2 Jun 2014 18:54:25 +0000"  >&lt;p&gt;James, I didn&apos;t see your first comment before posting my patch.  I&apos;ll take a look.&lt;/p&gt;</comment>
                            <comment id="85757" author="haasken" created="Wed, 4 Jun 2014 22:19:35 +0000"  >&lt;p&gt;James, how does &lt;a href=&quot;http://review.whamcloud.com/#/c/10300&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10300&lt;/a&gt; relate to my change?  Those changes are in get_param functions in liblustreapi.c, which, as far as I can tell, are not used by lctl.  Is the idea to change lctl to start using the liblustreapi functions to do getting and setting of parameters?&lt;/p&gt;</comment>
                            <comment id="85769" author="haasken" created="Wed, 4 Jun 2014 23:17:20 +0000"  >&lt;p&gt;I&apos;ve figured out the answer to the question in my previous comment.  That does appear to be the case based on Andreas&apos; comment on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5030&quot; title=&quot;&amp;quot;lctl {get,set}_param&amp;quot; should also check in /sys/fs/{lnet,lustre}&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5030&quot;&gt;&lt;del&gt;LU-5030&lt;/del&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt; Ideally, this would eventually result in usable llapi_get_param() and llapi_set_param() functions which can be used by lctl, lfs, and other applications that hide the details of the interface and the location of the files in /proc or /sys or /debugfs or whatever &lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;It looks like there is still a lot of work to be done there, and I&apos;m not sure how to approach it yet.&lt;/p&gt;

&lt;p&gt;James, can you please explain the problem in sanityn.sh test_35?  I don&apos;t see the problem there and how it relates to this enhancement.&lt;/p&gt;</comment>
                            <comment id="86030" author="simmonsja" created="Fri, 6 Jun 2014 16:47:36 +0000"  >&lt;p&gt;Yes much work is left to be done. I spent yesterday updating the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5030&quot; title=&quot;&amp;quot;lctl {get,set}_param&amp;quot; should also check in /sys/fs/{lnet,lustre}&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5030&quot;&gt;&lt;del&gt;LU-5030&lt;/del&gt;&lt;/a&gt;. I think I know what you want (get_param and set_param) so I&apos;m going to work that in.&lt;/p&gt;

&lt;p&gt;Test sanityn.sh 35 is another loop through the proc file system to gather import data, much like the sanity 900 test. Thinking about it, a really nice feature would be get_param with filters, so that you only get results back for a specific value.&lt;/p&gt;</comment>
                            <comment id="141279" author="haasken" created="Thu, 4 Feb 2016 23:43:08 +0000"  >&lt;p&gt;I&apos;ve attached a quick test script which demonstrates the performance difference between parallel and serial set_param when canceling unused locks across many namespaces by writing to lru_size.  It also includes other functional tests of &lt;tt&gt;lctl set_param -p&lt;/tt&gt; that I used as I was developing.  I&apos;m hoping there already exist enough tests in sanity.sh and others that will verify the {set,get,list}_param functionality.&lt;/p&gt;

&lt;p&gt;I&apos;ve also attached sample output from the test script showing the results of running it on a VM.  In that sample output, a serial set_param took 4.175 seconds, while a parallel set_param took 0.401 seconds.&lt;/p&gt;</comment>
                            <comment id="141566" author="haasken" created="Tue, 9 Feb 2016 00:08:28 +0000"  >&lt;p&gt;Per John&apos;s comment on the gerrit change, I wrote a bash function to accomplish a set_param in parallel:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;function bash_cancel_locks() {
    declare -a pids
    local pid_idx=0
    local rc=0
    local wait_rc=0
    local namespaces=$(lctl list_param ldlm.namespaces.*.lru_size)

    &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; namespace in $namespaces; &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt;
        lctl set_param $namespace=clear &amp;amp;
        pids[$pid_idx]=$!
        ((pid_idx++))
    done
    # Wait on every recorded PID (the pids array is indexed from 0).
    &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; pid in &quot;${pids[@]}&quot;; &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt;
        wait &quot;$pid&quot;
        wait_rc=$?
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; [[ $wait_rc -ne 0 ]]; then
            rc=$wait_rc
        fi
    done
    &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; $rc
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In my testing on a VM with many mounts, this function achieves about 50-100% of the performance of the &lt;tt&gt;lctl set_param -p&lt;/tt&gt;, but performance varies a lot.  This is pretty good, but we would still like a single interface to set general Lustre parameters (including lru_size) in parallel.  &lt;tt&gt;lctl&lt;/tt&gt; seems like the most appropriate place to do this, although the implementation in C is more complicated than the above implementation in bash.&lt;/p&gt;
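
&lt;p&gt;The &lt;tt&gt;bash_cancel_locks&lt;/tt&gt; function above starts one &lt;tt&gt;lctl&lt;/tt&gt; process per namespace with no upper bound. If a cap on concurrency is wanted in a shell-only approach, one possible sketch uses &lt;tt&gt;xargs -P&lt;/tt&gt;. This is an illustration only, not the implementation in the gerrit change: the function name &lt;tt&gt;parallel_set_param&lt;/tt&gt;, the &lt;tt&gt;LCTL&lt;/tt&gt; override variable, and the default of 8 concurrent processes are assumptions made for the example, and &lt;tt&gt;-P&lt;/tt&gt; is an extension found in GNU and BSD xargs rather than in POSIX.&lt;/p&gt;

```shell
#!/bin/bash
# Illustrative sketch (hypothetical helper, not part of lctl or the patch).
# Reads parameter names from stdin, one per line, and writes VALUE to each,
# running at most MAX_JOBS set_param processes at a time.
# Set LCTL to another command (e.g. LCTL=echo) for a dry run.
parallel_set_param() {
    local value=$1
    local max_jobs=${2:-8}   # assumed default, chosen only for the example

    # -I '{}' runs one command per input line; -P caps concurrent children.
    # xargs exits non-zero if any child fails.
    xargs -P "$max_jobs" -I '{}' "${LCTL:-lctl}" set_param "{}=$value"
}
```

&lt;p&gt;Assuming a mounted Lustre client, it could be driven with &lt;tt&gt;lctl list_param &apos;ldlm.namespaces.*.lru_size&apos; | parallel_set_param clear 8&lt;/tt&gt;. Unlike a thread pool inside &lt;tt&gt;lctl&lt;/tt&gt;, this still pays one process start per parameter.&lt;/p&gt;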

&lt;p&gt;I think the closeness in performance is due to the fact that the bash implementation has no artificial limit on the number of subprocesses it spawns, while the &lt;tt&gt;lctl set_param -p&lt;/tt&gt; implementation limits itself to 8 threads per core.  I&apos;m going to bump that up to 32 threads per core and see how it performs.  I&apos;d also like to do a performance comparison on real hardware with a file system with many more OSTs.&lt;/p&gt;</comment>
                            <comment id="141568" author="haasken" created="Tue, 9 Feb 2016 00:15:01 +0000"  >&lt;p&gt;The performance of the &lt;tt&gt;lctl set_param -p&lt;/tt&gt; seems to improve when I increase &lt;tt&gt;LCTL_THREADS_PER_CPU&lt;/tt&gt; from 8 to 32.  The times are not that consistent for any of the methods between runs of the tests though.&lt;/p&gt;</comment>
                            <comment id="141571" author="simmonsja" created="Tue, 9 Feb 2016 01:11:11 +0000"  >&lt;p&gt;Perhaps you should offer the number of threads as an option for the user, -p 32 or something along those lines.&lt;/p&gt;</comment>
                            <comment id="390574" author="gerrit" created="Wed, 25 Oct 2023 18:04:05 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/10555/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/10555/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5134&quot; title=&quot;Add option to lctl set_param for setting parameters in parallel&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5134&quot;&gt;&lt;del&gt;LU-5134&lt;/del&gt;&lt;/a&gt; utils: Add parallel option to lctl set_param&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 345a2497d08f6b9afd74ed0188a70489f7a43e5d&lt;/p&gt;</comment>
                            <comment id="390610" author="pjones" created="Wed, 25 Oct 2023 19:50:16 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="21007">LU-3970</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="20323" name="test-output.txt" size="7123" author="haasken" created="Thu, 4 Feb 2016 23:43:08 +0000"/>
                            <attachment id="20324" name="test.sh" size="6359" author="haasken" created="Thu, 4 Feb 2016 23:43:08 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Performance</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwnkf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14163</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>