<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:33:48 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17240] change test-framework to format and mount targets in parallel </title>
                <link>https://jira.whamcloud.com/browse/LU-17240</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It would be useful for a number of reasons to change &lt;tt&gt;test-framework.sh&lt;/tt&gt; to format and mount the MDTs and OSTs in parallel (if not both MDTs and OSTs at the same time, then at least in two sets).&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;this would reduce testing time significantly for tests that reformat the filesystem (e.g. conf-sanity)&lt;/li&gt;
	&lt;li&gt;this would improve testing of the MGS to handle registering multiple targets in parallel (there are at least some known issues with this that could be found and fixed)&lt;/li&gt;
	&lt;li&gt;this would improve test coverage since filesystems are often mounted in parallel in production, and this would better simulate the real world&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="78648">LU-17240</key>
            <summary>change test-framework to format and mount targets in parallel </summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>medium</label>
                            <label>test_script_improvements</label>
                    </labels>
                <created>Sat, 28 Oct 2023 10:26:35 +0000</created>
                <updated>Wed, 20 Dec 2023 19:42:05 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="390972" author="paf0186" created="Sat, 28 Oct 2023 18:00:38 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=timday&quot; class=&quot;user-hover&quot; rel=&quot;timday&quot;&gt;timday&lt;/a&gt; - I&apos;m just living in hope here, but maybe this would be of interest to you?&#160; It would certainly make reloading a test node faster, which would be very nice.&lt;/p&gt;</comment>
                            <comment id="390975" author="paf0186" created="Sat, 28 Oct 2023 18:22:12 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;Would it make sense to put &quot;unmounting&quot;/Stopping targets in parallel here under this same ticket?&#160; That&apos;s closely related and also takes a while.&lt;/p&gt;</comment>
                            <comment id="390983" author="JIRAUSER18433" created="Sun, 29 Oct 2023 23:28:32 +0000"  >&lt;p&gt;I&apos;ve actually written some tests to do a bunch of parallel mounts, but that was client-side. It was to test out OBD device registration. I never got around to cleaning the test up and submitting it.&lt;/p&gt;

&lt;p&gt;While mounting targets in parallel would make testing faster, I&apos;m not sure if it would meaningfully improve test coverage. I haven&apos;t seen/heard of issues with mounting targets in parallel (even with 100s of OSS/OST). It would be useful if we could find a way to register a few hundred OSS/MDS in parallel. I think that would surface more bugs faster. I think it would go:&lt;/p&gt;

&lt;p&gt;1) Stop all clients, MDS, MDS, OSS&lt;/p&gt;

&lt;p&gt;2) Make a bunch of small temp disks in /tmp/ on each node&lt;/p&gt;

&lt;p&gt;3) Start a bunch of services using those disks, hope nothing explodes&lt;/p&gt;

&lt;p&gt;4) Clean up and restart services&lt;/p&gt;

&lt;p&gt;Andreas, could you link some of the known issues you mentioned (in the description) to this ticket? I&apos;m curious what people have seen go wrong.&lt;/p&gt;</comment>
                            <comment id="390993" author="adilger" created="Mon, 30 Oct 2023 04:46:15 +0000"  >&lt;p&gt;Patrick,&lt;br/&gt;
yes parallel unmounting would also be useful.  I think the formatting and mounting in parallel would be a bigger win.&lt;/p&gt;

&lt;p&gt;Tim,&lt;br/&gt;
I don&apos;t have any tickets with details on this, since most of the time it has happened in conjunction with some other issue that had a higher priority to fix. Basically, what I&apos;ve seen is problems when mounting multiple targets in parallel and registering them with the MGS for the first time. If there are problems during registration (after reformat or writeconf), the MGS thinks that it is registered but the OST does not, or similar. I think &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=shadow&quot; class=&quot;user-hover&quot; rel=&quot;shadow&quot;&gt;shadow&lt;/a&gt; has previously submitted a patch for this to allow the OST to retry the initial connection, but I couldn&apos;t find it. &lt;/p&gt;</comment>
                            <comment id="391009" author="shadow" created="Mon, 30 Oct 2023 08:26:37 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;I think it&apos;s Zam&apos;s patch. I just moved all of the handling into a single thread.&lt;br/&gt;
As for this ticket,&lt;br/&gt;
I don&apos;t think it&apos;s a large problem, since test targets are small and formatting is rare, except in conf-sanity.&lt;/p&gt;</comment>
                            <comment id="391135" author="adilger" created="Tue, 31 Oct 2023 03:27:10 +0000"  >&lt;p&gt;Tim, it looks like the patch I was thinking about is &lt;a href=&quot;https://review.whamcloud.com/44594&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/44594&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14928&quot; title=&quot;Allow MD target re-registered after writeconf&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14928&quot;&gt;&lt;del&gt;LU-14928&lt;/del&gt;&lt;/a&gt; mgs: allow md target re-register&lt;/tt&gt;&quot; and that has been landed since 2.14.55.   That at least addresses part of the issue, though I thought there was at least one more patch in this area about re-registering targets.&lt;/p&gt;</comment>
                            <comment id="391136" author="adilger" created="Tue, 31 Oct 2023 03:35:05 +0000"  >&lt;p&gt;The other ones are patch &lt;a href=&quot;https://review.whamcloud.com/45259&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45259&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15112&quot; title=&quot;attempt to register an OST with duplicated index should fail but it does not&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15112&quot;&gt;&lt;del&gt;LU-15112&lt;/del&gt;&lt;/a&gt; mgc: do not ignore target registration failure&lt;/tt&gt;&quot; and maybe patch &lt;a href=&quot;https://review.whamcloud.com/45871&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45871&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15112&quot; title=&quot;attempt to register an OST with duplicated index should fail but it does not&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15112&quot;&gt;&lt;del&gt;LU-15112&lt;/del&gt;&lt;/a&gt; ptlrpc: make rq_replied flag always correct&lt;/tt&gt;&quot; included in 2.14.57.&lt;/p&gt;</comment>
                            <comment id="397692" author="gerrit" created="Wed, 20 Dec 2023 19:42:05 +0000"  >&lt;p&gt;&quot;Timothy Day &amp;lt;timday@amazon.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/53518&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/53518&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17240&quot; title=&quot;change test-framework to format and mount targets in parallel &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17240&quot;&gt;LU-17240&lt;/a&gt; tests: format and mount targets in parallel&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e2ea758f0a11cef20621759746750ac92de418af&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="65602">LU-14928</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="24409">LU-4966</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03zwn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>