<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:22:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15941] sanity test_398b: timeouts with ZFS </title>
                <link>https://jira.whamcloud.com/browse/LU-15941</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for Alex Zhuravlev &amp;lt;bzzz@whamcloud.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/097393a8-5380-4d65-af83-5d44c963ce88&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/097393a8-5380-4d65-af83-5d44c963ce88&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;test_398b failed with the following error:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Timeout occurred after 326 minutes, last suite running was sanity
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;



&lt;p&gt;This started on June 10, after the recent landing wave.&lt;br/&gt;
I suspect two patches:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;c45b8a92a3 2022-05-11 | &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15583&quot; title=&quot;Update ZFS version to 2.1.2&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15583&quot;&gt;&lt;del&gt;LU-15583&lt;/del&gt;&lt;/a&gt; build: Update ZFS version to 2.1.2 [Jian Yu]&lt;/li&gt;
	&lt;li&gt;b4880f3758 2021-07-15 | &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15483&quot; title=&quot;Minor DIO test improvements&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15483&quot;&gt;LU-15483&lt;/a&gt; tests: Improve test 398b [Patrick Farrell]&lt;/li&gt;
&lt;/ul&gt;







&lt;p&gt;VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV&lt;br/&gt;
sanity test_398b - Timeout occurred after 326 minutes, last suite running was sanity&lt;/p&gt;</description>
                <environment></environment>
        <key id="70734">LU-15941</key>
            <summary>sanity test_398b: timeouts with ZFS </summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Tue, 14 Jun 2022 06:17:41 +0000</created>
                <updated>Wed, 16 Aug 2023 12:03:38 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="339453" author="adilger" created="Fri, 1 Jul 2022 18:31:47 +0000"  >&lt;p&gt;+1 on master: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/34d33d79-d068-4eeb-8990-9c8d06669a01&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/34d33d79-d068-4eeb-8990-9c8d06669a01&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is reporting pathetic IOPS of 1, under 8 KB/s. I guess that is contention on the single HDD on the host, also caused by read-modify-write of the larger ZFS blocks? Alex, do you think your blocksize patch &lt;a href=&quot;https://review.whamcloud.com/47768&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47768&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15963&quot; title=&quot;sanityn test_56b: OSS OOM with ZFS&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15963&quot;&gt;LU-15963&lt;/a&gt; osd: use contiguous chunk to grow blocksize&lt;/tt&gt;&quot; might help this?&lt;/p&gt;</comment>
                            <comment id="344516" author="bzzz" created="Wed, 24 Aug 2022 15:09:45 +0000"  >&lt;p&gt;I profiled 398b: dt_trans_stop() in ofd_commitrw_write() takes 50 usec with ldiskfs and 512831 usec with ZFS on average.&lt;br/&gt;
The majority of OST_WRITE requests were missing OBD_BRW_ASYNC, which is why dt_trans_stop() was taking that long.&lt;br/&gt;
Changing the max blocksize doesn&apos;t improve the situation significantly, locally at least, but I&apos;m going to try that with AT.&lt;/p&gt;
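&lt;p&gt;As a rough sanity check on the figures above, the sketch below converts the two average dt_trans_stop() times into a slowdown factor and an upper bound on sync writes per second for a single thread. It assumes the ZFS figure is also in usec and ignores every other cost in the write path; the constant and function names are made up for illustration.&lt;/p&gt;

```python
# Average dt_trans_stop() times quoted above.
# Assumption: both values are in microseconds.
LDISKFS_TRANS_STOP_USEC = 50
ZFS_TRANS_STOP_USEC = 512831


def slowdown(zfs_usec, ldiskfs_usec):
    # How many times longer the ZFS transaction stop takes.
    return zfs_usec / ldiskfs_usec


def sync_writes_per_second(trans_stop_usec):
    # Upper bound for one thread issuing one sync write at a time,
    # counting only the time spent in dt_trans_stop().
    return 1_000_000 / trans_stop_usec


if __name__ == "__main__":
    print("slowdown:", round(slowdown(ZFS_TRANS_STOP_USEC, LDISKFS_TRANS_STOP_USEC)))
    print("ldiskfs writes/s:", round(sync_writes_per_second(LDISKFS_TRANS_STOP_USEC)))
    print("ZFS writes/s:", round(sync_writes_per_second(ZFS_TRANS_STOP_USEC), 2))
```

&lt;p&gt;Under those assumptions the ZFS bound comes out at roughly two sync writes per second per thread, the same order of magnitude as the IOPS seen in the failing runs.&lt;/p&gt;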
</comment>
                            <comment id="356449" author="nangelinas" created="Wed, 14 Dec 2022 16:59:16 +0000"  >&lt;p&gt;+1 on master: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/b7ffdf4c-d214-427b-95e0-379d8c837267&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/b7ffdf4c-d214-427b-95e0-379d8c837267&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="359270" author="nangelinas" created="Tue, 17 Jan 2023 10:15:40 +0000"  >&lt;p&gt;+1 on master: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/cff8066f-91ae-4805-a4e2-ce35545e5bfe&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/cff8066f-91ae-4805-a4e2-ce35545e5bfe&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="368199" author="paf0186" created="Mon, 3 Apr 2023 15:49:25 +0000"  >&lt;p&gt;Alex,&lt;/p&gt;

&lt;p&gt;They are missing &apos;ASYNC&apos; because they should be missing it: this is direct IO, which expects the server to do a sync each time.&#160; This means DIO performance on ZFS is absolutely terrible.&#160; And I think we can&apos;t fix it except by fixing our sync behavior on ZFS, which I understand is a huge project&lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/help_16.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;.&#160; So... yeah.&lt;/p&gt;</comment>
                            <comment id="368201" author="adilger" created="Mon, 3 Apr 2023 16:09:57 +0000"  >&lt;p&gt;IIRC, there are two significant performance issues with ZFS sync writes:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;one of course is the fact that transaction commit has a lot of overhead (4x &#252;berblock sync writes per device, full Merkle tree flush each time)&lt;/li&gt;
	&lt;li&gt;the other is that calling &quot;sync&quot; on ZFS does not actually &lt;b&gt;trigger&lt;/b&gt; a transaction commit, it just waits for one to happen by itself. That is why, on average, Alex is reporting a 0.5s commit time: the commit interval is 1s, and a sync request arrives, on average, halfway through that interval.&lt;/li&gt;
&lt;/ul&gt;
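
&lt;p&gt;The 0.5s average in the second point can be reproduced with a small model. This is only a minimal sketch, assuming sync requests arrive at evenly spread moments within the commit interval and never trigger a commit themselves; mean_sync_wait is a hypothetical helper, not a ZFS or Lustre function.&lt;/p&gt;

```python
def mean_sync_wait(commit_interval_s, samples=1000):
    # Average time a sync request waits for the next periodic commit.
    # Assumption: arrivals are evenly spread over the commit interval
    # and the wait ends only when the timer-driven commit happens.
    total = 0.0
    for k in range(samples):
        phase = commit_interval_s * k / samples  # time since the last commit
        total += commit_interval_s - phase       # time until the next commit
    return total / samples


if __name__ == "__main__":
    # 1s is the interval discussed above; 0.1s matches the 10/s HDD idea.
    for interval in (1.0, 0.1):
        print(interval, mean_sync_wait(interval))
```

&lt;p&gt;In this model the mean wait is about half the commit interval, so about 0.5s for a 1s interval and about 0.05s if commits ran 10 times per second.&lt;/p&gt;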


&lt;p&gt;For flash devices it would still be possible to commit thousands of times per second, and for HDD devices maybe 10/s, instead of 1/s. This would of course increase load on the storage and CPUs, but what else are they for, and why should both the clients and servers be waiting idle for the 1s ZFS transaction commit?&lt;/p&gt;</comment>
                            <comment id="382647" author="nangelinas" created="Wed, 16 Aug 2023 12:03:38 +0000"  >&lt;p&gt;+1 on master: &lt;a href=&quot;https://testing.whamcloud.com/test_sets/2dda2437-8b99-4e36-b99d-f769947b2f6b&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/2dda2437-8b99-4e36-b99d-f769947b2f6b&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="70842">LU-15963</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02s27:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>