<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:35:07 UTC 2024
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3579] Performance regression after applying fix for LU-1397</title>
                <link>https://jira.whamcloud.com/browse/LU-3579</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;After applying the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1397&quot; title=&quot;ENOENT on open()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1397&quot;&gt;&lt;del&gt;LU-1397&lt;/del&gt;&lt;/a&gt;, Fujitsu has seen performance for their benchmark suite decrease by about 7%. This is pushing the system outside of the acceptance range, and is therefore a high priority for us. I am trying to get information about exactly what is running more slowly (which application or file operations), but would it be possible for someone to review the patch to see if there are any areas that could be improved?&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</description>
                <environment></environment>
        <key id="19782">LU-3579</key>
            <summary>Performance regression after applying fix for LU-1397</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="manish">Manish Patel</reporter>
                        <labels>
                    </labels>
                <created>Thu, 11 Jul 2013 18:09:21 +0000</created>
                <updated>Fri, 14 Nov 2014 14:48:45 +0000</updated>
                            <resolved>Fri, 14 Nov 2014 14:48:45 +0000</resolved>
                                    <version>Lustre 2.1.6</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="62143" author="pjones" created="Thu, 11 Jul 2013 18:28:26 +0000"  >&lt;p&gt;Can we take a step back here? What motivated applying the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1397&quot; title=&quot;ENOENT on open()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1397&quot;&gt;&lt;del&gt;LU-1397&lt;/del&gt;&lt;/a&gt; in the first place? Reading the notes relating to that, it was deemed unnecessary and abandoned. Are any other patches applied or is this otherwise a vanilla 2.1.6 deployment across the board?&lt;/p&gt;</comment>
                            <comment id="62150" author="kitwestneat" created="Thu, 11 Jul 2013 18:59:43 +0000"  >&lt;p&gt;Well the patch we applied was actually part of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt;. I&apos;m rereading the bugs and perhaps I misread it originally. Basically the issue is that the applications are getting ENOENT errors on 75% of the runs. I thought that the patch in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt; fixed the ENOENTs in 1397, but rereading it, it sounds like it was actually introduced by an earlier version of the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt; patch? It&apos;s interesting to note however that we didn&apos;t run into the ENOENT issue after applying it, though it could have been luck.&lt;/p&gt;

&lt;p&gt;Oh, I just realized I didn&apos;t give the version history in my initial post. They are running 2.1.3 servers and were running 2.1.3 clients. After running into the ENOENT issue, we upgraded them to 2.1.6 clients, and that&apos;s when we first saw the regression. We then built a version of 2.1.3 with only this patch:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/2400/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/2400/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The regression was still present in that version as well. We are rerunning the test with 2.1.3 clients just to confirm that nothing else has changed and that the regression is definitely in the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt; patch.&lt;/p&gt;</comment>
                            <comment id="62289" author="kitwestneat" created="Mon, 15 Jul 2013 14:39:20 +0000"  >&lt;p&gt;Going back to the original ENOENT issue: it looks like files are not being created at all, or if they are being created, they are then completely disappearing. Here is the description from the customer:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It appears data is going missing during a Quantum Espresso job run:&lt;/p&gt;

&lt;p&gt;An example of the output from a Quantum Espresso failure is attached.&lt;/p&gt;

&lt;p&gt;Whilst the QuantumEspresso application runs it opens a scratch file for each MPI process. Occasionally when it tries to reopen one of these files it&#8217;s not there and it crashes with an error message like:&lt;/p&gt;

&lt;p&gt;    Error in routine seqopn (16):&lt;br/&gt;
    error opening ./ausurf.igk2&lt;/p&gt;

&lt;p&gt;All input, output files for the run are in:&lt;br/&gt;
/scratch/nick.wilson/parallel_benchmarks/AUSURF112.19274.1&lt;/p&gt;

&lt;p&gt;We believe this to be a Lustre issue as when it&#8217;s run against local disk for scratch space there are no failures, only when we use Lustre for scratch space does this happen. &lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Do you have any opinions on what debug settings to run in order to get more information? I was thinking that rpctrace might be the best to start with. Any advice on it?&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</comment>
                            <comment id="62326" author="green" created="Mon, 15 Jul 2013 20:56:05 +0000"  >&lt;p&gt;If the initial open-create succeeded, the file definitely was created and opened.&lt;br/&gt;
Now, if it&apos;s not there later, it might simply have been deleted. Can you monitor delete stats on the MDT to confirm this?&lt;/p&gt;

&lt;p&gt;You can also enable rpctrace, but that will only show you which RPCs were sent, not file names or similar details.&lt;br/&gt;
On a client you also need to enable vfstrace to see the file names. On the MDT you need to enable the &quot;inode&quot; tracer to see those details, but I imagine that would slow it down quite a bit.&lt;/p&gt;</comment>
                            <comment id="99137" author="manish" created="Fri, 14 Nov 2014 04:11:40 +0000"  >&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;This ticket can be closed, since we do not have any more information in this case.&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Manish&lt;/p&gt;</comment>
                            <comment id="99167" author="pjones" created="Fri, 14 Nov 2014 14:48:45 +0000"  >&lt;p&gt;ok thanks Manish&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="14395">LU-1397</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvv6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9057</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>