<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:40:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4148] Clients experiencing massive watchdogs in mdtest rmdir</title>
                <link>https://jira.whamcloud.com/browse/LU-4148</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Running mdtest, seeing a performance drop in rmdir. &lt;br/&gt;
All clients appear to be hitting watchdogs, example:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;INFO: task mdtest:7072 blocked &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; more than 120 seconds.
&lt;span class=&quot;code-quote&quot;&gt;&quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot;&lt;/span&gt; disables &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; message.
mdtest        D 0000000000000009     0  7072   7058 0x00000000
 ffff880870771e08 0000000000000082 ffff880871506aa0 ffff880871506aa0
 ffff880871506aa0 000000000000000b ffff880871506aa0 0000001081065d54
 ffff880871507058 ffff880870771fd8 000000000000fb88 ffff880871507058
Call Trace:
 [&amp;lt;ffffffff8118f541&amp;gt;] ? path_put+0x31/0x40
 [&amp;lt;ffffffff8150f78e&amp;gt;] __mutex_lock_slowpath+0x13e/0x180
 [&amp;lt;ffffffff8150f62b&amp;gt;] mutex_lock+0x2b/0x50
 [&amp;lt;ffffffff81192367&amp;gt;] do_rmdir+0xb7/0x120
 [&amp;lt;ffffffff8100c535&amp;gt;] ? math_state_restore+0x45/0x60
 [&amp;lt;ffffffff81192426&amp;gt;] sys_rmdir+0x16/0x20
 [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No errors on MDS&lt;/p&gt;</description>
                <environment>Hyperion/LLNL</environment>
        <key id="21651">LU-4148</key>
            <summary>Clients experiencing massive watchdogs in mdtest rmdir</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="cliffw">Cliff White</reporter>
                        <labels>
                    </labels>
                <created>Fri, 25 Oct 2013 21:04:18 +0000</created>
                <updated>Sat, 9 Oct 2021 06:18:08 +0000</updated>
                            <resolved>Sat, 9 Oct 2021 06:18:08 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="70016" author="green" created="Mon, 28 Oct 2013 16:04:58 +0000"  >&lt;p&gt;I guess in these cases it would be very helpful to have a list of all processes with stack traces, to see where this happens: on the client, or whether a server thread got wedged (less likely, I guess, since there are no errors on the MDTs).&lt;/p&gt;

&lt;p&gt;A crash dump from such a client might be helpful too. I assume there are no other errors in the client logs?&lt;/p&gt;</comment>
                            <comment id="70035" author="cliffw" created="Mon, 28 Oct 2013 17:30:29 +0000"  >&lt;p&gt;At this point I am only seeing the watchdogs. I will reproduce it again, maybe with fewer clients. The test does complete eventually; there are no errors causing test aborts.&lt;/p&gt;</comment>
                            <comment id="70070" author="green" created="Mon, 28 Oct 2013 19:24:34 +0000"  >&lt;p&gt;So the MDT is apparently just slow.&lt;br/&gt;
Is this happening on shared-directory delete?&lt;/p&gt;</comment>
                            <comment id="70085" author="pjones" created="Mon, 28 Oct 2013 23:22:37 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Oleg was wondering if this might be a side effect of this patch - &lt;a href=&quot;http://review.whamcloud.com/7257&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7257&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;What do you think? If not, do you have some other idea?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="70096" author="laisiyao" created="Tue, 29 Oct 2013 06:42:25 +0000"  >&lt;p&gt;The backtrace shows the process is waiting on the parent i_mutex in do_rmdir(). If this can be reproduced and we can see which process is holding this lock, it will help analyse the cause.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/7257&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7257&lt;/a&gt; doesn&apos;t look to be a direct cause of this slowness unless some process is constantly changing the parent directory&apos;s permissions.&lt;/p&gt;</comment>
                            <comment id="70486" author="cliffw" created="Fri, 1 Nov 2013 15:42:31 +0000"  >&lt;p&gt;Dump of all registers and stacks from a hung client&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="13726" name="iwc101.dump.txt" size="38966" author="cliffw" created="Fri, 1 Nov 2013 15:42:31 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw6wn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11263</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>