<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:00:30 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6470] SWL tests appear to wedge on mutex, clients are evicted</title>
                <link>https://jira.whamcloud.com/browse/LU-6470</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Running the SWL test on Hyperion, multiple clients time out and are eventually evicted due to lock timeouts. &lt;br/&gt;
Typical client stack:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;INFO: task ior:76875 blocked &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; more than 120 seconds.
      Not tainted 2.6.32-431.29.2.el6.x86_64 #1
&lt;span class=&quot;code-quote&quot;&gt;&quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot;&lt;/span&gt; disables &lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt; message.
ior           D 0000000000000000     0 76875  76869 0x00000000
 ffff88083f5ddd18 0000000000000082 0000000000000000 ffff8808341ab1d8
 ffff88083f5ddc88 ffffffff81227e9f ffff88083f5ddd68 ffffffff81199045
 ffff880871a9b058 ffff88083f5ddfd8 000000000000fbc8 ffff880871a9b058
Call Trace:
 [&amp;lt;ffffffff81227e9f&amp;gt;] ? security_inode_permission+0x1f/0x30
 [&amp;lt;ffffffff81199045&amp;gt;] ? __link_path_walk+0x145/0x1000
 [&amp;lt;ffffffff8152a5be&amp;gt;] __mutex_lock_slowpath+0x13e/0x180
 [&amp;lt;ffffffff8152a45b&amp;gt;] mutex_lock+0x2b/0x50
 [&amp;lt;ffffffff8119ba76&amp;gt;] do_filp_open+0x2d6/0xd20
 [&amp;lt;ffffffff811bd6b8&amp;gt;] ? do_statfs_native+0x98/0xb0
 [&amp;lt;ffffffff8128f83a&amp;gt;] ? strncpy_from_user+0x4a/0x90
 [&amp;lt;ffffffff811a8b82&amp;gt;] ? alloc_fd+0x92/0x160
 [&amp;lt;ffffffff81185be9&amp;gt;] do_sys_open+0x69/0x140
 [&amp;lt;ffffffff81185d00&amp;gt;] sys_open+0x20/0x30
 [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Server side:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;Apr 16 12:45:55 iws5 kernel: LustreError: 0:0:(ldlm_lockd.c:341:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 192.168.124.165@o2ib  ns: filter-lustre-OST0024_UUID lock: ffff8801eac5e740/0x9c91b8d7046afd8 lrc: 3/0,0 mode: PR/PR res: [0x1d8c6:0x0:0x0].0 rrc: 13 type: EXT [0-&amp;gt;18446744073709551615] (req 29796335616-&amp;gt;29930553343) flags: 0x60000000010020 nid: 192.168.124.165@o2ib remote: 0x42e28ecfaef2b33a expref: 6 pid: 109819 timeout: 4391206162 lvb_type: 0
Apr 16 13:45:23 iws3 kernel: LustreError: 0:0:(ldlm_lockd.c:341:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 192.168.124.165@o2ib  ns: filter-lustre-OST002d_UUID lock: ffff8805ccaf1180/0xc2e6f2e60c6a3a1f lrc: 3/0,0 mode: PR/PR res: [0x28183:0x0:0x0].0 rrc: 10 type: EXT [0-&amp;gt;18446744073709551615] (req 29527900160-&amp;gt;29662117887) flags: 0x60000000010020 nid: 192.168.124.165@o2ib remote: 0x42e28ecfaef2c102 expref: 5 pid: 23687 timeout: 4394807701 lvb_type: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Maybe related to DDN-56? This is easy to reproduce if more data is required. &lt;br/&gt;
I dumped the Lustre log from a client immediately after an eviction; the file is attached.&lt;/p&gt;</description>
                <environment>Hyperion, 2.7.52 tag - ldiskfs format 200 clients</environment>
        <key id="29533">LU-6470</key>
            <summary>SWL tests appear to wedge on mutex, clients are evicted</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="cliffw">Cliff White</reporter>
                        <labels>
                    </labels>
                <created>Thu, 16 Apr 2015 20:58:30 +0000</created>
                <updated>Sat, 9 Oct 2021 06:44:43 +0000</updated>
                            <resolved>Sat, 9 Oct 2021 06:44:43 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="112279" author="cliffw" created="Thu, 16 Apr 2015 21:02:46 +0000"  >&lt;p&gt;Dmesg from client after eviction. The client was evicted by multiple OSTs.&lt;/p&gt;</comment>
                            <comment id="112424" author="green" created="Mon, 20 Apr 2015 18:10:29 +0000"  >&lt;p&gt;The client backtraces indicate that the MDS is stuck doing something.&lt;br/&gt;
So it would be great to get the MDS side of the story from the MDS logs. Would that still be possible?&lt;/p&gt;</comment>
                            <comment id="112425" author="adilger" created="Mon, 20 Apr 2015 18:10:35 +0000"  >&lt;p&gt;Cliff, is this running ZFS on the MDT/OST or ldiskfs?  It wouldn&apos;t be surprising if there is a heavy metadata load on a ZFS MDT, but more surprising if it is ldiskfs.  Could you please fill in the FSTYPE and client count into the Environment.  Is there also racer/tar/dbench running on other clients during IOR?&lt;/p&gt;</comment>
                            <comment id="112429" author="adilger" created="Mon, 20 Apr 2015 18:14:38 +0000"  >&lt;p&gt;Also, getting the stack traces from the client would be useful, since even if a client thread is blocked waiting for the MDS (which is true from all the stack traces shown) it shouldn&apos;t prevent lock callbacks from the OSTs from being processed.&lt;/p&gt;</comment>
                            <comment id="112430" author="cliffw" created="Mon, 20 Apr 2015 18:18:21 +0000"  >&lt;p&gt;The setup has already been torn down, but if I get a repeat I will get client traces. The failure was on ldiskfs; I am currently testing with ZFS.&lt;/p&gt;</comment>
                            <comment id="113075" author="cliffw" created="Tue, 21 Apr 2015 20:25:49 +0000"  >&lt;p&gt;Ah, the stack trace and Lustre log dump from the client are already attached. The client is iwc115; see the two attachments.&lt;/p&gt;</comment>
                            <comment id="113078" author="cliffw" created="Tue, 21 Apr 2015 20:33:39 +0000"  >&lt;p&gt;I have re-checked the logs, and the only errors in that time period were from the OSS nodes; there were no MDS errors.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="17511" name="iwc115.dmesg.txt" size="15928" author="cliffw" created="Thu, 16 Apr 2015 21:02:46 +0000"/>
                            <attachment id="17510" name="iwc115.evict.log.txt.gz" size="3871098" author="cliffw" created="Thu, 16 Apr 2015 20:58:30 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxaxb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>