<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:17:09 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1496] Client evicted frequently due to lock callback timer expiration</title>
                <link>https://jira.whamcloud.com/browse/LU-1496</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Our customer is seeing clients evicted relatively frequently due to lock callback timer expiration.  &lt;br/&gt;
The evicted client is not always the same, but evictions occurred 3 times on Jun 3rd.  The customer checked &lt;br/&gt;
the network and found no errors reported.  &lt;/p&gt;

&lt;p&gt;&amp;lt;&amp;lt; OSS &amp;gt;&amp;gt; &lt;br/&gt;
2012/06/03 17:32:06 kern.err@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;261289.824573&amp;#93;&lt;/span&gt; LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 172.17.11.9@o2ib  ns: filter-data-OST0007_UUID lock: ffff8100debd3000/0x2babc16b9ceef3a2 lrc: 3/0,0 mode: PW/PW res: 345385/0 rrc: 8 type: EXT &lt;span class=&quot;error&quot;&gt;&amp;#91;32768-&amp;gt;159743&amp;#93;&lt;/span&gt; (req 32768-&amp;gt;36863) flags: 0x20 remote: 0xd4d1e1a63a1900b8 expref: 14 pid: 2520 timeout 4556489496&lt;br/&gt;
2012/06/03 17:32:08 kern.err@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;261292.335250&amp;#93;&lt;/span&gt; LustreError: 26916:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@ processing error (-107)  req@ffff81049eabb000 x1403470492409861/t0 o13-&amp;gt;&amp;lt;?&amp;gt;@&amp;lt;?&amp;gt;:0/0 lens 192/0 e 0 to 0 dl 1338712334 ref 1 fl Interpret:/0/0 rc -107/0&lt;br/&gt;
2012/06/03 17:32:22 kern.err@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;261306.299203&amp;#93;&lt;/span&gt; LustreError: 2435:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@ processing error (-114)  req@ffff8102ac2e3c00 x1403470492409878/t0 o8-&amp;gt;&amp;lt;?&amp;gt;@&amp;lt;?&amp;gt;:0/0 lens 368/264 e 0 to 0 dl 1338712442 ref 1 fl Interpret:/0/0 rc -114/0&lt;/p&gt;

&lt;p&gt;&amp;lt;&amp;lt; client &amp;gt;&amp;gt; &lt;br/&gt;
2012/06/03 17:32:08 kern.err@cnode009 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: kernel: LustreError: 11-0: an error occurred while communicating with 172.17.13.36@o2ib. The ost_statfs operation failed with -107&lt;br/&gt;
2012/06/03 17:32:08 kern.warning@cnode009 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: kernel: Lustre: data-OST0007-osc-ffff81063bc60800: Connection to service data-OST0007 via nid 172.17.13.36@o2ib was lost; in progress operations using this service will wait for recovery to complete.&lt;br/&gt;
2012/06/03 17:32:14 kern.warning@cnode009 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: kernel: Lustre: 3960:0:(client.c:1482:ptlrpc_expire_one_request()) @@@ Request x1403470492409862 sent from data-OST0007-osc-ffff81063bc60800 to NID 172.17.13.36@o2ib 6s ago has timed out (6s prior to deadline).&lt;br/&gt;
2012/06/03 17:32:14 kern.warning@cnode009 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: kernel:   req@ffff810256624800 x1403470492409862/t0 o8-&amp;gt;data-OST0007_UUID@172.17.13.36@o2ib:28/4 lens 368/584 e 0 to 1 dl 1338712334 ref 1 fl Rpc:N/0/0 rc 0/0&lt;/p&gt;
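
&lt;p&gt;When several OSS logs have to be scanned for evictions like the one above, a small parser can pull out the wait time, the evicted client and the lock namespace. The following is a minimal sketch, not part of the original report: the pattern is inferred only from the message text quoted in this ticket, and the names EVICTION_RE and parse_eviction are hypothetical.&lt;/p&gt;

```python
import re

# Pattern inferred from the eviction message quoted in this ticket (an assumption,
# not an official log format). Groups: 1 = seconds waited, 2 = client NID,
# 3 = lock namespace.
EVICTION_RE = re.compile(
    r"lock callback timer expired after (\d+)s: "
    r"evicting client at (\S+)\s+ns: (\S+)"
)


def parse_eviction(line):
    """Return (seconds, client_nid, namespace), or None if the line does not match."""
    match = EVICTION_RE.search(line)
    if match is None:
        return None
    seconds, nid, namespace = match.groups()
    return int(seconds), nid, namespace


# First OSS message from the description, with the HTML markup removed.
sample = (
    "LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback "
    "timer expired after 101s: evicting client at 172.17.11.9@o2ib  ns: "
    "filter-data-OST0007_UUID lock: ffff8100debd3000/0x2babc16b9ceef3a2"
)
print(parse_eviction(sample))
```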
</description>
                <environment></environment>
        <key id="14807">LU-1496</key>
            <summary>Client evicted frequently due to lock callback timer expiration</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="mnishizawa">Mitsuhiro Nishizawa</reporter>
                        <labels>
                    </labels>
                <created>Fri, 8 Jun 2012 06:29:39 +0000</created>
                <updated>Tue, 8 Oct 2013 22:34:43 +0000</updated>
                            <resolved>Mon, 27 Aug 2012 15:53:36 +0000</resolved>
                                    <version>Lustre 1.8.x (1.8.0 - 1.8.5)</version>
                                    <fixVersion>Lustre 1.8.9</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="40261" author="mnishizawa" created="Fri, 8 Jun 2012 06:32:39 +0000"  >&lt;p&gt;Uploaded log file to the FTP server: /uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;/20120604.tar.gz&lt;/p&gt;

&lt;p&gt;Thanks, &lt;/p&gt;</comment>
                            <comment id="40282" author="pjones" created="Fri, 8 Jun 2012 13:06:37 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please advise on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="40324" author="bobijam" created="Sun, 10 Jun 2012 23:33:17 +0000"  >&lt;p&gt;From the uploaded logs, it looks like the OST0007 node had some congestion, which caused the client evictions. Can you check it (disk, network interface, etc.) and upload its Lustre debug logs here?&lt;/p&gt;</comment>
                            <comment id="40551" author="mnishizawa" created="Thu, 14 Jun 2012 05:21:02 +0000"  >&lt;p&gt;We asked the customer to check the network and to capture a debug log the next time an eviction occurs.  &lt;br/&gt;
We will upload it once we receive it.  Thanks, &lt;/p&gt;</comment>
                            <comment id="40889" author="mnishizawa" created="Tue, 19 Jun 2012 22:11:48 +0000"  >&lt;p&gt;The issue occurred again, and we captured a debug log.  We uploaded it to the FTP server: /uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;_20120620.tar.gz&lt;br/&gt;
$ tar tzf &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;_20120620.tar.gz &lt;br/&gt;
debug_kernel_20120619_235205 ... debug log on the OSS&lt;br/&gt;
debug_kernel_cnode050        ... debug log on the evicted client&lt;br/&gt;
log-20120620/&lt;br/&gt;
log-20120620/cnode050_kernel.txt ... messages on the evicted client&lt;br/&gt;
log-20120620/oss4_kernel.txt ... messages on the OSS&lt;/p&gt;

&lt;p&gt;2012/06/19 23:52:05 kern.err@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1662880.375982&amp;#93;&lt;/span&gt; LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 172.17.11.50@o2ib  ns: filter-data-OST0007_UUID lock: ffff8101a5e2fa00/0x2babc16ba243f737 lrc: 3/0,0 mode: PW/PW res: 430494/0 rrc: 8 type: EXT &lt;span class=&quot;error&quot;&gt;&amp;#91;73728-&amp;gt;159743&amp;#93;&lt;/span&gt; (req 73728-&amp;gt;77823) flags: 0x20 remote: 0x225685b920b04720 expref: 8 pid: 2438 timeout 5961707167&lt;/p&gt;</comment>
                            <comment id="40897" author="bobijam" created="Wed, 20 Jun 2012 00:32:06 +0000"  >
&lt;p&gt;OSS4 is heavily loaded, and many clients are competing to write to the same files. The write request from cnode050 to OST0007 (on OSS4) was delayed too long by the heavy load, and the client was evicted.&lt;/p&gt;

&lt;div class=&quot;panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;OSS4&apos;s heavy load&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;panelContent&quot;&gt;
&lt;p&gt;2012/06/19 13:36:27 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1626037.218357&amp;#93;&lt;/span&gt; Lustre: data-OST0007: slow quota init 150s due to heavy IO load&lt;br/&gt;
2012/06/19 13:36:27 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1626037.218364&amp;#93;&lt;/span&gt; Lustre: Skipped 4 previous similar messages&lt;br/&gt;
2012/06/19 13:36:27 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1626037.218535&amp;#93;&lt;/span&gt; Lustre: data-OST0007: slow quota init 150s due to heavy IO load&lt;br/&gt;
2012/06/19 13:36:27 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1626037.218541&amp;#93;&lt;/span&gt; Lustre: Skipped 1 previous similar message&lt;br/&gt;
2012/06/19 13:36:27 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1626037.224625&amp;#93;&lt;/span&gt; Lustre: data-OST0007: slow direct_io 166s due to heavy IO load&lt;br/&gt;
2012/06/19 21:42:41 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.964476&amp;#93;&lt;/span&gt; Lustre: data-OST0006: slow quota init 101s due to heavy IO load&lt;br/&gt;
2012/06/19 21:42:41 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.964479&amp;#93;&lt;/span&gt; Lustre: data-OST0006: slow quota init 80s due to heavy IO load&lt;br/&gt;
2012/06/19 21:42:41 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.964485&amp;#93;&lt;/span&gt; Lustre: Skipped 48 previous similar messages&lt;br/&gt;
2012/06/19 21:42:41 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.964489&amp;#93;&lt;/span&gt; Lustre: Skipped 48 previous similar messages&lt;br/&gt;
2012/06/19 21:42:41 kern.warning@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.965584&amp;#93;&lt;/span&gt; Lustre: data-OST0006: slow direct_io 131s due to heavy IO load&lt;br/&gt;
2012/06/19 21:42:41 kern.info@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.966417&amp;#93;&lt;/span&gt; Lustre: data-OST0006: slow quota init 31s due to heavy IO load&lt;br/&gt;
2012/06/19 21:42:41 kern.info@oss4 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1655136.966421&amp;#93;&lt;/span&gt; Lustre: Skipped 7 previous similar messages&lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;cnode050 was evicted because the lock protecting its write request timed out&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;panelContent&quot;&gt;
&lt;p&gt;00000100:00000200:23:1340117424.398493:0:4625:0:(service.c:1046:ptlrpc_hpreq_reorder_nolock()) @@@ high priority req  req@ffff8101b6316800 x1403470546287529/t0 o4-&amp;gt;2208c67a-8f53-db44-9cbd-e65dba8f887c@NET_0x50000ac110b32_UUID:0/0 lens 448/0 e 0 to 0 dl 1340117467 ref 1 fl New:H/0/0 rc 0/0&lt;br/&gt;
00000100:00100000:23:1340117424.398534:0:4625:0:(service.c:1369:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io_171:2208c67a-8f53-db44-9cbd-e65dba8f887c+9:3937:x1403470546287529:12345-172.17.11.50@o2ib:4&lt;/p&gt;

&lt;p&gt;&lt;font color=&quot;red&quot;&gt; the request was received at 1340117424 and handling started &lt;/font&gt;&lt;/p&gt;

&lt;p&gt;00000100:00001000:7:1340117462.705473:0:2587:0:(service.c:762:ptlrpc_at_send_early_reply()) @@@ sending early reply (deadline +5s, margin -63s) for 68+30  req@ffff8101b6316800 x1403470546287529/t0 o4-&amp;gt;2208c67a-8f53-db44-9cbd-e65dba8f887c@NET_0x50000ac110b32_UUID:0/0 lens 448/416 e 0 to 0 dl 1340117467 ref 2 fl Interpret:H/0/0 rc 0/0&lt;br/&gt;
00000100:00001000:7:1340117462.705480:0:2587:0:(import.c:1468:at_measured()) add 68 to ffff8102ee7ab860 time=60 v=68 (68 1 31 1)&lt;/p&gt;

&lt;p&gt;&lt;font color=&quot;red&quot;&gt; 38 seconds have passed; the OST asks the client to give it an extra 68 seconds &lt;/font&gt;&lt;/p&gt;

&lt;p&gt;00010000:00020000:23:1340117525.229707:0:0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 172.17.11.50@o2ib  ns: filter-data-OST0007_UUID lock: ffff8101a5e2fa00/0x2babc16ba243f737 lrc: 3/0,0 mode: PW/PW res: 430494/0 rrc: 8 type: EXT &lt;span class=&quot;error&quot;&gt;&amp;#91;73728-&amp;gt;159743&amp;#93;&lt;/span&gt; (req 73728-&amp;gt;77823) flags: 0x20 remote: 0x225685b920b04720 expref: 8 pid: 2438 timeout 5961707167&lt;/p&gt;

&lt;p&gt;&lt;font color=&quot;red&quot;&gt; the extent lock protecting the write has timed out, and the client is evicted &lt;/font&gt;&lt;/p&gt;
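
&lt;p&gt;The intervals in this timeline can be checked directly from the timestamps of the three debug-log entries above. The following is a minimal sketch of that arithmetic; the variable names are illustrative only. The log excerpt does not show when the lock callback timer was actually armed, so the closeness of the second interval to the reported 101s is an observation, not a proof of where the timer started.&lt;/p&gt;

```python
# Timestamps (seconds since the epoch) copied from the three debug-log entries above.
request_handled = 1340117424.398493  # ptlrpc_server_handle_request()
early_reply_sent = 1340117462.705473  # ptlrpc_at_send_early_reply()
client_evicted = 1340117525.229707  # waiting_locks_callback()

# Elapsed time from the start of request handling to each later event.
to_early_reply = early_reply_sent - request_handled
to_eviction = client_evicted - request_handled

print(f"early reply sent after {to_early_reply:.1f}s")
print(f"client evicted after {to_eviction:.1f}s")
```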
&lt;/div&gt;&lt;/div&gt; </comment>
                            <comment id="40910" author="bobijam" created="Wed, 20 Jun 2012 03:14:50 +0000"  >&lt;p&gt;I think &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-918&quot; title=&quot;ensure that BRW requests prevent lock timeout&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-918&quot;&gt;&lt;del&gt;LU-918&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-874&quot; title=&quot;Client eviction on lock callback timeout &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-874&quot;&gt;&lt;del&gt;LU-874&lt;/del&gt;&lt;/a&gt; addressed this issue. &lt;/p&gt;</comment>
                            <comment id="40913" author="mnishizawa" created="Wed, 20 Jun 2012 04:42:44 +0000"  >&lt;p&gt;The customer&apos;s Lustre version is 1.8.4.  Is the issue in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-874&quot; title=&quot;Client eviction on lock callback timeout &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-874&quot;&gt;&lt;del&gt;LU-874&lt;/del&gt;&lt;/a&gt; applicable to 1.8.4?  &lt;br/&gt;
Can the patch released there be applied?  &lt;/p&gt;</comment>
                            <comment id="40964" author="bobijam" created="Wed, 20 Jun 2012 21:01:26 +0000"  >&lt;p&gt;Not directly; the patch needs to be ported to 1.8.x.&lt;/p&gt;</comment>
                            <comment id="40968" author="bobijam" created="Thu, 21 Jun 2012 00:49:09 +0000"  >&lt;p&gt;The ported patch is tracked at &lt;a href=&quot;http://review.whamcloud.com/3157&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/3157&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="40990" author="mnishizawa" created="Thu, 21 Jun 2012 10:44:00 +0000"  >&lt;p&gt;Thanks, Zhenyu.  We will discuss whether we can apply the patch.  &lt;/p&gt;

&lt;p&gt;Today, the customer encountered another instance of the issue.  &lt;br/&gt;
2012/06/21 01:04:29 kern.err@oss2 kernel&lt;span class=&quot;error&quot;&gt;&amp;#91;-&amp;#93;&lt;/span&gt;: &lt;span class=&quot;error&quot;&gt;&amp;#91;1753382.589040&amp;#93;&lt;/span&gt; LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 172.17.11.29@o2ib  ns: filter-data-OST0003_UUID lock: ffff8101127e8c00/0xb2893ade320a8743 lrc: 3/0,0 mode: PW/PW res: 431401/0 rrc: 8 type: EXT &lt;span class=&quot;error&quot;&gt;&amp;#91;86016-&amp;gt;155647&amp;#93;&lt;/span&gt; (req 86016-&amp;gt;90111) flags: 0x20 remote: 0xa4f5ddef5f4b10d8 expref: 10 pid: 2462 timeout 6052442992&lt;/p&gt;

&lt;p&gt;We uploaded the debug log to the FTP server:/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1496&quot; title=&quot;Client evicted frequently due to lock callback timer expiration&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1496&quot;&gt;&lt;del&gt;LU-1496&lt;/del&gt;&lt;/a&gt;_20120621.tar.gz&lt;br/&gt;
debug_kernel_20120621_010429.gz  ... oss2 debug log&lt;br/&gt;
debug_kernel_cnode029.gz         ... client debug log&lt;br/&gt;
log-20120621/cnode029_kernel.txt ... client messages &lt;br/&gt;
log-20120621/oss2_kernel.txt     ... oss2 messages &lt;/p&gt;

&lt;p&gt;It looks like this is again due to high I/O load on the OSS.  &lt;br/&gt;
Would you please check if this is the same cause?  &lt;/p&gt;</comment>
                            <comment id="41551" author="bobijam" created="Fri, 6 Jul 2012 02:59:02 +0000"  >&lt;p&gt;Sorry for the late response. Yes, I think it is the same phenomenon.&lt;/p&gt;</comment>
                            <comment id="41608" author="mnishizawa" created="Mon, 9 Jul 2012 10:58:40 +0000"  >&lt;p&gt;Thanks for checking the log.  The customer is currently considering upgrading to a Lustre version with the patch applied.  &lt;br/&gt;
Please close this ticket.  Thank you.  &lt;/p&gt;</comment>
                            <comment id="43827" author="pjones" created="Mon, 27 Aug 2012 15:53:36 +0000"  >&lt;p&gt;Landed to b1_8 and already landed to master under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-874&quot; title=&quot;Client eviction on lock callback timeout &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-874&quot;&gt;&lt;del&gt;LU-874&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="68645" author="mmcdade" created="Tue, 8 Oct 2013 22:34:43 +0000"  >&lt;p&gt;I am running into this problem while running Lustre 1.8.8.  Is there a simple patch or do I have to upgrade to 1.8.9?  I have already increased osc.*.max_rpcs_in_flight to 32, turned off checksums, and set osc.*.max_dirty_mb=512.  This helped with some evictions.  However, I tend to think the timeout=100 still needs to be increased even with adaptive timeouts.  Notice the 100sec timeout below when we have at_min=0 and at_max=800.  Interesting expiration time since timeout=100.  Could also be related to an actual network socket timeout.  Please help.&lt;/p&gt;

&lt;p&gt;Melinda&lt;/p&gt;

&lt;p&gt;LustreError: 0:0:(ldlm_lockd.c:313:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 34.239.17.202@tcp  ns: filter-dlustre-OST0002_UUID lock: ffff8102636cde00/0x392a313d071a080e lrc: 3/0,0 mode: PW/PW res: 961326/0 rrc: 2 type: EXT &lt;span class=&quot;error&quot;&gt;&amp;#91;0-&amp;gt;18446744073709551615&amp;#93;&lt;/span&gt; (req 71299072-&amp;gt;71303167) flags: 0x20 remote: 0xa1584afedda7b6c expref: 197 pid: 5369 timeout 10277973712&lt;br/&gt;
LustreError: 0:0:(ldlm_lockd.c:313:waiting_locks_callback()) Skipped 39 previous similar messages&lt;br/&gt;
LustreError: 4057:0:(ldlm_lib.c:1919:target_send_reply_msg()) &lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvgvz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6385</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>