<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:21:46 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2030] LDLM timeout resulting in client eviction and truncate system call to fail with EINTR</title>
                <link>https://jira.whamcloud.com/browse/LU-2030</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I see errors reported from the truncate system call when running SIMUL on 2 nodes with 16 processes each (32 processes total).&lt;/p&gt;

&lt;p&gt;The reproducer is:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ salloc -N2 -n32
$ srun -- /g/g0/surya1/dev/simul-1.14/simul -d /p/lcrater2/surya1/simul-test -N 635
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Specifically, I see truncate return EINTR when running the test under strace:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;truncate(&quot;/p/lcrater2/surya1/simul-test/simul_truncate.0&quot;, 1024) = -1 EINTR (Interrupted system call)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The console logs from the clients and servers from a case where this was hit are below.&lt;/p&gt;

&lt;p&gt;Test:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;hype355@root:srun -- /g/g0/surya1/dev/simul-1.14/simul -d /p/lcrater2/surya1/simul-test -N 635 -f14 -l15
Simul is running with 32 process(es)
13:14:38: Set iteration 0
13:14:38: Running test #14(iter 0): creat, shared mode.
13:14:38: Running test #15(iter 0): truncate, shared mode.
13:18:00: Process 3(hype214): FAILED in simul_truncate, truncate failed: Interrupted system call
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Client console messages:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# hype355 /root &amp;gt; tail -f /var/log/conman/console.hype21{3,4}
==&amp;gt; /var/log/conman/console.hype213 &amp;lt;==
2012-09-25 13:16:19 Lustre: lc2-OST000c-osc-ffff880833b85400: Connection to lc2-OST000c (at 10.1.1.49@o2ib9) was lost; in progress operations using this service will wait for recovery to complete
2012-09-25 13:16:19 LustreError: 107596:0:(ldlm_request.c:1179:ldlm_cli_cancel_req()) Got rc -107 from cancel RPC: canceling anyway
2012-09-25 13:16:19 LustreError: 107596:0:(ldlm_request.c:1807:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -107
2012-09-25 13:16:19 LustreError: 167-0: This client was evicted by lc2-OST000c; in progress operations using this service will fail.
2012-09-25 13:16:19 LustreError: 107628:0:(ldlm_resource.c:749:ldlm_resource_complain()) Namespace lc2-OST000c-osc-ffff880833b85400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
2012-09-25 13:16:19 LustreError: 107628:0:(ldlm_resource.c:755:ldlm_resource_complain()) Resource: ffff880748182d40 (167630808/0/0/0) (rc: 1)
2012-09-25 13:16:19 Lustre: lc2-OST000c-osc-ffff880833b85400: Connection restored to lc2-OST000c (at 10.1.1.49@o2ib9)

==&amp;gt; /var/log/conman/console.hype214 &amp;lt;==
2012-09-25 13:18:00 Lustre: lc2-OST000c-osc-ffff880831ce8800: Connection to lc2-OST000c (at 10.1.1.49@o2ib9) was lost; in progress operations using this service will wait for recovery to complete
2012-09-25 13:18:00 LustreError: 75280:0:(ldlm_request.c:1179:ldlm_cli_cancel_req()) Got rc -107 from cancel RPC: canceling anyway
2012-09-25 13:18:00 LustreError: 75280:0:(ldlm_request.c:1807:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -107
2012-09-25 13:18:00 LustreError: 167-0: This client was evicted by lc2-OST000c; in progress operations using this service will fail.
2012-09-25 13:18:00 Lustre: lc2-OST000c-osc-ffff880831ce8800: Connection restored to lc2-OST000c (at 10.1.1.49@o2ib9)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Server console messages:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt; $ tail /var/log/conman/console.zwicky49
&amp;gt; ==&amp;gt; /var/log/conman/console.zwicky49 &amp;lt;==
&amp;gt; 2012-09-25 13:16:19 LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 192.168.121.14@o2ib2  ns: filter-lc2-OST000c_UUID lock: ffff88028aee6d80/0x8691e72961eb7429 lrc: 3/0,0 mode: PW/PW res: 167630808/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x20 remote: 0xf19c08e9d42c3468 expref: 4 pid: 4358 timeout 7757255590
&amp;gt; 2012-09-25 13:16:19 LustreError: 21861:0:(ldlm_lockd.c:2074:ldlm_cancel_handler()) ldlm_cancel from 192.168.121.14@o2ib2 arrived at 1348604178 with bad export cookie 9696785637627868352
&amp;gt; 2012-09-25 13:16:19 Lustre: 24547:0:(ldlm_lib.c:933:target_handle_connect()) lc2-OST000c: connection from 7f3b19c3-f206-6f48-4758-3716d5199ca1@192.168.121.14@o2ib2 t0 exp (null) cur 1348604179 last 0
&amp;gt; 2012-09-25 13:17:59 LustreError: 0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 192.168.121.15@o2ib2  ns: filter-lc2-OST000c_UUID lock: ffff8801d0235b40/0x8691e72961eb744c lrc: 3/0,0 mode: PW/PW res: 167630808/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x20 remote: 0xbd300c41b4abfe4a expref: 4 pid: 24547 timeout 7757356173
&amp;gt; 2012-09-25 13:18:00 LustreError: 21861:0:(ldlm_lockd.c:2074:ldlm_cancel_handler()) ldlm_cancel from 192.168.121.15@o2ib2 arrived at 1348604279 with bad export cookie 9696785637609291136
&amp;gt; 2012-09-25 13:18:00 Lustre: 25184:0:(ldlm_lib.c:933:target_handle_connect()) lc2-OST000c: connection from e0aa3722-987c-e30c-5c60-102f582dddae@192.168.121.15@o2ib2 t0 exp (null) cur 1348604280 last 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The client version of lustre on the nodes:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;hype355@root:pdsh -w hype[213-214] &apos;rpm -qa | grep lustre | sort&apos; | dshbak -c
----------------
hype[213-214]
----------------
lustre-2.1.2-4chaos_2.6.32_220.23.1.1chaos.ch5.x86_64.x86_64
lustre-modules-2.1.2-4chaos_2.6.32_220.23.1.1chaos.ch5.x86_64.x86_64
lustre-tools-llnl-1.4-2.ch5.x86_64
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The server version of lustre:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ pdsh -w zwicky[48-49] &apos;rpm -qa | grep lustre | sort&apos; | dshbak -c
----------------
zwicky[48-49]
----------------
lustre-2.1.2-3chaos_2.6.32_220.23.1.1chaos.ch5.x86_64.x86_64
lustre-modules-2.1.2-3chaos_2.6.32_220.23.1.1chaos.ch5.x86_64.x86_64
lustre-tools-llnl-1.4-2.ch5.x86_64
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>Lustre Client: 2.1.2-4chaos&lt;br/&gt;
Lustre Server: 2.1.2-3chaos</environment>
        <key id="16123">LU-2030</key>
            <summary>LDLM timeout resulting in client eviction and truncate system call to fail with EINTR</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="hongchao.zhang">Hongchao Zhang</assignee>
                                    <reporter username="prakash">Prakash Surya</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Tue, 25 Sep 2012 17:03:10 +0000</created>
                <updated>Sat, 23 Jan 2016 01:15:39 +0000</updated>
                            <resolved>Sat, 23 Jan 2016 01:15:39 +0000</resolved>
                                    <version>Lustre 2.1.2</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="45540" author="pjones" created="Tue, 25 Sep 2012 18:49:25 +0000"  >&lt;p&gt;Hongchao&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="45567" author="hongchao.zhang" created="Wed, 26 Sep 2012 10:45:03 +0000"  >&lt;p&gt;Is the debug log available? there was an extent lock held by client much longer time than server permitted and was evicted, from the debug log,&lt;br/&gt;
we could find which lock was the culprit and help to trace where is the real problem.&lt;/p&gt;</comment>
                            <comment id="45580" author="prakash" created="Wed, 26 Sep 2012 13:01:34 +0000"  >&lt;p&gt;Here is a Lustre Log file with &apos;+dlmtrace&apos; from the server which had the lock expire. &lt;/p&gt;</comment>
                            <comment id="45583" author="prakash" created="Wed, 26 Sep 2012 13:27:20 +0000"  >&lt;p&gt;Here are logs from the server and the client with &apos;+dlmtrace&apos;.&lt;/p&gt;

&lt;p&gt;The client also dumped this to its console:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2012-09-26 10:08:14 LustreError: 11-0: lc2-OST000c-osc-ffff880831ce8800: Communicating with 10.1.1.49@o2ib9, operation ost_punch failed with -107.
2012-09-26 10:08:14 Lustre: lc2-OST000c-osc-ffff880831ce8800: Connection to lc2-OST000c (at 10.1.1.49@o2ib9) was lost; in progress operations using this service will wait for recovery to complete
2012-09-26 10:08:14 LustreError: 167-0: This client was evicted by lc2-OST000c; in progress operations using this service will fail.
2012-09-26 10:08:14 LustreError: 79024:0:(osc_lock.c:816:osc_ldlm_completion_ast()) lock@ffff88043156ddd8[2 3 0 1 1 00000000] W(2):[0, 18446744073709551615]@[0x1000c0000:0x9fdd7d9:0x0] {
2012-09-26 10:08:14 LustreError: 79024:0:(osc_lock.c:816:osc_ldlm_completion_ast())     lovsub@ffff88043156e4e0: [0 ffff880773391748 W(2):[0, 18446744073709551615]@[0x20097c3a8:0x16a97:0x0]] 
2012-09-26 10:08:14 LustreError: 79024:0:(osc_lock.c:816:osc_ldlm_completion_ast())     osc@ffff88043156fb50: ffff880199befb40 00120002 0xbd300c41b4ac0bfd 3 ffff8803dd5e9068 size: 1024 mtime: 1348679193 atime: 1348679193 ctime: 1348679193 blocks: 8
2012-09-26 10:08:14 LustreError: 79024:0:(osc_lock.c:816:osc_ldlm_completion_ast()) } lock@ffff88043156ddd8
2012-09-26 10:08:14 LustreError: 79024:0:(osc_lock.c:816:osc_ldlm_completion_ast()) dlmlock returned -5
2012-09-26 10:08:14 LustreError: 79024:0:(ldlm_resource.c:749:ldlm_resource_complain()) Namespace lc2-OST000c-osc-ffff880831ce8800 resource refcount nonzero (1) after lock cleanup; forcing cleanup.
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try()) lock@ffff88082a0540b8[2 4 0 2 0 00000000] W(2):[0, 18446744073709551615]@[0x20097c3a8:0x16a97:0x0] {
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try())     vvp@ffff880773390420: 
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try())     lov@ffff880773391748: 2
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try())     0 0: ---
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try())     1 0: ---
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try()) 
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try()) } lock@ffff88082a0540b8
2012-09-26 10:08:14 LustreError: 78995:0:(cl_lock.c:1413:cl_unuse_try()) unuse return -5
2012-09-26 10:08:14 LustreError: 79024:0:(ldlm_resource.c:755:ldlm_resource_complain()) Resource: ffff880820256ac0 (167630809/0/0/0) (rc: 0)
2012-09-26 10:08:14 Lustre: lc2-OST000c-osc-ffff880831ce8800: Connection restored to lc2-OST000c (at 10.1.1.49@o2ib9)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="45606" author="prakash" created="Wed, 26 Sep 2012 17:20:43 +0000"  >&lt;p&gt;Looking at the server logs, the RX errors from the LNet routers is a bit troubling. The clients and the servers have to go through a set of routers, 10.1.1.53@o2ib9 being the NID of one of the routers:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00010000:4.0:1348679193.857904:0:20439:0:(ldlm_lockd.c:1069:ldlm_handle_enqueue0()) ### server-side enqueue handler START
00010000:00010000:4.0:1348679193.857909:0:20439:0:(ldlm_lockd.c:1154:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: filter-lc2-OST000c_UUID lock: ffff880248b78240/0x8691e72961eb80f5 lrc: 2/0,0 mode: --/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;0] (req 0-&amp;gt;0) flags: 0x0 remote: 0xf19c08e9d42c4364 expref: -99 pid: 20439 timeout 0
00010000:00010000:4.0:1348679193.857921:0:20439:0:(ldlm_lock.c:626:ldlm_add_bl_work_item()) ### lock incompatible; sending blocking AST. ns: filter-lc2-OST000c_UUID lock: ffff880248b78480/0x8691e72961eb80ee lrc: 2/0,0 mode: PW/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x0 remote: 0xbd300c41b4ac0bfd expref: 14 pid: 20439 timeout 0
00010000:00010000:4.0:1348679193.857925:0:20439:0:(ldlm_extent.c:295:ldlm_check_contention()) contended locks = 1
00010000:00010000:4.0:1348679193.857926:0:20439:0:(ldlm_extent.c:295:ldlm_check_contention()) contended locks = 1
00010000:00010000:4.0:1348679193.857931:0:20439:0:(ldlm_lockd.c:809:ldlm_server_blocking_ast()) ### server preparing blocking AST ns: filter-lc2-OST000c_UUID lock: ffff880248b78480/0x8691e72961eb80ee lrc: 3/0,0 mode: PW/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x20 remote: 0xbd300c41b4ac0bfd expref: 14 pid: 20439 timeout 0
00010000:00010000:4.0:1348679193.857936:0:20439:0:(ldlm_lockd.c:468:ldlm_add_waiting_lock()) ### adding to wait list(timeout: 100, AT: on) ns: filter-lc2-OST000c_UUID lock: ffff880248b78480/0x8691e72961eb80ee lrc: 4/0,0 mode: PW/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x20 remote: 0xbd300c41b4ac0bfd expref: 14 pid: 20439 timeout 7832370900
00010000:00010000:4.0:1348679193.858049:0:20439:0:(ldlm_lockd.c:1288:ldlm_handle_enqueue0()) ### server-side enqueue handler, sending reply(err=0, rc=0) ns: filter-lc2-OST000c_UUID lock: ffff880248b78240/0x8691e72961eb80f5 lrc: 3/0,0 mode: --/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x0 remote: 0xf19c08e9d42c4364 expref: 18 pid: 20439 timeout 0
00010000:00010000:4.0:1348679193.858054:0:20439:0:(ldlm_lockd.c:1320:ldlm_handle_enqueue0()) ### server-side enqueue handler END (lock ffff880248b78240, rc 0)
00000800:00000100:0.0:1348679234.951973:0:2588:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:1.0:1348679234.951992:0:2591:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:2.0:1348679234.951999:0:2592:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:4.0:1348679234.952008:0:2594:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:0.0:1348679234.952008:0:2590:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:1.0:1348679234.952019:0:2588:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:2.0:1348679234.952024:0:2587:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:1.0:1348679234.952034:0:2588:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:1.0:1348679234.952043:0:2588:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:1.0:1348679234.952052:0:2588:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00000800:00000100:2.0:1348679234.952057:0:2587:0:(o2iblnd_cb.c:472:kiblnd_rx_complete()) Rx from 10.1.1.53@o2ib9 failed: 5
00010000:00020000:4.1F:1348679293.957439:0:0:0:(ldlm_lockd.c:358:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 192.168.121.15@o2ib2  ns: filter-lc2-OST000c_UUID lock: ffff880248b78480/0x8691e72961eb80ee lrc: 3/0,0 mode: PW/PW res: 167630809/0 rrc: 2 type: EXT [0-&amp;gt;18446744073709551615] (req 0-&amp;gt;18446744073709551615) flags: 0x20 remote: 0xbd300c41b4ac0bfd expref: 4 pid: 20439 timeou
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="45609" author="prakash" created="Wed, 26 Sep 2012 18:14:31 +0000"  >&lt;p&gt;Also, in case it&apos;s useful, I reproduced the issue with both &apos;+dlmtrace&apos; and &apos;+rpctrace&apos;. So:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2030&quot; title=&quot;LDLM timeout resulting in client eviction and truncate system call to fail with EINTR&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2030&quot;&gt;&lt;del&gt;LU-2030&lt;/del&gt;&lt;/a&gt;-llogs.tar.gz   : Has &apos;+dlmtrace&apos;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2030&quot; title=&quot;LDLM timeout resulting in client eviction and truncate system call to fail with EINTR&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2030&quot;&gt;&lt;del&gt;LU-2030&lt;/del&gt;&lt;/a&gt;-llogs-1.tar.gz : Has &apos;+dlmtrace&apos; and &apos;+rpctrace&apos;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="46561" author="hongchao.zhang" created="Mon, 15 Oct 2012 07:10:18 +0000"  >&lt;p&gt;in the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2030&quot; title=&quot;LDLM timeout resulting in client eviction and truncate system call to fail with EINTR&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2030&quot;&gt;&lt;del&gt;LU-2030&lt;/del&gt;&lt;/a&gt;-llogs-1 (+dlmtrace, +rpctrace), the &quot;simul_truncate.0&quot; (PID=112500) failed,&lt;br/&gt;
00000002:00010000:9.0:1348694770.130724:0:112500:0:(mdc_locks.c:918:mdc_intent_lock()) (name: simul_truncate.0,&lt;span class=&quot;error&quot;&gt;&amp;#91;0x20097c3a8:0x16a98:0x0&amp;#93;&lt;/span&gt;) in obj &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20095bc20:0x114c:0x0&amp;#93;&lt;/span&gt;, intent: getattr flags 00&lt;/p&gt;

&lt;p&gt;It had acquired the ldlm lock &quot;ffff8806636c1b40/0xf19c08e9d42c4459&quot; at OST0013 and tried to get the ldlm lock&lt;br/&gt;
&quot;ffff8807ff4d3d80/0xf19c08e9d42c4460&quot; at OST0010, which took more than 100s to acquire! Then the eviction occurred.&lt;/p&gt;</comment>
                            <comment id="46719" author="hongchao.zhang" created="Thu, 18 Oct 2012 07:05:25 +0000"  >&lt;p&gt;status update: the possible patch is under creation &amp;amp; test&lt;/p&gt;</comment>
                            <comment id="46741" author="prakash" created="Thu, 18 Oct 2012 13:14:32 +0000"  >&lt;p&gt;Good to hear. Thanks!&lt;/p&gt;</comment>
                            <comment id="46884" author="hongchao.zhang" created="Wed, 24 Oct 2012 18:48:19 +0000"  >&lt;p&gt;the patch is tracked at &lt;a href=&quot;http://review.whamcloud.com/#change,4382&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,4382&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="46887" author="hongchao.zhang" created="Wed, 24 Oct 2012 18:56:48 +0000"  >&lt;p&gt;Hi Prakash, &lt;br/&gt;
Is the issue reproducible? there is still one more issue in the logs, why OST0010 took so much time to enqueue this glimpse lock?&lt;br/&gt;
is the log of the node containing lc2-OST0010 available?&lt;/p&gt;</comment>
                            <comment id="46915" author="prakash" created="Thu, 25 Oct 2012 13:23:33 +0000"  >&lt;p&gt;Yes, the issue was reproducible. I&apos;ll try to allocate some time to get on the machine and gather the logs from all 4 OSS nodes for this filesystem. Any debug flags of special interest?&lt;/p&gt;</comment>
                            <comment id="47238" author="hongchao.zhang" created="Thu, 1 Nov 2012 03:50:36 +0000"  >&lt;p&gt;Hi, the same as before, +rpctrace and +dlmtrace, thanks a lot!&lt;/p&gt;</comment>
                            <comment id="139707" author="hongchao.zhang" created="Fri, 22 Jan 2016 05:17:03 +0000"  >&lt;p&gt;Hi Prakash,&lt;br/&gt;
Do you need any more works on this ticket? Or can we closed it?&lt;br/&gt;
Thanks&lt;/p&gt;</comment>
                            <comment id="139781" author="morrone" created="Fri, 22 Jan 2016 19:41:35 +0000"  >&lt;p&gt;Prakash hasn&apos;t worked on Lustre in some time.  This is probably another &quot;Won&apos;t Fix&quot; given its age.&lt;/p&gt;</comment>
                            <comment id="139811" author="jfc" created="Sat, 23 Jan 2016 01:15:39 +0000"  >&lt;p&gt;Thanks Chris.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11938" name="LU-2030-llogs-1.tar.gz" size="1040598" author="prakash" created="Wed, 26 Sep 2012 18:14:31 +0000"/>
                            <attachment id="11937" name="LU-2030-llogs.tar.gz" size="53844" author="prakash" created="Wed, 26 Sep 2012 13:27:20 +0000"/>
                            <attachment id="11936" name="zwicky49-dlmtrace.llog" size="269276" author="prakash" created="Wed, 26 Sep 2012 13:01:34 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv3w7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4159</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>