<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:12:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-967] OSS hangs due to heavy IO loads</title>
                <link>https://jira.whamcloud.com/browse/LU-967</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Please let me know if similar bugs or situations have already been filed here.&lt;/p&gt;</description>
                <environment>lustre-1.8.4ddn2.2, RHEL5.5</environment>
        <key id="12812">LU-967</key>
            <summary>OSS hangs due to heavy IO loads</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="ihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Fri, 6 Jan 2012 00:12:08 +0000</created>
                <updated>Mon, 16 Jan 2012 09:21:00 +0000</updated>
                            <resolved>Mon, 16 Jan 2012 09:21:00 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="25985" author="adilger" created="Fri, 6 Jan 2012 01:15:41 +0000"  >&lt;p&gt;Ihara, it would be more useful if the bug report actually contained some more information, instead of just an attachment.  Otherwise, it is much more difficult to find this and similar bugs in the future since there is nothing to search for.&lt;/p&gt;</comment>
                            <comment id="25987" author="ihara" created="Fri, 6 Jan 2012 02:31:50 +0000"  >&lt;p&gt;Andreas, sorry about that.&lt;br/&gt;
We have seen the OSS hang several times; the attachment is the log file from the most recent hang, on Jan 3, 2012. As far as I can see in the log files, there are many &quot;slow xxx due to heavy IO load&quot; messages, and then finally the OS dumps the call traces below. After that, the OSS slowed down.&lt;/p&gt;

&lt;p&gt;Jan  3 04:31:07 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751926.394880&amp;#93;&lt;/span&gt; Lustre: 16341:0:(client.c:1482:ptlrpc_expire_one_request()) @@@ Request x1388084806175732 sent from work0-OST0030 to NID 10.1.7.17@o2ib 7s ago has timed out (7s prior to deadline).&lt;br/&gt;
Jan  3 04:31:07 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751926.394882&amp;#93;&lt;/span&gt;   req@ffff810583a24000 x1388084806175732/t0 o601-&amp;gt;@:15/16 lens 224/416 e 0 to 1 dl 1325532667 ref 2 fl Rpc:N/0/0 rc 0/0&lt;br/&gt;
Jan  3 04:31:07 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751926.400092&amp;#93;&lt;/span&gt; LustreError: 16341:0:(quota_context.c:699:dqacq_completion()) acquire qunit got error! (rc:-107)&lt;br/&gt;
Jan  3 04:31:08 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751927.401587&amp;#93;&lt;/span&gt; Lustre: 724:0:(quota_interface.c:460:quota_chk_acq_common()) still haven&apos;t managed to acquire quota space from the quota master after 1 retries (err=0, rc=-107)&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.441959&amp;#93;&lt;/span&gt; BUG: soft lockup - CPU#7 stuck for 10s! &lt;span class=&quot;error&quot;&gt;&amp;#91;kiblnd_sd_00:16258&amp;#93;&lt;/span&gt;&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.443208&amp;#93;&lt;/span&gt; CPU 7:&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.443716&amp;#93;&lt;/span&gt; Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd2(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_round_robin(U) sg(U) autofs4(U) lockd(U) sunrpc(U) ib_iser(U) libiscsi2(U) scsi_transport_iscsi2(U) scsi_transport_iscsi(U) ib_srp(U) rds(U) ib_sdp(U) ib_ipoib(U) ipoib_helper(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) rdma_ucm(U) rdma_cm(U) ib_ucm(U) ib_uverbs(U) ib_umad(U) ib_cm(U) iw_cm(U) ib_addr(U) ib_sa(U) loop(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) i2c_core(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) mlx4_ib(U) ib_mad(U) ib_core(U) mlx4_en(U) mlx4_core(U) bnx2(U) hpilo(U) serio_raw(U) pcspkr(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) shpchp(U) cciss(U) sd_mod&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.463727&amp;#93;&lt;/span&gt; Pid: 16258, comm: kiblnd_sd_00 Tainted: G      2.6.18-194.17.1.el5_lustre.1.8.4.ddn2.2 #1&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.465428&amp;#93;&lt;/span&gt; RIP: 0010:&lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8006f217&amp;gt;&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8006f217&amp;gt;&amp;#93;&lt;/span&gt; __write_lock_failed+0xf/0x20&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.467061&amp;#93;&lt;/span&gt; RSP: 0018:ffff81061460d7c8  EFLAGS: 00000287&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.468135&amp;#93;&lt;/span&gt; RAX: 0000000000006000 RBX: ffff81061460d7d0 RCX: ffff8103b0726000&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.469470&amp;#93;&lt;/span&gt; RDX: ffff8104aa27a2c0 RSI: ffff81061fffe140 RDI: ffffffff8038ebcc&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.470802&amp;#93;&lt;/span&gt; RBP: ffffffff80068c1b R08: 00000000ffffffff R09: 0000000000000000&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.472141&amp;#93;&lt;/span&gt; R10: 000000000000003c R11: 0000000000000000 R12: ffff81061460d780&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.473473&amp;#93;&lt;/span&gt; R13: ffffffff801910ec R14: ffff81061460d730 R15: ffff81061fe43400&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.474806&amp;#93;&lt;/span&gt; FS:  00002b84181b9240(0000) GS:ffff81061fe1fbc0(0000) knlGS:0000000000000000&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.476313&amp;#93;&lt;/span&gt; CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.477404&amp;#93;&lt;/span&gt; CR2: 000000001c587158 CR3: 0000000000201000 CR4: 00000000000006e0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.478807&amp;#93;&lt;/span&gt;&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.478807&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.479610&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff80072252&amp;gt;&amp;#93;&lt;/span&gt; _write_lock+0x12/0x20&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.480607&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800f2b1a&amp;gt;&amp;#93;&lt;/span&gt; __get_vm_area_node+0xba/0x1b0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.481731&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800f2c30&amp;gt;&amp;#93;&lt;/span&gt; get_vm_area_node+0x20/0x30&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.482799&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800f3438&amp;gt;&amp;#93;&lt;/span&gt; __vmalloc_node+0x48/0x70&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.483845&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800f348e&amp;gt;&amp;#93;&lt;/span&gt; __vmalloc+0xe/0x10&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.484797&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800f3538&amp;gt;&amp;#93;&lt;/span&gt; vmalloc+0x18/0x20&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.485742&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8861e3c9&amp;gt;&amp;#93;&lt;/span&gt; :libcfs:cfs_alloc_large+0x9/0x10&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.486980&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88809ede&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_create_tx_pool+0x95e/0x1020&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.488329&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8880684b&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_pool_alloc_node+0x18b/0x260&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.489676&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8880f898&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_get_idle_tx+0x18/0x1d0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.490953&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff888106bd&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_check_sends+0x28d/0x4a0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.492245&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88258a4a&amp;gt;&amp;#93;&lt;/span&gt; :mlx4_ib:mlx4_ib_post_recv+0x1ea/0x200&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.493502&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8881156b&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_post_rx+0x2cb/0x2f0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.494761&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8866d47d&amp;gt;&amp;#93;&lt;/span&gt; :lnet:lnet_enq_event_locked+0x9d/0xd0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.496090&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff888119de&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_recv+0x44e/0x470&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.497404&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8880ea32&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_init_tx_msg+0x152/0x1c0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.498779&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8867054f&amp;gt;&amp;#93;&lt;/span&gt; :lnet:lnet_ni_recv+0x1ef/0x220&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.499994&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8867066b&amp;gt;&amp;#93;&lt;/span&gt; :lnet:lnet_recv_put+0xeb/0x110&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.501203&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88674892&amp;gt;&amp;#93;&lt;/span&gt; :lnet:lnet_parse+0x1012/0x1ab0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.502416&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff888104d6&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_check_sends+0xa6/0x4a0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.503770&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88812027&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_handle_rx+0x4c7/0x520&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.505112&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88808cb2&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_unpack_msg+0x372/0x6a0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.506470&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88812418&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_rx_complete+0x2b8/0x350&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.507906&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800a1c3d&amp;gt;&amp;#93;&lt;/span&gt; default_wake_function+0xd/0x10&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.509124&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88812581&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_complete+0xd1/0xe0&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.510415&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88818302&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_scheduler+0x4f2/0x690&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.833105&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff801910ec&amp;gt;&amp;#93;&lt;/span&gt; list_add+0xc/0x10&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.834140&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff800a1c30&amp;gt;&amp;#93;&lt;/span&gt; default_wake_function+0x0/0x10&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.835302&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8006afb1&amp;gt;&amp;#93;&lt;/span&gt; child_rip+0xa/0x11&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.836334&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff88817e10&amp;gt;&amp;#93;&lt;/span&gt; :ko2iblnd:kiblnd_scheduler+0x0/0x690&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.837555&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8006afa7&amp;gt;&amp;#93;&lt;/span&gt; child_rip+0x0/0x11&lt;br/&gt;
Jan  3 04:31:10 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751929.838510&amp;#93;&lt;/span&gt;&lt;br/&gt;
Jan  3 04:31:15 t2s007059 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;1751934.402464&amp;#93;&lt;/span&gt; Lustre: 16341:0:(client.c:1482:ptlrpc_expire_one_request()) @@@ Request x1388084806175733 sent from work0-OST0030 to NID 10.1.7.17@o2ib 7s ago has timed out (7s prior to deadline).&lt;/p&gt;</comment>
                            <comment id="26001" author="pjones" created="Fri, 6 Jan 2012 09:23:20 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Can you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="26130" author="ihara" created="Sun, 8 Jan 2012 01:30:28 +0000"  >&lt;p&gt;Might be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt;?&lt;/p&gt;</comment>
                            <comment id="26332" author="laisiyao" created="Wed, 11 Jan 2012 02:38:31 +0000"  >&lt;p&gt;Hi Shuichi, this quite possibly is the same issue as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt;. Since there is a fix for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt;, could you verify that it works?&lt;/p&gt;</comment>
                            <comment id="26333" author="laisiyao" created="Wed, 11 Jan 2012 04:11:27 +0000"  >&lt;p&gt;If this still happens, it may be a chain reaction triggered by slow disk I/O:&lt;br/&gt;
  slow disk I/O --&amp;gt; slow journal --&amp;gt; slow memory reclamation --&amp;gt; slow memory allocation --&amp;gt; all operations hung&lt;/p&gt;

&lt;p&gt;In that situation, could you try with the read cache disabled?&lt;/p&gt;</comment>
                            <comment id="26347" author="ihara" created="Wed, 11 Jan 2012 08:39:01 +0000"  >&lt;p&gt;Hi Lai,&lt;/p&gt;

&lt;p&gt;We applied the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt; and have not seen the issue so far.&lt;br/&gt;
I&apos;m attaching the newer log messages from the OSS. You can see the time of the final reboot (after the patch was applied) in the excerpt from the attached log file below.&lt;/p&gt;

&lt;p&gt;Jan  9 14:17:01 t2s007053 kernel: BIOS-provided physical RAM map:&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 0000000000010000 - 000000000009f400 (usable)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 0000000000100000 - 00000000df62f000 (usable)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000df62f000 - 00000000df63c000 (ACPI data)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000df63c000 - 00000000df63d000 (usable)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000df63d000 - 00000000e4000000 (reserved)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)&lt;br/&gt;
Jan  9 14:17:01 t2s007053 kernel:  BIOS-e820: 000000010000000&lt;/p&gt;

&lt;p&gt;Could you please have a look at the log file and check whether any concerns remain?&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;
</comment>
                            <comment id="26415" author="laisiyao" created="Thu, 12 Jan 2012 01:23:04 +0000"  >&lt;p&gt;Hmm, the new log shows many operations blocked on IO, but no longer blocked on memory allocation, and in the end the operations do finish.&lt;/p&gt;</comment>
                            <comment id="26617" author="ihara" created="Mon, 16 Jan 2012 06:52:08 +0000"  >
&lt;p&gt;Lai, are there any remaining issues?&lt;br/&gt;
We haven&apos;t seen the issue since we applied the patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;
</comment>
                            <comment id="26621" author="laisiyao" created="Mon, 16 Jan 2012 09:21:00 +0000"  >&lt;p&gt;This is a duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-952&quot; title=&quot;Hung thread with HIGH OSS load&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-952&quot;&gt;&lt;del&gt;LU-952&lt;/del&gt;&lt;/a&gt;, which has been fixed.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="10732" name="t2s007053.messages.gz" size="380131" author="ihara" created="Wed, 11 Jan 2012 08:39:53 +0000"/>
                            <attachment id="10718" name="t2s007059_messages.gz" size="927043" author="ihara" created="Fri, 6 Jan 2012 00:12:08 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvhkn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6496</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>