<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:28:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2812] parallel-scale test_compilebench hung: task ldlm_poold:16196 blocked for more than 120 seconds</title>
                <link>https://jira.whamcloud.com/browse/LU-2812</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The async journal commit and cancel-lock-before-replay features are disabled by default on the Lustre b1_8 branch. With both features enabled, the parallel-scale compilebench test hung as follows:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== parallel-scale test compilebench: compilebench == 09:24:53 (1360776293)
OPTIONS:
cbench_DIR=/usr/bin
cbench_IDIRS=2
cbench_RUNS=2
client-24vm1
client-24vm2.lab.whamcloud.com
./compilebench -D /mnt/lustre/d0.compilebench -i 2         -r 2 --makej
using working directory /mnt/lustre/d0.compilebench, 2 intial dirs 2 runs
native unpatched native-0 222MB in 522.93 seconds (0.43 MB/s)
native patched native-0 109MB in 424.16 seconds (0.26 MB/s)
native patched compiled native-0 691MB in 69.30 seconds (9.98 MB/s)
create dir kernel-0 222MB in 1110.55 seconds (0.20 MB/s)
create dir kernel-1 222MB in 3047.05 seconds (0.07 MB/s)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The console log on the client node showed:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;09:24:59:Lustre: DEBUG MARKER: == parallel-scale test compilebench: compilebench == 09:24:53 (1360776293)
09:24:59:Lustre: DEBUG MARKER: /usr/sbin/lctl mark .\/compilebench -D \/mnt\/lustre\/d0.compilebench -i 2         -r 2 --makej
09:24:59:Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 2 -r 2 --makej
09:33:01:INFO: task ldlm_poold:16196 blocked for more than 120 seconds.
09:33:01:&quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
09:33:01:ldlm_poold    D 0000000000000000     0 16196      2 0x00000080
09:33:01: ffff88000e4bd9b0 0000000000000046 ffff88000e4bd960 ffffffff810097cc
09:33:01: ffff8800117460b8 0000000000000000 00000000004bd970 ffff880002214200
09:33:01: ffff880037871058 ffff88000e4bdfd8 000000000000fb88 ffff880037871058
09:33:01:Call Trace:
09:33:01: [&amp;lt;ffffffff810097cc&amp;gt;] ? __switch_to+0x1ac/0x320
09:33:01: [&amp;lt;ffffffff814e9c50&amp;gt;] ? thread_return+0x4e/0x76e
09:33:01: [&amp;lt;ffffffff814eaac5&amp;gt;] schedule_timeout+0x215/0x2e0
09:33:01: [&amp;lt;ffffffff8105f8ac&amp;gt;] ? try_to_wake_up+0x24c/0x3e0
09:33:01: [&amp;lt;ffffffff814ea743&amp;gt;] wait_for_common+0x123/0x180
09:33:01: [&amp;lt;ffffffff8105fa40&amp;gt;] ? default_wake_function+0x0/0x20
09:33:01: [&amp;lt;ffffffff814ea85d&amp;gt;] wait_for_completion+0x1d/0x20
09:33:01: [&amp;lt;ffffffffa04dddcd&amp;gt;] __ldlm_bl_to_thread+0x19d/0x1b0 [ptlrpc]
09:33:01: [&amp;lt;ffffffffa04d672b&amp;gt;] ? ldlm_cli_cancel_local+0xab/0x350 [ptlrpc]
09:33:01: [&amp;lt;ffffffffa04e35b9&amp;gt;] ldlm_bl_to_thread+0x379/0x5f0 [ptlrpc]
09:33:01: [&amp;lt;ffffffffa04d88e1&amp;gt;] ? ldlm_cancel_list+0xf1/0x240 [ptlrpc]
09:33:01: [&amp;lt;ffffffffa04e384e&amp;gt;] ldlm_bl_to_thread_list+0x1e/0xa0 [ptlrpc]
09:33:01: [&amp;lt;ffffffffa04d999a&amp;gt;] ldlm_cancel_lru+0x7a/0x1f0 [ptlrpc]
09:33:01: [&amp;lt;ffffffff814e9c50&amp;gt;] ? thread_return+0x4e/0x76e
09:33:01: [&amp;lt;ffffffffa04ea36c&amp;gt;] ldlm_cli_pool_recalc+0x1fc/0x2a0 [ptlrpc]
09:33:01: [&amp;lt;ffffffff8107d4eb&amp;gt;] ? try_to_del_timer_sync+0x7b/0xe0
09:33:02: [&amp;lt;ffffffffa04ea508&amp;gt;] ldlm_pool_recalc+0xf8/0x130 [ptlrpc]
09:33:02: [&amp;lt;ffffffffa04eb0ec&amp;gt;] ldlm_pools_recalc+0x9c/0x2d0 [ptlrpc]
09:33:02: [&amp;lt;ffffffffa04ec714&amp;gt;] ldlm_pools_thread_main+0xb4/0x2f0 [ptlrpc]
09:33:02: [&amp;lt;ffffffff8105fa40&amp;gt;] ? default_wake_function+0x0/0x20
09:33:02: [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
09:33:02: [&amp;lt;ffffffffa04ec660&amp;gt;] ? ldlm_pools_thread_main+0x0/0x2f0 [ptlrpc]
09:33:02: [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Maloo report: &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/83638de2-7667-11e2-bc2f-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/83638de2-7667-11e2-bc2f-52540035b04c&lt;/a&gt;&lt;/p&gt;</description>
                <environment>Lustre Tag: v1_8_9_WC1_RC1&lt;br/&gt;
Lustre Build: &lt;a href=&quot;http://build.whamcloud.com/job/lustre-b1_8/256&quot;&gt;http://build.whamcloud.com/job/lustre-b1_8/256&lt;/a&gt;&lt;br/&gt;
Distro/Arch: RHEL5.9/x86_64(server), RHEL6.3/x86_64(client)&lt;br/&gt;
Network: TCP (1GigE)&lt;br/&gt;
ENABLE_QUOTA=yes&lt;br/&gt;
&lt;br/&gt;
The async journal commit feature and cancel lock before replay feature are enabled:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/1526&quot;&gt;http://review.whamcloud.com/1526&lt;/a&gt;&lt;br/&gt;
&lt;br/&gt;
filter-&amp;gt;fo_syncjournal = 0;&lt;br/&gt;
ldlm_cancel_unused_locks_before_replay = 1;&lt;br/&gt;
</environment>
        <key id="17575">LU-2812</key>
            <summary>parallel-scale test_compilebench hung: task ldlm_poold:16196 blocked for more than 120 seconds</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="yujian">Jian Yu</reporter>
                        <labels>
                    </labels>
                <created>Thu, 14 Feb 2013 07:53:42 +0000</created>
                <updated>Sun, 14 Aug 2016 17:23:02 +0000</updated>
                            <resolved>Sun, 14 Aug 2016 17:23:02 +0000</resolved>
                                    <version>Lustre 1.8.9</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="52362" author="yujian" created="Thu, 14 Feb 2013 07:55:14 +0000"  >&lt;p&gt;It seems this is related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1376&quot; title=&quot;ldlm_poold noise on clients significantly reduces applification performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1376&quot;&gt;LU-1376&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="52381" author="green" created="Thu, 14 Feb 2013 11:48:54 +0000"  >&lt;p&gt;I suspect this is the same problem as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-874&quot; title=&quot;Client eviction on lock callback timeout &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-874&quot;&gt;&lt;del&gt;LU-874&lt;/del&gt;&lt;/a&gt;, with this particular patch aimed at addressing it: &lt;a href=&quot;http://review.whamcloud.com/1900&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/1900&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="161847" author="simmonsja" created="Sun, 14 Aug 2016 17:23:02 +0000"  >&lt;p&gt;Old blocker for unsupported version&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvj4v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6814</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>