<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:35:44 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3650] osc_ldlm_blocking_ast() LBUG</title>
                <link>https://jira.whamcloud.com/browse/LU-3650</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We are hitting the LBUG in osc_ldlm_blocking_ast() that says &quot;This should never happen&quot;.  The backtrace looks like:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;PID: 29660  TASK: c00000098d26bab0  CPU: 42  COMMAND: &quot;ldlm_bl_44&quot;
 #0 [c000000dc9f3f7c0] .crash_kexec at c0000000000e5aa4
 #1 [c000000dc9f3f9c0] .panic at c0000000005c4f40
 #2 [c000000dc9f3fa50] .lbug_with_loc at d00000000acd14e0 [libcfs]
 #3 [c000000dc9f3fae0] .osc_ldlm_blocking_ast at d00000000afbe1d8 [osc]
 #4 [c000000dc9f3fbc0] .ldlm_cancel_callback at d00000000bf3148c [ptlrpc]
 #5 [c000000dc9f3fc60] .ldlm_cli_cancel_local at d00000000bf4db78 [ptlrpc]
 #6 [c000000dc9f3fd20] .ldlm_cli_cancel_list_local at d00000000bf4f80c [ptlrpc]
 #7 [c000000dc9f3fe40] .ldlm_bl_thread_main at d00000000bf5c07c [ptlrpc]
 #8 [c000000dc9f3ff90] .kernel_thread at c000000000032fd4
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Both times we have seen this (that I am aware of), the LBUG was preceded shortly before by a kernel page allocation error, so that may be part of the recipe for triggering this bug.&lt;/p&gt;

&lt;p&gt;This is a PPC64 client running lustre &lt;a href=&quot;http://github.com/chaos/lustre/tree/2.4.0-RC1_3chaos&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;2.4.0-RC1_3chaos&lt;/a&gt;.&lt;/p&gt;
</description>
                <environment>Lustre [2.4.0-RC1_3chaos|&lt;a href=&quot;http://github.com/chaos/lustre/tree/2.4.0-RC1_3chaos&quot;&gt;http://github.com/chaos/lustre/tree/2.4.0-RC1_3chaos&lt;/a&gt;], PPC64 client</environment>
        <key id="20029">LU-3650</key>
            <summary>osc_ldlm_blocking_ast() LBUG</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Sat, 27 Jul 2013 00:34:39 +0000</created>
                <updated>Tue, 14 Dec 2021 22:56:09 +0000</updated>
                            <resolved>Tue, 14 Dec 2021 22:56:09 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                    <version>Lustre 2.5.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="63085" author="keith" created="Sat, 27 Jul 2013 01:54:34 +0000"  >&lt;p&gt;Do you have the page allocation error? Do you know which allocations fail right before the crash?&lt;/p&gt;</comment>
                            <comment id="63091" author="pjones" created="Sat, 27 Jul 2013 13:37:38 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Do you have anything else to add?&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="63173" author="morrone" created="Mon, 29 Jul 2013 17:45:02 +0000"  >&lt;p&gt;This was the most recent failure:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-07-26 14:41:40 ib0: failed to allocate receive buffer 75
2013-07-26 14:42:57 ib0: Reap connection for gid fe80:0000:0000:0000:0002:c903:004b:150d
2013-07-26 14:42:57 ib0: neigh free for 000048 fe80:0000:0000:0000:0002:c903:004b:150d
2013-07-26 14:42:57 ib0: Destroy active connection 0x5fa head 0x91b7 tail 0x91b7
2013-07-26 14:43:30 ib0: Request connection 0x5fc for gid fe80:0000:0000:0000:0002:c903:004b:150d qpn 0x48
2013-07-26 14:43:30 ib0: REP received.
2013-07-26 14:43:30 ib0: REQ arrived
2013-07-26 14:47:07 ldlm_bl_44: page allocation failure. order:0, mode:0x50
2013-07-26 14:47:07 Call Trace:
2013-07-26 14:47:07 [c000000dc9f3f2f0] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
2013-07-26 14:47:07 [c000000dc9f3f3a0] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-07-26 14:47:07 [c000000dc9f3f540] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-07-26 14:47:07 [c000000dc9f3f5d0] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-07-26 14:47:07 [c000000dc9f3f6b0] [c0000000001ae31c] .__kmalloc+0x1fc/0x260
2013-07-26 14:47:07 [c000000dc9f3f770] [d00000000acd2724] .cfs_alloc+0x34/0x80 [libcfs]
2013-07-26 14:47:07 [c000000dc9f3f800] [d00000000ba74c14] .lu_context_init+0x84/0x440 [obdclass]
2013-07-26 14:47:07 [c000000dc9f3f8c0] [d00000000ba80fa4] .cl_env_new+0x234/0x510 [obdclass]
2013-07-26 14:47:07 [c000000dc9f3f9a0] [d00000000ba814c8] .cl_env_get+0x88/0x230 [obdclass]
2013-07-26 14:47:07 [c000000dc9f3fa50] [d00000000ba816fc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-07-26 14:47:07 [c000000dc9f3fae0] [d00000000afbde78] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-07-26 14:47:07 [c000000dc9f3fbc0] [d00000000bf3148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc]
2013-07-26 14:47:07 [c000000dc9f3fc60] [d00000000bf4db78] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc]
2013-07-26 14:47:07 [c000000dc9f3fd20] [d00000000bf4f80c] .ldlm_cli_cancel_list_local+0x14c/0x410 [ptlrpc]
2013-07-26 14:47:07 [c000000dc9f3fe40] [d00000000bf5c07c] .ldlm_bl_thread_main+0x19c/0x610 [ptlrpc]
2013-07-26 14:47:07 [c000000dc9f3ff90] [c000000000032fd4] .kernel_thread+0x54/0x70
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It looks like a hit from a while back was preceded by several overlapping allocation failures, so it&apos;s a little harder to read:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-05-17 10:35:01 ldlm_bl_145: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 [c00000032b1db180] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
2013-05-17 10:35:01 [c00000032b1db230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-05-17 10:35:01 [c00000032b1db3d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c00000032b1db460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-05-17 10:35:01 [c00000032b1db540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 [c00000032b1db5f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs]
2013-05-17 10:35:01 [c00000032b1db670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc]
2013-05-17 10:35:01 [c00000032b1db730] [d00000000b8534b0] .keys_fill+0xd0/0x280 [obdclass]
2013-05-17 10:35:01 ldlm_bl_08: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 ldlm_bl_121: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c00000032b1db800] [d00000000b857dc8] .lu_context_init+0x108/0x440 [obdclass]
2013-05-17 10:35:01 [c00000032b1db8c0] [d00000000b8638a4] .cl_env_new+0x234/0x510 [obdclass]
2013-05-17 10:35:01 ldlm_bl_01: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 ldlm_bl_32: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 [c000000dd2903180] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
2013-05-17 10:35:01 [c000000dd2903230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-05-17 10:35:01 [c000000c1d9ef180] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
2013-05-17 10:35:01 [c00000032b1db9a0] [d00000000b863dc8] .cl_env_get+0x88/0x230 [obdclass]
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 [c000000c1d9ef230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-05-17 10:35:01 [c000000f4e87f180] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable)
2013-05-17 10:35:01 [c000000f4a17f180] [c000000000012f04] .show_stack+0x74/0x1c0[c000000dd29033d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c000000dd2903460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-05-17 10:35:01 [c000000dd2903540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 [c000000dd29035f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs] (unreliable)
2013-05-17 10:35:01 [c000000f4e87f230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-05-17 10:35:01 [c000000f4e87f3d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c000000f4e87f460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-05-17 10:35:01 [c000000f4e87f540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 [c000000c1d9ef3d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c000000c1d9ef460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-05-17 10:35:01 [c000000c1d9ef540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 [c000000c1d9ef5f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs][c00000032b1dba50] [d00000000b863ffc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-05-17 10:35:01 ldlm_bl_38: page allocation failure. order:0, mode:0x50
2013-05-17 10:35:01 
2013-05-17 10:35:01 Call Trace:
2013-05-17 10:35:01 [c000000c1d9ef670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc][c000000ddb6d7180] [c000000000012f04] .show_stack+0x74/0x1c0
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000f4e87f5f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs]
2013-05-17 10:35:01 [c000000f4e87f670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc]
2013-05-17 10:35:01 [c00000032b1dbae0] [d00000000c0bcaf8] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-05-17 10:35:01 [c00000032b1dbbc0] [d00000000bd9148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc]
2013-05-17 10:35:01 
2013-05-17 10:35:01  (unreliable)
2013-05-17 10:35:01 [c000000f4e87f730] [d00000000b8534b0] .keys_fill+0xd0/0x280 [obdclass][c000000ddb6d7230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9ef730] [d00000000b8534b0] .keys_fill+0xd0/0x280 [obdclass][c000000f4a17f230] [c0000000001669e0] .__alloc_pages_nodemask+0x640/0x8c0[c00000032b1dbc60] [d00000000bdadd48] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc]
2013-05-17 10:35:01 [c000000f4e87f800] [d00000000b857dc8] .lu_context_init+0x108/0x440 [obdclass][c00000032b1dbd20] [d00000000bdaf9dc] .ldlm_cli_cancel_list_local+0x14c/0x410 [ptlrpc]
2013-05-17 10:35:01 
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d73d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c000000ddb6d7460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310[c000000f4a17f3d0] [c0000000001abf9c] .kmem_getpages+0x7c/0x1a0
2013-05-17 10:35:01 [c000000dd2903670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc][c000000f4a17f460] [c0000000001ad3a4] .fallback_alloc+0x244/0x310
2013-05-17 10:35:01 
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000dd2903730] [d00000000b8534b0] .keys_fill+0xd0/0x280 [obdclass]
2013-05-17 10:35:01 [c000000dd2903800] [d00000000b857dc8] .lu_context_init+0x108/0x440 [obdclass][c000000f4e87f8c0] [d00000000b8638a4] .cl_env_new+0x234/0x510 [obdclass]
2013-05-17 10:35:01 [c000000dd29038c0] [d00000000b8638a4] .cl_env_new+0x234/0x510 [obdclass]
2013-05-17 10:35:01 

2013-05-17 10:35:01 [c000000c1d9ef800] [d00000000b857dc8] .lu_context_init+0x108/0x440 [obdclass][c000000ddb6d7540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9ef8c0] [d00000000b8638a4] .cl_env_new+0x234/0x510 [obdclass][c000000ddb6d75f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9ef9a0] [d00000000b863dc8] .cl_env_get+0x88/0x230 [obdclass]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d7670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc][c000000c1d9efa50] [d00000000b863ffc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-05-17 10:35:01 [c00000032b1dbe40] [d00000000bdbc24c] .ldlm_bl_thread_main+0x19c/0x610 [ptlrpc]
2013-05-17 10:35:01 [c000000dd29039a0] [d00000000b863dc8] .cl_env_get+0x88/0x230 [obdclass][c000000ddb6d7730] [d00000000b8534b0] .keys_fill+0xd0/0x280 [obdclass]
2013-05-17 10:35:01 [c000000f4e87f9a0] [d00000000b863dc8] .cl_env_get+0x88/0x230 [obdclass][c000000dd2903a50] [d00000000b863ffc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-05-17 10:35:01 [c000000f4a17f540] [c0000000001af1cc] .kmem_cache_alloc+0x18c/0x220
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c00000032b1dbf90] [c000000000032fd4] .kernel_thread+0x54/0x70
2013-05-17 10:35:01 [c000000dd2903ae0] [d00000000c0bcaf8] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-05-17 10:35:01 [c000000dd2903bc0] [d00000000bd9148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc][c000000f4e87fa50] [d00000000b863ffc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d7800] [d00000000b857dc8] .lu_context_init+0x108/0x440 [obdclass][c000000c1d9efae0] [d00000000c0bcaf8] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d78c0] [d00000000b8638a4] .cl_env_new+0x234/0x510 [obdclass][c000000c1d9efbc0] [d00000000bd9148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d79a0] [d00000000b863dc8] .cl_env_get+0x88/0x230 [obdclass][c000000c1d9efc60] [d00000000bdadd48] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc]
2013-05-17 10:35:01 
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d7a50] [d00000000b863ffc] .cl_env_nested_get+0x8c/0xf0 [obdclass]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9efd20] [d00000000bdaf9dc] .ldlm_cli_cancel_list_local+0x14c/0x410 [ptlrpc][c000000ddb6d7ae0] [d00000000c0bcaf8] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-05-17 10:35:01 Mem-Info:
2013-05-17 10:35:01 [c000000dd2903c60] [d00000000bdadd48] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc][c000000f4a17f5f0] [d00000000aa72588] .cfs_mem_cache_alloc+0x28/0x40 [libcfs]
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9efe40] [d00000000bdbc24c] .ldlm_bl_thread_main+0x19c/0x610 [ptlrpc]
2013-05-17 10:35:01 
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000c1d9eff90] [c000000000032fd4] .kernel_thread+0x54/0x70[c000000f4e87fae0] [d00000000c0bcaf8] .osc_ldlm_blocking_ast+0x58/0x4f0 [osc]
2013-05-17 10:35:01 Node 0 DMA per-cpu:
2013-05-17 10:35:01 
2013-05-17 10:35:01 [c000000ddb6d7bc0] [d00000000bd9148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc][c000000f4a17f670] [d00000000c0b2cec] .osc_session_init+0x5c/0x350 [osc]
2013-05-17 10:35:01 [c000000f4e87fbc0] [d00000000bd9148c] .ldlm_cancel_callback+0x9c/0x310 [ptlrpc]CPU    0: hi:    6, btch:   1 usd:   4
2013-05-17 10:35:01 CPU    1: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    2: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    3: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    4: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    5: hi:    6, btch:   1 usd:   3
2013-05-17 10:35:01 CPU    6: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    7: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU    8: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 Mem-Info:
2013-05-17 10:35:01 
2013-05-17 10:35:01 Node 0 
2013-05-17 10:35:01 [c000000dd2903d20] [d00000000bdaf9dc] .ldlm_cli_cancel_list_local+0x14c/0x410 [ptlrpc]CPU    9: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 CPU   10: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 [c000000f4e87fc60] [d00000000bdadd48] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc]DMA per-cpu:
2013-05-17 10:35:01 [c000000ddb6d7c60] [d00000000bdadd48] .ldlm_cli_cancel_local+0xd8/0x7d0 [ptlrpc]CPU    0: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 
2013-05-17 10:35:01 CPU    1: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 [c000000ddb6d7d20] [d00000000bdaf9dc] .ldlm_cli_cancel_list_local+0x14c/0x410 [ptlrpc]CPU    2: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 
2013-05-17 10:35:01 CPU    3: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 [c000000ddb6d7e40] [d00000000bdbc24c] .ldlm_bl_thread_main+0x19c/0x610 [ptlrpc]CPU    4: hi:    6, btch:   1 usd:   5
2013-05-17 10:35:01 
2013-05-17 10:35:01 CPU    5: hi:    6, btch:   1 usd:   3
2013-05-17 10:35:01 [c000000ddb6d7f90] [c000000000032fd4] .kernel_thread+0x54/0x70CPU    6: hi:    6, btch:   1 usd:   5
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="63218" author="laisiyao" created="Tue, 30 Jul 2013 10:20:43 +0000"  >&lt;p&gt;IMO it&apos;s best to prealloc the env for the ldlm blocking threads; I&apos;ll cook a patch later.&lt;/p&gt;</comment>
                            <comment id="63352" author="laisiyao" created="Wed, 31 Jul 2013 05:20:52 +0000"  >&lt;p&gt;Patch is on &lt;a href=&quot;http://review.whamcloud.com/#/c/7189/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/7189/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="63451" author="laisiyao" created="Thu, 1 Aug 2013 09:38:33 +0000"  >&lt;p&gt;This looks to be more complicated than originally thought: osc_ldlm_blocking_ast can be called in several different contexts:&lt;br/&gt;
1. cancel lock in ldlm_callback_handler, because it failed to hand the blocking AST off to the ldlm bl threads&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;    ptlrpc_main -&amp;gt;
        ldlm_callback_handler -&amp;gt;
            l_blocking_ast
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;2. cancel lock in the context of the ldlm bl threads. This is the case that happened in this ticket.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;    ldlm_bl_thread_main -&amp;gt;
        l_blocking_ast
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;3. cancel lock voluntarily; this is simple and fine&lt;br/&gt;
4. same as 3, but it may be called in the context of an outer IO&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;    cl_enqueue -&amp;gt;
        osc_enqueue_base -&amp;gt;
            ldlm_prep_elc_req -&amp;gt;
                l_blocking_ast
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The original patch only fixes case 2, and there doesn&apos;t look to be a simple way to fix cases 1 and 4, so we may have to get rid of the LBUG for env allocation failure and return an error instead.&lt;/p&gt;</comment>
                            <comment id="63695" author="laisiyao" created="Tue, 6 Aug 2013 05:16:10 +0000"  >&lt;p&gt;On second thought, for case 4 this is an extra ELC (the ELC count is too big to pack in the request, and the remaining locks should be canceled in a separate request), so failing to do it won&apos;t cause any trouble.&lt;/p&gt;

&lt;p&gt;And for case 1, the blocking ast is generally handed off to the ldlm bl threads, and only in rare cases (memory allocation failure) will that fail. Upon such a failure we don&apos;t need to guarantee blocking ast success, and the system will recover to a normal state.&lt;/p&gt;

&lt;p&gt;So in the latest patch we simply let the blocking ast fail upon env get failure instead of LBUGing.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 26 Jun 2014 00:34:39 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvwc7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9393</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 29 Jul 2013 00:34:39 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>