<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:15:59 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1363] SELinux and stateahead hang</title>
                <link>https://jira.whamcloud.com/browse/LU-1363</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;On a node with SELinux turned on, stateahead may deadlock and will quickly drive to a hang.&lt;/p&gt;

&lt;p&gt;Here some traces : a user issue a du that will start stateahead&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; bt 60718
PID: 60718  TASK: ffff8817e6b337d0  CPU: 3   COMMAND: &quot;du&quot;
 #0 [ffff8817df1db938] schedule at ffffffff8147dddc
 #1 [ffff8817df1dba00] cfs_waitq_wait at ffffffffa03ed75e [libcfs]
 #2 [ffff8817df1dba10] do_statahead_enter at ffffffffa0877794 [lustre]
 #3 [ffff8817df1dbb20] ll_lookup_it at ffffffffa0862c75 [lustre]
 #4 [ffff8817df1dbc30] ll_lookup_nd at ffffffffa0862e58 [lustre]
 #5 [ffff8817df1dbc90] do_lookup at ffffffff8116bb0b
 #6 [ffff8817df1dbcf0] __link_path_walk at ffffffff8116c109
 #7 [ffff8817df1dbd70] path_walk at ffffffff8116ca8a
 #8 [ffff8817df1dbdb0] do_path_lookup at ffffffff8116cc5b
 #9 [ffff8817df1dbde0] user_path_at at ffffffff8116cde7
#10 [ffff8817df1dbeb0] vfs_fstatat at ffffffff811630ac
#11 [ffff8817df1dbef0] sys_newfstatat at ffffffff81163114
#12 [ffff8817df1dbf80] system_call_fastpath at ffffffff81003172
    RIP: 0000003b564d7eda  RSP: 00007fff49f48258  RFLAGS: 00010206
    RAX: 0000000000000106  RBX: ffffffff81003172  RCX: 0000000000000000
    RDX: 00000000015d12b0  RSI: 00000000015d1340  RDI: 0000000000000005
    RBP: 00000000015d12b0   R8: 0000000000000100   R9: 0000000000000000
    R10: 0000000000000100  R11: 0000000000000246  R12: 00000000015c6930
    R13: 00000000015d1240  R14: 0000000000000000  R15: 00000000015c6930
    ORIG_RAX: 0000000000000106  CS: 0033  SS: 002b

PID: 60781  TASK: ffff88082f71b7d0  CPU: 7   COMMAND: &quot;ll_sa_60718&quot;
 #0 [ffff88082f713d10] schedule at ffffffff8147dddc
 #1 [ffff88082f713dd8] cfs_waitq_wait at ffffffffa03ed75e [libcfs]
 #2 [ffff88082f713de8] ll_statahead_thread at ffffffffa087b52b [lustre]
 #3 [ffff88082f713f48] kernel_thread at ffffffff810041aa

PID: 60802  TASK: ffff8807a2576040  CPU: 23  COMMAND: &quot;ll_sa_60718&quot;
 #0 [ffff8807f89ff580] schedule at ffffffff8147dddc
 #1 [ffff8807f89ff648] schedule_timeout at ffffffff8147eab0
 #2 [ffff8807f89ff6f8] cfs_waitq_timedwait at ffffffffa03ed731 [libcfs]
 #3 [ffff8807f89ff708] ptlrpc_set_wait at ffffffffa05c3788 [ptlrpc]
 #4 [ffff8807f89ff7e8] ptlrpc_queue_wait at ffffffffa05c3ce6 [ptlrpc]
 #5 [ffff8807f89ff838] mdc_xattr_common at ffffffffa074795c [mdc]
 #6 [ffff8807f89ff8d8] mdc_getxattr at ffffffffa0747ea7 [mdc]
 #7 [ffff8807f89ff928] lmv_getxattr at ffffffffa08e3a1b [lmv]
 #8 [ffff8807f89ff9b8] ll_getxattr_common at ffffffffa086a956 [lustre]
 #9 [ffff8807f89ffad8] ll_getxattr at ffffffffa086bd62 [lustre]
#10 [ffff8807f89ffb88] inode_doinit_with_dentry at ffffffff8120b2e0
#11 [ffff8807f89ffbf8] selinux_d_instantiate at ffffffff8120b83c
#12 [ffff8807f89ffc08] security_d_instantiate at ffffffff811fe57b
#13 [ffff8807f89ffc18] ll_d_add at ffffffffa0860a15 [lustre]
#14 [ffff8807f89ffc78] ll_lookup_it_finish at ffffffffa0861dff [lustre]
#15 [ffff8807f89ffd38] do_statahead_interpret at ffffffffa08771d2 [lustre]
#16 [ffff8807f89ffde8] ll_statahead_thread at ffffffffa087b2d2 [lustre]
#17 [ffff8807f89fff48] kernel_thread at ffffffff810041aa
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;Note that stateahead is now awaiting for an RPC to complete and did schedule on CPU 23.&lt;br/&gt;
My understanding is that, at this point, it held dcache_lock that has been acquired in ll_lookup_it_finish, just before calling ll_d_add&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;                spin_lock(&amp;amp;dcache_lock);
                ll_d_add(*de, NULL);
                spin_unlock(&amp;amp;dcache_lock);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;I haven&apos;t found a place where this lock might have been released.&lt;/p&gt;

&lt;p&gt;During this time, a new thread is scheduled on the same cpu and start spinlocking on ll_lookup_lock (which is just acquired before dcache_lock in many code path)&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; bt 49060
PID: 49060  TASK: ffff88080736d100  CPU: 23  COMMAND: &quot;ldlm_bl_70&quot;
 #0 [ffff88189c547e90] crash_nmi_callback at ffffffff8101fd06
 #1 [ffff88189c547ea0] notifier_call_chain at ffffffff814837f5
 #2 [ffff88189c547ee0] atomic_notifier_call_chain at ffffffff8148385a
 #3 [ffff88189c547ef0] notify_die at ffffffff8108026e
 #4 [ffff88189c547f20] do_nmi at ffffffff81481443
 #5 [ffff88189c547f50] nmi at ffffffff81480d50
    [exception RIP: _spin_lock+33]
    RIP: ffffffff81480631  RSP: ffff8806e185bc00  RFLAGS: 00000293
    RAX: 00000000000041ce  RBX: ffff8818205c6238  RCX: 0000000000000000
    RDX: 00000000000041c0  RSI: 0000000000000000  RDI: ffffffffa089d100
    RBP: ffff8806e185bc00   R8: 000000020817392b   R9: 000000000000037e
    R10: df80000000000000  R11: a5236b391d08037e  R12: ffff88185e51d240
    R13: 0000000000000003  R14: ffff8818205c6100  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- &amp;lt;NMI exception stack&amp;gt; ---
 #6 [ffff8806e185bc00] _spin_lock at ffffffff81480631
 #7 [ffff8806e185bc08] ll_md_blocking_ast at ffffffffa086049f [lustre]
 #8 [ffff8806e185bc98] ldlm_cancel_callback at ffffffffa0585d33 [ptlrpc]
 #9 [ffff8806e185bcb8] ldlm_cli_cancel_local at ffffffffa059feb1 [ptlrpc]
#10 [ffff8806e185bd18] ldlm_cli_cancel at ffffffffa05a42e8 [ptlrpc]
#11 [ffff8806e185bd98] ll_md_blocking_ast at ffffffffa0860085 [lustre]
#12 [ffff8806e185be28] ldlm_handle_bl_callback at ffffffffa05a76ed [ptlrpc]
#13 [ffff8806e185be88] ldlm_bl_thread_main at ffffffffa05a7b41 [ptlrpc]
#14 [ffff8806e185bf48] kernel_thread at ffffffff810041aa
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;Quickly the system will hang with all threads spinlocking&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; ps | egrep &quot;^&amp;gt;&quot;
&amp;gt;  9226      1  31  ffff88187beaf080  RU   0.0  199500   5332  snmpd
&amp;gt;  9286      1  15  ffff88087b050040  RU   0.0   40948   1308  nrpe
&amp;gt; 36612      1  10  ffff88187b8ad850  RU   0.0   64940   1096  sshd
&amp;gt; 49038      2   4  ffff88187b783100  RU   0.0       0      0  [ldlm_bl_48]
&amp;gt; 49039      2  13  ffff88070cea6080  RU   0.0       0      0  [ldlm_bl_49]
&amp;gt; 49042      2   3  ffff880a70d5f810  RU   0.0       0      0  [ldlm_bl_52]
&amp;gt; 49043      2  17  ffff8820612d2040  RU   0.0       0      0  [ldlm_bl_53]
&amp;gt; 49044      2   1  ffff881861201810  RU   0.0       0      0  [ldlm_bl_54]
&amp;gt; 49045      2  11  ffff880ac92cc040  RU   0.0       0      0  [ldlm_bl_55]
&amp;gt; 49046      2  21  ffff881f4cf670c0  RU   0.0       0      0  [ldlm_bl_56]
&amp;gt; 49047      2   9  ffff88074ef5d810  RU   0.0       0      0  [ldlm_bl_57]
&amp;gt; 49048      2  25  ffff880a55860850  RU   0.0       0      0  [ldlm_bl_58]
&amp;gt; 49049      2   0  ffff8818225f1850  RU   0.0       0      0  [ldlm_bl_59]
&amp;gt; 49050      2  29  ffff88074ef5d0c0  RU   0.0       0      0  [ldlm_bl_60]
&amp;gt; 49051      2  27  ffff880a6e2a0850  RU   0.0       0      0  [ldlm_bl_61]
&amp;gt; 49052      2   2  ffff881747c447d0  RU   0.0       0      0  [ldlm_bl_62]
&amp;gt; 49053      2   5  ffff8817e68fb100  RU   0.0       0      0  [ldlm_bl_63]
&amp;gt; 49054      2   7  ffff880a6f8e47d0  RU   0.0       0      0  [ldlm_bl_64]
&amp;gt; 49055      2  12  ffff881f3bf72080  RU   0.0       0      0  [ldlm_bl_65]
&amp;gt; 49056      2   8  ffff88080736d850  RU   0.0       0      0  [ldlm_bl_66]
&amp;gt; 49057      2  30  ffff880fb0e447d0  RU   0.0       0      0  [ldlm_bl_67]
&amp;gt; 49058      2  26  ffff8817e8dcb040  RU   0.0       0      0  [ldlm_bl_68]
&amp;gt; 49059      2  14  ffff8817d831b790  RU   0.0       0      0  [ldlm_bl_69]
&amp;gt; 49060      2  23  ffff88080736d100  RU   0.0       0      0  [ldlm_bl_70]
&amp;gt; 49061      2  22  ffff881037704100  RU   0.0       0      0  [ldlm_bl_71]
&amp;gt; 49062      2  19  ffff8817d79687d0  RU   0.0       0      0  [ldlm_bl_72]
&amp;gt; 49066      2  20  ffff88074db99040  RU   0.0       0      0  [ldlm_bl_76]
&amp;gt; 49067      2  28  ffff88103779a0c0  RU   0.0       0      0  [ldlm_bl_77]
&amp;gt; 49068      2  24  ffff8817de5b5810  RU   0.0       0      0  [ldlm_bl_78]
&amp;gt; 60897  60840   6  ffff8807423e5790  RU   0.0    2412    148  expr
&amp;gt; 77040      2  16  ffff881051753810  RU   0.0       0      0  [ldlm_bl_03]
&amp;gt; 77139      2  18  ffff88207bf2e850  RU   0.0       0      0  [ldlm_bl_21]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Thanks&lt;/p&gt;</description>
                <environment></environment>
        <key id="14243">LU-1363</key>
            <summary>SELinux and stateahead hang</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="louveta">Alexandre Louvet</reporter>
                        <labels>
                            <label>paj</label>
                    </labels>
                <created>Thu, 3 May 2012 02:17:19 +0000</created>
                <updated>Sat, 22 Dec 2012 10:33:44 +0000</updated>
                            <resolved>Mon, 4 Jun 2012 05:07:49 +0000</resolved>
                                    <version>Lustre 2.1.1</version>
                                    <fixVersion>Lustre 2.3.0</fixVersion>
                    <fixVersion>Lustre 2.1.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="38079" author="pjones" created="Thu, 3 May 2012 08:07:37 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="38089" author="bobijam" created="Thu, 3 May 2012 11:30:08 +0000"  >&lt;p&gt;security_d_instantiate() could allocate memory and/or yield cpu, should not held spin locks. &lt;/p&gt;

&lt;p&gt;patch tracking at &lt;a href=&quot;http://review.whamcloud.com/2641&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/2641&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="39908" author="pjones" created="Mon, 4 Jun 2012 05:07:49 +0000"  >&lt;p&gt;Landed for 2.3&lt;/p&gt;</comment>
                            <comment id="47698" author="bogl" created="Mon, 12 Nov 2012 15:21:17 +0000"  >&lt;p&gt;back port to b2_1: &lt;a href=&quot;http://review.whamcloud.com/4517&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4517&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv6mf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4607</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>