<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:22:56 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2167] Race in ptlrpcd_stop/fini vs ptlrpcd leading to use after free</title>
                <link>https://jira.whamcloud.com/browse/LU-2167</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I hit this oops on my test system several times. At first it seemed to make no sense, but after looking at the code I see how it can happen.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[11898.846141] BUG: unable to handle kernel paging request at ffff88008fb39800
[11898.846670] IP: [&amp;lt;ffffffffa1368c93&amp;gt;] ptlrpcd+0x2c3/0x3a0 [ptlrpc]
[11898.847291] PGD 1a26063 PUD 501067 PMD 57f067 PTE 800000008fb39160
[11898.847789] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[11898.848237] last sysfs file: /sys/devices/system/cpu/possible
[11898.848678] CPU 2 
[11898.848743] Modules linked in: ptlrpc(-) obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: fld]
[11898.850099] 
[11898.850099] Pid: 30737, comm: ptlrpcd_rcv Not tainted 2.6.32-debug #6 Bochs Bochs
[11898.850099] RIP: 0010:[&amp;lt;ffffffffa1368c93&amp;gt;]  [&amp;lt;ffffffffa1368c93&amp;gt;] ptlrpcd+0x2c3/0x3a0 [ptlrpc]
[11898.850099] RSP: 0018:ffff880088529e70  EFLAGS: 00010286
[11898.850099] RAX: 0000000000000002 RBX: ffff88008fb39800 RCX: 0000000000000001
[11898.850099] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000282
[11898.850099] RBP: ffff880088529f40 R08: 0000000000000000 R09: 0000000000000001
[11898.850099] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880088529ea0
[11898.850099] R13: ffff880083785f30 R14: 0000000000000002 R15: ffff880088529ee0
[11898.850099] FS:  00007f449616c700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
[11898.850099] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11898.850099] CR2: ffff88008fb39800 CR3: 00000000b3178000 CR4: 00000000000006e0
[11898.850099] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11898.850099] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11898.850099] Process ptlrpcd_rcv (pid: 30737, threadinfo ffff880088528000, task ffff880070208500)
[11898.850099] Stack:
[11898.850099]  ffff880070208500 ffffffffffffffff ffff880083785f40 0000000000000000
[11898.850099] &amp;lt;d&amp;gt; ffff880070208500 0000000000000000 00000004b0000008 0000000000000000
[11898.850099] &amp;lt;d&amp;gt; 0000000000000000 ffff880088529eb8 ffff880088529eb8 0000000000000054
[11898.850099] Call Trace:
[11898.850099]  [&amp;lt;ffffffff81057d60&amp;gt;] ? default_wake_function+0x0/0x20
[11898.850099]  [&amp;lt;ffffffffa13689d0&amp;gt;] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[11898.850099]  [&amp;lt;ffffffff8100c14a&amp;gt;] child_rip+0xa/0x20
[11898.850099]  [&amp;lt;ffffffffa13689d0&amp;gt;] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[11898.850099]  [&amp;lt;ffffffffa13689d0&amp;gt;] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[11898.850099]  [&amp;lt;ffffffff8100c140&amp;gt;] ? child_rip+0x0/0x20
[11898.850099] Code: e9 ba fe ff ff 0f 1f 00 49 8d 45 40 49 39 45 40 74 08 4c 89 ef e8 8e f6 fc ff 4c 89 e7 e8 46 1f ea ff 48 8d 7b 50 e8 8d 91 ce df &amp;lt;f0&amp;gt; 80 23 fd f0 80 23 fb f0 80 23 ef f0 80 63 02 fe e9 17 fe ff 
[11898.850099] RIP  [&amp;lt;ffffffffa1368c93&amp;gt;] ptlrpcd+0x2c3/0x3a0 [ptlrpc]
[11898.850099]  RSP &amp;lt;ffff880088529e70&amp;gt;
[11898.850099] CR2: ffff88008fb39800
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Looking at the code, in ptlrpcd() we see:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!cfs_list_empty(&amp;amp;set-&amp;gt;set_requests))
                ptlrpc_set_wait(set);
        lu_context_fini(&amp;amp;env.le_ctx);
        cfs_complete(&amp;amp;pc-&amp;gt;pc_finishing);

        cfs_clear_bit(LIOD_START, &amp;amp;pc-&amp;gt;pc_flags);  /* &amp;lt;=== crash here: pc may already be freed */
        cfs_clear_bit(LIOD_STOP, &amp;amp;pc-&amp;gt;pc_flags);
        cfs_clear_bit(LIOD_FORCE, &amp;amp;pc-&amp;gt;pc_flags);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And in ptlrpcd_stop() and ptlrpcd_fini():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;{
...
#ifdef __KERNEL__
        cfs_wait_for_completion(&amp;amp;pc-&amp;gt;pc_finishing);
#&lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt;
        liblustre_deregister_wait_callback(pc-&amp;gt;pc_wait_callback);
        liblustre_deregister_idle_callback(pc-&amp;gt;pc_idle_callback);
#endif
        lu_context_fini(&amp;amp;pc-&amp;gt;pc_env.le_ctx);
...
        EXIT;
}
&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; void ptlrpcd_fini(void)
{
        &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; i;
        ENTRY;

        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ptlrpcds != NULL) {
                &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; (i = 0; i &amp;lt; ptlrpcds-&amp;gt;pd_nthreads; i++)
                        ptlrpcd_stop(&amp;amp;ptlrpcds-&amp;gt;pd_threads[i], 0);
                ptlrpcd_stop(&amp;amp;ptlrpcds-&amp;gt;pd_thread_rcv, 0);
                OBD_FREE(ptlrpcds, ptlrpcds-&amp;gt;pd_size);  /* &amp;lt;=== lost race: frees pc while ptlrpcd() still uses it */
                ptlrpcds = NULL;
        }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

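&lt;p&gt;Once ptlrpcd() signals pc_finishing, ptlrpcd_stop() can return and ptlrpcd_fini() can free ptlrpcds, which embeds the pc structures, before ptlrpcd() reaches its cfs_clear_bit() calls; those calls then write to freed memory. I think the proper fix is to move the cfs_clear_bit() calls before the completion is signalled.&lt;/p&gt;</description>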
                <environment></environment>
        <key id="16349">LU-2167</key>
            <summary>Race in ptlrpcd_stop/fini vs ptlrpcd leading to use after free</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Sat, 13 Oct 2012 12:46:14 +0000</created>
                <updated>Fri, 19 Apr 2013 20:26:19 +0000</updated>
                            <resolved>Tue, 23 Oct 2012 19:12:56 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                    <fixVersion>Lustre 2.4.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                            <comments>
                            <comment id="46511" author="green" created="Sat, 13 Oct 2012 12:53:26 +0000"  >&lt;p&gt;Suggested patch in &lt;a href=&quot;http://review.whamcloud.com/4264&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4264&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="46866" author="green" created="Tue, 23 Oct 2012 19:12:56 +0000"  >&lt;p&gt;Patch landed; resolving.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzva9j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5198</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>