<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:54:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5749] osd-zfs: object creation may serialize on lu_site::ls_purge_mutex</title>
                <link>https://jira.whamcloud.com/browse/LU-5749</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5331&quot; title=&quot;qsd_handler.c:1139:qsd_op_adjust()) ASSERTION( qqi ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5331&quot;&gt;&lt;del&gt;LU-5331&lt;/del&gt;&lt;/a&gt; introduced lu_site::ls_purge_mutex to serialize lu_site_purge(). However, in osd-zfs, lu_object_limit() is called whenever a new object is created, and it in turn calls lu_site_purge() if the cache is too big.&lt;/p&gt;

&lt;p&gt;Contention on the mutex can happen when multiple threads are creating objects and the cache is near the lu_cache_nr limit. In &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5747&quot; title=&quot;NULL pointer dereference in task_rq_lock when running mds-survey&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5747&quot;&gt;&lt;del&gt;LU-5747&lt;/del&gt;&lt;/a&gt; I saw stacks like:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; [&amp;lt;ffffffff8106306c&amp;gt;] try_to_wake_up+0x3c/0x3e0
 [&amp;lt;ffffffffa0f0e219&amp;gt;] ? echo_object_free+0x159/0x2f0 [obdecho]
 [&amp;lt;ffffffff81063465&amp;gt;] wake_up_process+0x15/0x20
 [&amp;lt;ffffffff8150f7e4&amp;gt;] __mutex_unlock_slowpath+0x44/0x60
 [&amp;lt;ffffffff8150f79b&amp;gt;] mutex_unlock+0x1b/0x20
 [&amp;lt;ffffffffa07a4907&amp;gt;] lu_site_purge+0x3f7/0x4e0 [obdclass]
 [&amp;lt;ffffffffa07a4e31&amp;gt;] lu_object_limit+0x71/0x80 [obdclass]
 [&amp;lt;ffffffffa07a4f93&amp;gt;] lu_object_find_try+0x153/0x2b0 [obdclass]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;These stacks indicated contention on the mutex, so this may hurt object creation rates on osd-zfs. However, I don&apos;t have any data to support that yet, due to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5747&quot; title=&quot;NULL pointer dereference in task_rq_lock when running mds-survey&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5747&quot;&gt;&lt;del&gt;LU-5747&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="27038">LU-5749</key>
            <summary>osd-zfs: object creation may serialize on lu_site::ls_purge_mutex</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="isaac">Isaac Huang</reporter>
                        <labels>
                            <label>RZ_LS</label>
                            <label>zfs</label>
                    </labels>
                <created>Thu, 16 Oct 2014 03:11:13 +0000</created>
                <updated>Wed, 13 Feb 2019 07:39:24 +0000</updated>
                            <resolved>Wed, 13 Feb 2019 07:39:24 +0000</resolved>
                                    <version>Lustre 2.7.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="96609" author="adilger" created="Fri, 17 Oct 2014 17:56:35 +0000"  >&lt;p&gt;It probably makes sense for lu_site_purge() to use &lt;tt&gt;mutex_trylock()&lt;/tt&gt; and just return immediately if &lt;tt&gt;ls_purge_mutex&lt;/tt&gt; is held and another thread is dropping the cache. This would need a static variable, updated by the thread holding ls_purge_mutex, that indicates whether it is doing a full purge or not. There is no reason for other threads to be blocked if one is already dropping the entire cache. There is also no reason for threads to block when doing a limited cache shrink if another thread is also doing a limited shrink.&lt;/p&gt;</comment>
                            <comment id="110705" author="yong.fan" created="Thu, 26 Mar 2015 02:53:16 +0000"  >&lt;p&gt;I hit it on master:&lt;br/&gt;
&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/a6c0f402-d2ed-11e4-a357-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/a6c0f402-d2ed-11e4-a357-5254006e85c2&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="241840" author="adilger" created="Wed, 13 Feb 2019 07:39:24 +0000"  >&lt;p&gt;Fixed via patch &lt;a href=&quot;http://review.whamcloud.com/19082&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/19082&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7896&quot; title=&quot;lu_object_limit() is called too frequently&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7896&quot;&gt;&lt;del&gt;LU-7896&lt;/del&gt;&lt;/a&gt;: do not call lu_site_purge() for single object exceed&lt;/tt&gt;&quot;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="25538">LU-5331</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="35507">LU-7896</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwyo7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16144</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>