<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:44:39 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4650] contention on ll_inode_size_lock with mmap&apos;ed file</title>
                <link>https://jira.whamcloud.com/browse/LU-4650</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Our customer (CEA) is seeing heavy contention when using a debugging tool (Distributed Debugging Tool) with a binary file located on a Lustre filesystem. The binary file is quite large (~300 MB). The debugging tool launches one gdb instance per core on the client, which reveals high contention on large SMP nodes (32 cores).&lt;/p&gt;

&lt;p&gt;The overall launch time is 3 minutes when the binary file is on Lustre, compared to only 20 seconds on NFS.&lt;/p&gt;



&lt;p&gt;After analyzing the operations performed by gdb, I created a test program (mmaptest.c) that reproduces the issue. It does the following:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;open a file with O_RDONLY&lt;/li&gt;
	&lt;li&gt;mmap the whole file with PROT_READ, MAP_PRIVATE&lt;/li&gt;
	&lt;li&gt;access each page of the mapped region (from the first page to the last)&lt;/li&gt;
&lt;/ul&gt;
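
&lt;p&gt;For illustration only, the three steps above can be sketched in Python. This is not the attached mmaptest.c: the function name touch_pages is hypothetical, and the sketch assumes a Unix client where the Python mmap module exposes MAP_PRIVATE and PROT_READ.&lt;/p&gt;

```python
import mmap
import os


def touch_pages(path):
    """Map `path` read-only and privately, then read one byte from every page.

    Returns the number of pages touched. Hypothetical helper, not mmaptest.c.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        # Same protection and flags as in the list above: PROT_READ, MAP_PRIVATE.
        with mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ) as region:
            touched = 0
            # Walk from the first page to the last; each read faults the page in.
            for offset in range(0, size, mmap.PAGESIZE):
                region[offset]
                touched += 1
            return touched
    finally:
        os.close(fd)
```

&lt;p&gt;Running touch_pages on the same file from several processes at the same time would approximate the concurrent page-fault pattern of the test, although the timings of a Python sketch are not comparable to those reported for the C program.&lt;/p&gt;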


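&lt;p&gt;The launch command is:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;\cp file1G /dev/null; time ./launch_mmaptest.sh 16 file1G
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;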


&lt;p&gt;Run time with Lustre 2.1.6:&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&amp;nbsp;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; ext4 &lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; lustre &lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 1 instance &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 0.339s &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 2.951s &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 32 instances &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 0.558s &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 9m20.669s &lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Run time with Lustre 2.4.2:&lt;/p&gt;
&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&amp;nbsp;&lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; ext4 &lt;/th&gt;
&lt;th class=&apos;confluenceTh&apos;&gt; lustre &lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 1 instance &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 0.349s &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 6.542s &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 16 instances &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 0.373s &lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 45.588s &lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;



&lt;p&gt;With several instances, the processes wait on the inode size lock. Here is the stack of most of the instances during the test:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[&amp;lt;ffffffff810a0371&amp;gt;] down+0x41/0x50
[&amp;lt;ffffffffa0b5daa2&amp;gt;] ll_inode_size_lock+0x52/0x110 [lustre]
[&amp;lt;ffffffffa0b97a06&amp;gt;] ccc_prep_size+0x86/0x270 [lustre]
[&amp;lt;ffffffffa0b9f4a1&amp;gt;] vvp_io_fault_start+0xf1/0xb00 [lustre]
[&amp;lt;ffffffffa060061a&amp;gt;] cl_io_start+0x6a/0x140 [obdclass]
[&amp;lt;ffffffffa0604d54&amp;gt;] cl_io_loop+0xb4/0x1b0 [obdclass]
[&amp;lt;ffffffffa0b827a2&amp;gt;] ll_fault+0x2c2/0x4d0 [lustre]
[&amp;lt;ffffffff8114a4c4&amp;gt;] __do_fault+0x54/0x540
[&amp;lt;ffffffff8114aa4d&amp;gt;] handle_pte_fault+0x9d/0xbd0
[&amp;lt;ffffffff8114b7aa&amp;gt;] handle_mm_fault+0x22a/0x300
[&amp;lt;ffffffff8104aa68&amp;gt;] __do_page_fault+0x138/0x480
[&amp;lt;ffffffff8152e2fe&amp;gt;] do_page_fault+0x3e/0xa0
[&amp;lt;ffffffff8152b6b5&amp;gt;] page_fault+0x25/0x30
[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</description>
                <environment>rhel 6.4&lt;br/&gt;
kernel 2.6.32-431</environment>
        <key id="23208">LU-4650</key>
            <summary>contention on ll_inode_size_lock with mmap&apos;ed file</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="dmiter">Dmitry Eremin</assignee>
                                    <reporter username="pichong">Gregoire Pichon</reporter>
                        <labels>
                    </labels>
                <created>Wed, 19 Feb 2014 09:45:53 +0000</created>
                <updated>Thu, 19 May 2016 18:35:58 +0000</updated>
                            <resolved>Thu, 19 May 2016 18:35:58 +0000</resolved>
                                    <version>Lustre 2.1.6</version>
                    <version>Lustre 2.4.2</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="77337" author="dmiter" created="Wed, 19 Feb 2014 10:49:40 +0000"  >&lt;p&gt;The root cause is the same as in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4257&quot; title=&quot;parallel dds are slower than serial dds&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4257&quot;&gt;&lt;del&gt;LU-4257&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="77339" author="dmiter" created="Wed, 19 Feb 2014 10:51:52 +0000"  >&lt;p&gt;Patch &lt;a href=&quot;http://review.whamcloud.com/9095/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9095/&lt;/a&gt; should help with this.&lt;/p&gt;</comment>
                            <comment id="77348" author="pichong" created="Wed, 19 Feb 2014 13:25:41 +0000"  >&lt;p&gt;Thanks. The patch might improve performance because it improves lock management. But I think there is still a design/implementation issue.&lt;/p&gt;

&lt;p&gt;Why does the inode size lock need to be taken, since the file size does not change and the accesses are read-only (the file is opened with O_RDONLY and mmap is done with PROT_READ)?&lt;/p&gt;</comment>
                            <comment id="77352" author="dmiter" created="Wed, 19 Feb 2014 13:57:32 +0000"  >&lt;p&gt;This patch is a temporary solution that should improve the situation right now. We are working on a redesign of this code that avoids this lock altogether. The results are promising, but that patch will come later.&lt;/p&gt;</comment>
                            <comment id="77465" author="pichong" created="Thu, 20 Feb 2014 12:02:55 +0000"  >&lt;p&gt;Could you explain what the new design does? Is there an HLD document available?&lt;/p&gt;</comment>
                            <comment id="77599" author="dmiter" created="Fri, 21 Feb 2014 15:22:00 +0000"  >&lt;p&gt;Jinshan,&lt;br/&gt;
Could you answer this question please?&lt;/p&gt;</comment>
                            <comment id="152877" author="jay" created="Thu, 19 May 2016 18:35:58 +0000"  >&lt;p&gt;Duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4257&quot; title=&quot;parallel dds are slower than serial dds&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4257&quot;&gt;&lt;del&gt;LU-4257&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="22045">LU-4257</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="14138" name="launch_mmaptest.sh" size="190" author="pichong" created="Wed, 19 Feb 2014 09:45:53 +0000"/>
                            <attachment id="14137" name="mmaptest.c" size="954" author="pichong" created="Wed, 19 Feb 2014 09:45:53 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10040" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic</customfieldname>
                        <customfieldvalues>
                                        <label>contention</label>
            <label>mmap</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwfdr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12719</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>