<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:42:00 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11221] Do not hold pages locked for network IO on the server.</title>
                <link>https://jira.whamcloud.com/browse/LU-11221</link>
                <project id="10000" key="LU">Lustre</project>
<description>&lt;p&gt;While investigating some customer issues, I noticed that our read path currently looks roughly like this:&lt;br/&gt;
1. obtain the pages and lock them&lt;br/&gt;
2. prepare and then execute our bulk request&lt;br/&gt;
3. unlock the pages.&lt;/p&gt;

&lt;p&gt;essentially holding the pages locked for the duration of the network IO. This is suboptimal, since parallel reads of the same pages cannot proceed and must wait for each other to complete. Dropping the locks earlier would also help in the case of client death or network problems, since a hung bulk RPC would no longer block parallel operations.&lt;/p&gt;

&lt;p&gt;We already have DLM locks to protect us, so it should be safe to drop the page locks at step 2, and possibly for writes as well (the eviction implications need to be investigated). We were already operating in this mode in 1.8 and prior, before the read-only cache was implemented on the server, when each RPC had a private pool of pages not connected to any inode mappings.&lt;/p&gt;</description>
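The locking change described above can be sketched as a toy model. Everything here is hypothetical, illustrative C — `struct page`, `page_lock()`, `page_unlock()`, and `bulk_transfer()` are stand-ins for the real kernel/Lustre primitives, chosen only to show the change in lock ordering, not the actual ofd/osd code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a "page" is just a lock flag. */
struct page { bool locked; };

/* How many pages were still locked while the "network IO" ran. */
static size_t locked_during_bulk;

static void page_lock(struct page *p)   { p->locked = true; }
static void page_unlock(struct page *p) { p->locked = false; }

/* Stand-in for the bulk RPC: records how many pages it saw locked. */
static void bulk_transfer(struct page *pages, size_t n)
{
    locked_during_bulk = 0;
    for (size_t i = 0; i < n; i++)
        if (pages[i].locked)
            locked_during_bulk++;
}

/* Current scheme: pages stay locked across the bulk, so a parallel
 * read of the same pages must wait for the network IO to finish. */
static void read_old(struct page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        page_lock(&pages[i]);
    bulk_transfer(pages, n);
    for (size_t i = 0; i < n; i++)
        page_unlock(&pages[i]);
}

/* Proposed scheme: lock each page only while filling it, then drop the
 * lock; the DLM lock (not modeled here) protects the data during the bulk. */
static void read_new(struct page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        page_lock(&pages[i]);
        page_unlock(&pages[i]);
    }
    bulk_transfer(pages, n);
}
```

In this model a concurrent reader blocks on `read_old()` for the whole bulk, but under `read_new()` only for the brief per-page fill window.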
                <environment></environment>
        <key id="52913">LU-11221</key>
            <summary>Do not hold pages locked for network IO on the server.</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Mon, 6 Aug 2018 18:25:52 +0000</created>
                <updated>Thu, 11 Jun 2020 09:23:11 +0000</updated>
                            <resolved>Fri, 9 Aug 2019 15:06:01 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>14</watches>
                                                                            <comments>
                            <comment id="231596" author="pjones" created="Tue, 7 Aug 2018 17:59:51 +0000"  >&lt;p&gt;Alex&lt;/p&gt;

&lt;p&gt;Could you please investigate?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="236091" author="adilger" created="Wed, 31 Oct 2018 23:37:52 +0000"  >&lt;p&gt;&lt;a href=&quot;https://review.whamcloud.com/33521&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33521&lt;/a&gt;&lt;/p&gt;</comment>
<comment id="245343" author="sihara" created="Sun, 7 Apr 2019 14:45:54 +0000"  >&lt;p&gt;I&apos;m not sure how patch &lt;a href=&quot;https://review.whamcloud.com/33521&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33521&lt;/a&gt; behaves when it uses the page cache for file sizes &amp;lt; readcache_max_filesize but bypasses the page cache above readcache_max_filesize.&lt;/p&gt;

&lt;p&gt;At least, I have been seeing very well merged IOs with &quot;read_cache_enable=0 writethrough_cache_enable=0&quot;, but with &quot;read_cache_enable=1 writethrough_cache_enable=1 and readcache_max_filesize=1048576&quot; there seems to be less chance of merging IOs, and lower performance.&lt;/p&gt;

&lt;p&gt;Please see the following test results.&lt;/p&gt;

&lt;p&gt;Page cache disabled&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@es18k-vm11 ~]# lctl set_param osd-ldiskfs.*.read_cache_enable=0 obdfilter.*.writethrough_cache_enable=0 obdfilter.*.brw_size=16

Run IOR from 32 clients, 256 process into single OST
[root@c01 ~]# salloc -N 32 --ntasks-per-node=8 mpirun -np 256 --allow-run-as-root /work/tools/bin/ior -w -t 1m -b 1g -e -F -v -C -Q 24 -o /scratch0/ost0/file

Max Write: 8461.94 MiB/sec (8872.99 MB/sec)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I saw well merged IOs (16MB IOs on the storage side) in this case.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@es18k-vm11 ~]# blktrace /dev/sda -a issue -a complete -o - | blkiomon -I 120 -h -
time: Sun Apr  7 14:19:45 2019
device: 8,0
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
sizes write (bytes): num 17619, min 4096, max 16777216, sum 274927271936, squ 4533348865750859776, avg 15604022.5, var 257270866642200.3
d2c read (usec): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
d2c write (usec): num 17619, min 48, max 6425912, sum 8433387602, squ 5102891088905970, avg 478653.0, var 238785697942.2
throughput read (bytes/msec): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
throughput write (bytes/msec): num 17619, min 0, max 4273907, sum 464644931, squ 427646612524675, avg 26371.8, var 23576427970.2
sizes histogram (bytes):
            0:     0         1024:     0         2048:     0         4096:   153
         8192:    11        16384:     8        32768:    12        65536:    18
       131072:    27       262144:    31       524288:    79      1048576:    77
      2097152:    21      4194304:   458      8388608:   742    &amp;gt; 8388608: 15982 &amp;lt;--- good 16M IOs here
d2c histogram (usec):
            0:     0            8:     0           16:     0           32:     0
           64:    43          128:    36          256:    57          512:    90
         1024:    32         2048:    18         4096:     2         8192:    12
        16384:    31        32768:   169        65536:   339       131072:  1528
       262144:  1009       524288:  5433      1048576:  8751      2097152:    53
      4194304:    13      8388608:     3     16777216:     0     33554432:     0
    &amp;gt;33554432:     0
bidirectional requests: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Page cache enabled&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@es18k-vm11 ~]# lctl set_param osd-ldiskfs.*.read_cache_enable=1 obdfilter.*.writethrough_cache_enable=1 obdfilter.*.brw_size=16 osd-ldiskfs.*.readcache_max_filesize=1048576
[root@c01 ~]# salloc -N 32 --ntasks-per-node=8 mpirun -np 256 --allow-run-as-root /work/tools/bin/ior -w -t 1m -b 1g -e -F -v -C -Q 24 -o /scratch0/ost0/file

Max Write: 6778.19 MiB/sec (7107.45 MB/sec) &amp;lt;-- 25% lower performance than non pagecache mode
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;blktrace shows a lot of 8MB IOs here rather than 16MB IOs.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@es18k-vm11 ~]# blktrace /dev/sda -a issue -a complete -o - | blkiomon -I 120 -h -

time: Sun Apr  7 14:26:55 2019
device: 8,0
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
sizes write (bytes): num 30756, min 4096, max 16777216, sum 274926415872, squ 2835130225468112896, avg 8938952.3, var 92172676586037.2
d2c read (usec): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
d2c write (usec): num 30756, min 31, max 1827037, sum 19563303393, squ 14554226996221365, avg 636080.9, var 458639796658.0
throughput read (bytes/msec): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
throughput write (bytes/msec): num 30756, min 2, max 4413081, sum 841918911, squ 803811743497523, avg 27374.1, var 25385776471.6
sizes histogram (bytes):
            0:     0         1024:     0         2048:     0         4096:   114
         8192:     6        16384:     6        32768:     8        65536:     7
       131072:    22       262144:    44       524288:   110      1048576:   398
      2097152:   442      4194304:  1735      8388608: 20107    &amp;gt; 8388608:  7757
d2c histogram (usec):
            0:     0            8:     0           16:     0           32:     1
           64:    34          128:    21          256:    75          512:    84
         1024:    97         2048:   158         4096:    22         8192:    41
        16384:   213        32768:   370        65536:   279       131072:   633
       262144:  1093       524288:  4855      1048576: 21389      2097152:  1391
      4194304:     0      8388608:     0     16777216:     0     33554432:     0
    &amp;gt;33554432:     0
bidirectional requests: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
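The two blkiomon size histograms quoted above reduce to roughly 91% fully merged 16MB write requests with the page cache disabled versus about 25% with it enabled, and mean request sizes of ~15.6MB versus ~8.9MB. A small sketch recomputing these figures from the quoted counters (the struct and function names here are ours, not blkiomon's; the numbers are copied verbatim from the two runs):

```c
/* Per-run write counters copied from the blkiomon output above:
 *   cache disabled: 17619 writes, 15982 in the ">8388608" bucket, 274927271936 bytes
 *   cache enabled:  30756 writes,  7757 in the ">8388608" bucket, 274926415872 bytes */
struct blkio_run {
    double writes;     /* total write requests               */
    double merged_16m; /* requests in the ">8388608" bucket  */
    double bytes;      /* sum of write request sizes, bytes  */
};

/* Share of write requests that were fully merged into 16MB IOs. */
static double merged_fraction(const struct blkio_run *r)
{
    return r->merged_16m / r->writes;
}

/* Mean write request size in MB (decimal, matching blkiomon's avg field). */
static double avg_mb(const struct blkio_run *r)
{
    return r->bytes / r->writes / 1e6;
}
```

Applied to the quoted counters, `merged_fraction()` gives about 0.91 for the no-cache run and 0.25 for the page-cache run, and `avg_mb()` reproduces blkiomon's reported averages of 15604022.5 and 8938952.3 bytes per request.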
<comment id="245427" author="bzzz" created="Mon, 8 Apr 2019 15:44:00 +0000"  >&lt;p&gt;Thanks for the report. I need some time to analyze this. My first guess was that page cache overhead (the need to allocate pages, and scanning to release old ones) introduces additional gaps between I/Os, but that is not the case: each thread allocates all the pages it needs and only then submits its I/Os. So it must be something else.&lt;/p&gt;</comment>
                            <comment id="252824" author="gerrit" created="Fri, 9 Aug 2019 04:39:31 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/33521/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33521/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11221&quot; title=&quot;Do not hold pages locked for network IO on the server.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11221&quot;&gt;&lt;del&gt;LU-11221&lt;/del&gt;&lt;/a&gt; osd: allow concurrent bulks from pagecache&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 0a92632538d8c985e024def73512d18d1570d5ca&lt;/p&gt;</comment>
                            <comment id="252873" author="pjones" created="Fri, 9 Aug 2019 15:06:01 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
<comment id="252893" author="shadow" created="Fri, 9 Aug 2019 16:09:45 +0000"  >&lt;p&gt;Oleg, Alex - why isn&apos;t PG_writeback used for this case?&lt;br/&gt;
On the other hand, leaving the pages unlocked opens a race with truncate from the same host.&lt;/p&gt;</comment>
                            <comment id="257036" author="gerrit" created="Thu, 24 Oct 2019 20:42:40 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36570&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36570&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11221&quot; title=&quot;Do not hold pages locked for network IO on the server.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11221&quot;&gt;&lt;del&gt;LU-11221&lt;/del&gt;&lt;/a&gt; osd: allow concurrent bulks from pagecache&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 7c0b6676a1b1d3b453a3d3888e26d15b7bf07b44&lt;/p&gt;</comment>
                            <comment id="258617" author="gerrit" created="Thu, 21 Nov 2019 07:35:18 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36570/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36570/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11221&quot; title=&quot;Do not hold pages locked for network IO on the server.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11221&quot;&gt;&lt;del&gt;LU-11221&lt;/del&gt;&lt;/a&gt; osd: allow concurrent bulks from pagecache&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: efcdfe9e075fdfa334d16bcb53399f2978c16d42&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="44881">LU-9232</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i000db:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>