<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:27:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
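For instance, assuming the standard JIRA issue-XML view path, a field-restricted request for this issue could look like:
https://jira.whamcloud.com/si/jira.issueviews:issue-xml/LU-9574/LU-9574.xml?field=key&field=summary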
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-9574] Large file read performance degradation from multiple OST&apos;s</title>
                <link>https://jira.whamcloud.com/browse/LU-9574</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We recently noticed that the large file read performance on our 2.9 LFS is dramatically worse than it used to be. The attached plot is the result of a test script that uses dd to write a large file (50GB) to disk, read that file and then copy it to a 2nd file to test write, read and read/write speeds for large files for various stripe sizes and counts. The two sets of data on this plot are on the same server and client hardware. The LFS was originally built and formatted with 2.8.0 but we eventually upgraded to 2.9.0 on the servers and clients. The behavior we are used to seeing is increasing performance as you increase the stripe count with a peak in performance around 4 or 6 OST&apos;s and a degradation after that as more OST&apos;s are used. This is what we saw under 2.8 (red lines in the plots). With 2.9 we still get very good write performance (almost line rate on our 10 GbE clients). But for reads we see extremely good performance with a single OST and significantly degraded performance for multiple OST&apos;s &#8211; black&#160;lines in the plots. &#160;Using a git bisect to compile and test different clients, we were able to isolate it to this commit:&lt;/p&gt;

&lt;p&gt;commit d8467ab8a2ca15fbbd5be3429c9cf9ceb0fa78b8&lt;br/&gt;
 &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7990&quot; title=&quot; Large bulk IO support&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7990&quot;&gt;&lt;del&gt;LU-7990&lt;/del&gt;&lt;/a&gt; clio: revise readahead to support 16MB IO&lt;/p&gt;

&lt;p&gt;There is slightly more info here:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2017-May/014509.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2017-May/014509.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please let me know if you need any other data or info. &#160;&#160;&lt;/p&gt;</description>
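                <!--
                  A minimal sketch of the kind of dd-based stripe sweep described above; the reporter's actual
                  test_ss_sn.sh is attached to this issue, so the directory, sizes, and loop values here are
                  illustrative assumptions, not the real script:

                    #!/bin/bash
                    # Time a large write and read for several stripe counts and stripe sizes.
                    DIR=/lustre/test_sweep                            # hypothetical directory on the Lustre mount
                    for count in 1 2 4 8; do
                      for size in 1M 4M; do
                        f=$DIR/file_c${count}_s${size}
                        lfs setstripe -c $count -S $size $f           # create the file with the given layout
                        dd if=/dev/zero of=$f bs=1M count=51200       # ~50 GB sequential write
                        dd if=$f of=/dev/null bs=1M                   # sequential read back
                      done
                    done
                -->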
                <environment>RHEL 7 servers, RHEL 6 and 7 clients.  </environment>
        <key id="46368">LU-9574</key>
            <summary>Large file read performance degradation from multiple OST&apos;s</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="jay">Jinshan Xiong</assignee>
                                    <reporter username="dvicker">Darby Vicker</reporter>
                        <labels>
                    </labels>
                <created>Tue, 30 May 2017 19:55:05 +0000</created>
                <updated>Sun, 8 Apr 2018 12:49:52 +0000</updated>
                            <resolved>Thu, 21 Sep 2017 12:17:31 +0000</resolved>
                                    <version>Lustre 2.9.0</version>
                                    <fixVersion>Lustre 2.11.0</fixVersion>
                    <fixVersion>Lustre 2.10.2</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>11</watches>
                                                                            <comments>
                            <comment id="197593" author="jay" created="Tue, 30 May 2017 20:10:35 +0000"  >&lt;p&gt;Is this a reproduction of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9214&quot; title=&quot;no readahead for small max_read_ahead_per_file_mb&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9214&quot;&gt;&lt;del&gt;LU-9214&lt;/del&gt;&lt;/a&gt;? That is a side-effect when &lt;tt&gt;ra_per_file_mb&lt;/tt&gt; is less than optimal RPC size.&lt;/p&gt;

&lt;p&gt;Please try patch &lt;a href=&quot;https://review.whamcloud.com/25996&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/25996&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="197614" author="dvicker" created="Tue, 30 May 2017 22:03:28 +0000"  >&lt;p&gt;No, I don&apos;t think so (although we suspected the same for a while too). &#160;I applied patch 25996 against 2.9.0 and compiled a client but still see slow read performance on multiple OST&apos;s.&#160;&lt;/p&gt;

&lt;p&gt;On the other hand, we were able to revert commit&#160;d8467ab8a2ca15fbbd5be3429c9cf9ceb0fa78b8 from the 2.9.0 branch and recover the old behavior (fast reads and writes on multiple OST&apos;s). &#160;I can upload more plots of this if you&apos;d like. &#160;&lt;/p&gt;</comment>
                            <comment id="197635" author="jay" created="Tue, 30 May 2017 23:45:07 +0000"  >&lt;p&gt;Can you collect a debug log with READA enabled so that I can take a look?&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param debug=&quot;reada vfstrace&quot;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="197728" author="jay" created="Wed, 31 May 2017 19:04:11 +0000"  >&lt;p&gt;Also please tell me the RPC size for each OSC and the stripe size of the test file.&lt;/p&gt;</comment>
                            <comment id="197745" author="dvicker" created="Wed, 31 May 2017 22:26:37 +0000"  >&lt;p&gt;Sorry for the delay - I ran into some issues that took a little while to figure out. &#160;&lt;/p&gt;

&lt;p&gt;I took the debug logs using the following procedure:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;mount -t lustre 192.52.98.30@tcp:192.52.98.31@tcp:/hpfs-fsl /tmp/lustre_test/&lt;/li&gt;
	&lt;li&gt;lctl set_param debug=&quot;reada vfstrace&quot;&lt;/li&gt;
	&lt;li&gt;lctl dk debug.aftermount.log&lt;/li&gt;
	&lt;li&gt;./test_ss_sn.sh&lt;/li&gt;
	&lt;li&gt;lctl dk debug.aftertest.log&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The test_ss_sn.sh script is what&apos;s driving the attached plots - it writes, reads and copies a large file (~50 GB) with various stripe sizes and counts using dd. &#160;I can provide more details if needed. &#160;I&apos;m doing a short version for this - just 1m and 4m stripe sizes with 1 and 4 stripe counts. &#160;&lt;/p&gt;

&lt;p&gt;I did this for two different Lustre clients - the first with 2.9.0 + the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9214&quot; title=&quot;no readahead for small max_read_ahead_per_file_mb&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9214&quot;&gt;&lt;del&gt;LU-9214&lt;/del&gt;&lt;/a&gt; patch (25996). &#160;The second with 2.9.0 and commit d8467ab reverted. I&apos;m attaching the debug logs and the plots of all of this. &#160;The &quot;lctl set_param&quot; above is affecting the performance quite a bit (both read and write) so I&apos;ve included results with and without debug logs turned on. &#160;&#160;&lt;/p&gt;</comment>
                            <comment id="197810" author="jay" created="Thu, 1 Jun 2017 16:59:30 +0000"  >&lt;p&gt;Yes it has significant performance loss for the read case.&lt;/p&gt;

&lt;p&gt;Unfortunately the debug log didn&apos;t catch any useful information about readahead. It was overwritten by the last write in your test case. The readahead log can be captured by reading an existing file in the file system and then dumping the log, as follows:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param debug=&quot;reada vfstrace&quot;
dd if=&amp;lt;lustre_file&amp;gt; of=/dev/null bs=...
lctl dk &amp;gt; log
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It would be helpful to post the test script here so that I can see the stripe options.&lt;/p&gt;</comment>
                            <comment id="197811" author="dvicker" created="Thu, 1 Jun 2017 17:11:40 +0000"  >&lt;p&gt;OK, I&apos;ll redo the tests and dump the logs after reading existing files. &#160;I&apos;ve attached the script so you can see exactly what we&apos;re doing, including the stripe options. &#160;&lt;/p&gt;</comment>
                            <comment id="197812" author="jay" created="Thu, 1 Jun 2017 17:24:14 +0000"  >&lt;p&gt;Please also show me the output of:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl get_param osc.*.max_pages_per_rpc
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;from the client node.&lt;/p&gt;</comment>
                            <comment id="197818" author="jay" created="Thu, 1 Jun 2017 18:03:15 +0000"  >&lt;p&gt;I can reproduce this problem by using 25M block size in dd test. Probably the current algorithm can&apos;t pipeline readahead stream well w/ large read size.&lt;/p&gt;

&lt;p&gt;Can you try to use block size 1MB in your dd test and see if you can still reproduce it?&lt;/p&gt;</comment>
                            <comment id="197829" author="dvicker" created="Thu, 1 Jun 2017 20:57:59 +0000"  >&lt;p&gt;Here are the results with the 2.9.0 + patch 25996 client. &#160;&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@dvicker ~]# lctl set_param debug=&quot;reada vfstrace&quot;
debug=reada vfstrace
[root@dvicker ~]# dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=25M 
2000+0 records in
2000+0 records out
52428800000 bytes (52 GB) copied, 300.815 s, 174 MB/s
[root@dvicker ~]# lctl dk &amp;gt; debug.4node.2.9.0_patch25996.log
[root@dvicker ~]# dd if=/nobackup/.test/dvicker-1m-01nodes.keep of=/dev/null bs=25M 
2000+0 records in
2000+0 records out
52428800000 bytes (52 GB) copied, 56.6835 s, 925 MB/s
[root@dvicker ~]# lctl dk &amp;gt; debug.1node.2.9.0_patch25996.log
[root@dvicker ~]# dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=1M 
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB) copied, 513.064 s, 102 MB/s
[root@dvicker ~]# lctl dk &amp;gt; debug.4node_bs_1m.2.9.0_patch25996.log
[root@dvicker ~]# dd if=/nobackup/.test/dvicker-1m-01nodes.keep of=/dev/null bs=1M 
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB) copied, 45.1046 s, 1.2 GB/s
[root@dvicker ~]# lctl dk &amp;gt; debug.1node_bs_1m.2.9.0_patch25996.log

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;So it looks like the block size doesn&apos;t make a big difference in our case. Do you also want the debug logs for the 2.9.0 client with commit&#160;d8467ab reverted?&lt;/p&gt;</comment>
                            <comment id="197831" author="dvicker" created="Thu, 1 Jun 2017 21:04:34 +0000"  >&lt;p&gt;And here is the other output you asked for. &#160;The hpfs-fsl is our 2.9 LFS I&apos;ve we&apos;ve been benchmarking here. &#160;The hpfs2eg3 is an older (2.4) LFS we are about to retire. &#160;&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@dvicker ~]# lctl get_param osc.*.max_pages_per_rpc
osc.hpfs-fsl-OST0000-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0000-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0001-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0001-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0002-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0002-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0003-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0003-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0004-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0004-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0005-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0005-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0006-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0006-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0007-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0007-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0008-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0008-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST0009-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST0009-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST000a-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST000a-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs-fsl-OST000b-osc-ffff88085bd08000.max_pages_per_rpc=256
osc.hpfs-fsl-OST000b-osc-ffff8810566ba800.max_pages_per_rpc=256
osc.hpfs2eg3-OST0000-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0001-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0002-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0003-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0004-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0005-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0006-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0007-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0008-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0009-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000a-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000b-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000c-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000d-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000e-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST000f-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0010-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0011-osc-ffff88082bfa9800.max_pages_per_rpc=1024
osc.hpfs2eg3-OST0012-osc-ffff88082bfa9800.max_pages_per_rpc=1024
[root@dvicker ~]# mount -t lustre
192.52.98.142@tcp:/hpfs2eg3 on /lustre2 type lustre (rw,flock,lazystatfs)
192.52.98.30@tcp:192.52.98.31@tcp:/hpfs-fsl/work on /nobackup type lustre (rw,flock,lazystatfs)
[root@dvicker ~]# 

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
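                            <!--
                              For reference, with the usual 4 KiB page size these max_pages_per_rpc values translate to
                              256 pages = 1 MiB RPCs on the newer hpfs-fsl filesystem and 1024 pages = 4 MiB RPCs on the
                              older hpfs2eg3 filesystem (the page size is an assumption, stated for context only).
                            -->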
                            <comment id="197846" author="gerrit" created="Thu, 1 Jun 2017 22:44:25 +0000"  >&lt;p&gt;Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/27388&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/27388&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9574&quot; title=&quot;Large file read performance degradation from multiple OST&amp;#39;s&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9574&quot;&gt;&lt;del&gt;LU-9574&lt;/del&gt;&lt;/a&gt; llite: pipeline readahead better with large I/O&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a1ed8000d29f00a2933df125e3b8188c5da3d883&lt;/p&gt;</comment>
                            <comment id="197849" author="jay" created="Thu, 1 Jun 2017 23:21:42 +0000"  >&lt;p&gt;The above patch is helpful on my side. However, I can see something weird is going on from the log you attached. I&apos;m looking at it but meanwhile please try the patch.&lt;/p&gt;</comment>
                            <comment id="197850" author="jay" created="Thu, 1 Jun 2017 23:46:58 +0000"  >&lt;p&gt;Have you applied any patches by your own? It seems like some extents of the file are skipped so the read pattern is detected as stride read.&lt;/p&gt;</comment>
                            <comment id="197851" author="jay" created="Fri, 2 Jun 2017 00:01:20 +0000"  >&lt;p&gt;This is the log in question:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000080:00400000:9.0:1496349067.442741:0:30939:0:(rw.c:490:ll_readahead()) lrp 3792639 cr 1 cp 512 ws 3792384 wl 0 nra 3792640 rpc 256 r 7408 ri 256 csr 0 sf 3791104 sp 0 sl 0
00000080:00400000:9.0:1496349067.442743:0:30939:0:(rw.c:1133:ll_io_read_page()) [0x2000120e5:0x44:0x0]0 pages read ahead at 3792639
00000080:00200000:9.0:1496349067.442753:0:30939:0:(vvp_io.c:308:vvp_io_fini()) [0x2000120e5:0x44:0x0] ignore/verify layout 0/0, layout version 0 restore needed 0
00000080:00200000:9.0:1496349067.442756:0:30939:0:(file.c:1225:ll_file_io_generic()) iot: 0, result: 1048576
00000080:00200000:9.0:1496349067.443350:0:30939:0:(file.c:1124:ll_file_io_generic()) file: dvicker-1m-04nodes.keep, type: 0 ppos: 15536750592, count: 1048576
00000020:00200000:9.0:1496349067.443355:0:30939:0:(cl_io.c:235:cl_io_rw_init()) header@ffff880fb6655290[0x0, 7427329, [0x2000120e5:0x44:0x0] hash]
                                                                                
00000020:00200000:9.0:1496349067.443356:0:30939:0:(cl_io.c:235:cl_io_rw_init()) io range: 0 [15536750592, 15537799168) 0 0
00000080:00200000:9.0:1496349067.443358:0:30939:0:(vvp_io.c:1381:vvp_io_init()) [0x2000120e5:0x44:0x0] ignore/verify layout 0/0, layout version 0 restore needed 0
00020000:00200000:9.0:1496349067.443361:0:30939:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 14817 chunk: [15536750592, 15537799168) 15537799168
00020000:00200000:9.0:1496349067.443364:0:30939:0:(lov_io.c:413:lov_io_iter_init()) shrink: 1 [3883925504, 3884974080)
00000080:00200000:9.0:1496349067.443367:0:30939:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [3793152, 3793407]
00000080:00200000:9.0:1496349067.443376:0:30939:0:(vvp_io.c:725:vvp_io_read_start()) read: -&amp;gt; [15536750592, 15537799168)
00000080:00400000:9.0:1496349067.443383:0:30939:0:(rw.c:741:ras_update()) [0x2000120e5:0x44:0x0] pages at 3793152 miss.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="198113" author="dvicker" created="Mon, 5 Jun 2017 13:32:01 +0000"  >&lt;p&gt;The only patches we&apos;ve applied are either 25996 (i.e. &lt;a href=&quot;https://review.whamcloud.com/#/c/25996/)&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/25996/)&lt;/a&gt;&#160;or reverting commit&#160;d8467ab8a2ca15fbbd5be3429c9cf9ceb0fa78b8. &#160;Beyond that, we haven&apos;t tried any other patches. &#160;I will try patch 27388 and let you know what I see. &#160;&lt;/p&gt;</comment>
                            <comment id="198120" author="dvicker" created="Mon, 5 Jun 2017 14:19:44 +0000"  >&lt;p&gt;Patch 27388 looks good. &#160;I applied this patch to 2.9.0 and our dd tests show good write and read performance again. &#160;Let me know if you want debug logs or performance plots. &#160;&lt;/p&gt;</comment>
                            <comment id="198159" author="jay" created="Mon, 5 Jun 2017 17:20:43 +0000"  >&lt;p&gt;Thanks for confirming. Now that I can reproduce the issue on my side, I don&apos;t need the debug log.&lt;/p&gt;</comment>
                            <comment id="198187" author="dvicker" created="Mon, 5 Jun 2017 20:29:27 +0000"  >&lt;p&gt;I&apos;m sorry but it looks like I gave you some bad info. &#160;Patch 27388 is inconsistent for us on the read performance. &#160;Now that we&apos;ve done some more more extensive testing, it is occasionally fast and occasionally slow. &#160;Reverting&#160;commit&#160;d8467ab is consistently fast for both read and write. &#160;&lt;/p&gt;</comment>
                            <comment id="198204" author="jay" created="Mon, 5 Jun 2017 21:38:49 +0000"  >&lt;p&gt;No problem. From what I have seen in the log you posted, I can see some file extents are skipped so the read was detected as stride read.&lt;/p&gt;

&lt;p&gt;Please collect some log with patch 27388 applied. &lt;/p&gt;</comment>
                            <comment id="198216" author="dvicker" created="Mon, 5 Jun 2017 22:38:40 +0000"  >&lt;p&gt;OK, here is a debug log with 2.9.0 + patch 27388. &#160;I stopped the read early (notice the ^C) so I wouldn&apos;t have to wait for the entire 52 GB file to read. &#160;I don&apos;t think that matters but just wanted to mention it to make sure. &#160;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@benkirk ~]# lctl set_param debug=&quot;reada vfstrace&quot;
debug=reada vfstrace
[root@benkirk ~]# dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=25M
^C153+0 records in
152+0 records out
3984588800 bytes (4.0 GB) copied, 158.505 s, 25.1 MB/s

[root@benkirk ~]# lctl dk &amp;gt; debug.4node.2.9.0_patch27388.log 
[root@benkirk ~]#

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="198653" author="dvicker" created="Thu, 8 Jun 2017 16:50:33 +0000"  >&lt;p&gt;I see a new patch was uploaded to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9214&quot; title=&quot;no readahead for small max_read_ahead_per_file_mb&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9214&quot;&gt;&lt;del&gt;LU-9214&lt;/del&gt;&lt;/a&gt;. &#160;Based on the debug logs you&apos;ve seen, do you think this still might be related? &#160;I can give the new patch a try if you think it might help. &#160;&lt;/p&gt;</comment>
                            <comment id="204516" author="dvicker" created="Fri, 4 Aug 2017 20:36:59 +0000"  >&lt;p&gt;Just checking in on this. &#160;We&apos;ve upgraded to 2.10 and we still had to revert&#160;d8467ab to keep from running into this issue. &#160;&lt;/p&gt;</comment>
                            <comment id="205906" author="jay" created="Mon, 21 Aug 2017 19:26:09 +0000"  >&lt;p&gt;I&apos;m looking at the debug - sorry for delay response.&lt;/p&gt;</comment>
                            <comment id="205930" author="gerrit" created="Mon, 21 Aug 2017 22:54:15 +0000"  >&lt;p&gt;.&lt;/p&gt;</comment>
                            <comment id="205932" author="jay" created="Mon, 21 Aug 2017 23:10:43 +0000"  >&lt;p&gt;I saw some weird thing in the log message:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00020000:00200000:43.0:1496701931.619586:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1908 chunk: [2000683008, 2001731584) 2018508800
00020000:00200000:43.0:1496701931.619592:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 0 [500170752, 501219328)
00000080:00200000:43.0:1496701931.619595:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488448, 488703]
00000080:00200000:43.0:1496701931.619610:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -&amp;gt; [2000683008, 2001731584)
00020000:00200000:43.0:1496701931.620038:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1909 chunk: [2001731584, 2002780160) 2018508800
00020000:00200000:43.0:1496701931.620042:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 1 [500170752, 501219328)
00000080:00200000:43.0:1496701931.620044:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488704, 488959]
00000080:00200000:43.0:1496701931.620052:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -&amp;gt; [2001731584, 2002780160)
00020000:00200000:43.0:1496701931.620456:0:3314:0:(lov_io.c:448:lov_io_rw_iter_init()) stripe: 1910 chunk: [2002780160, 2003828736) 2018508800
00020000:00200000:43.0:1496701931.620459:0:3314:0:(lov_io.c:413:lov_io_iter_init()) shrink: 2 [500170752, 501219328)
00000080:00200000:43.0:1496701931.620461:0:3314:0:(vvp_io.c:230:vvp_io_one_lock_index()) lock: 0 [488960, 489215]
00000080:00200000:43.0:1496701931.620470:0:3314:0:(vvp_io.c:725:vvp_io_read_start()) read: -&amp;gt; [2002780160, 2003828736)
00000080:00400000:43.0:1496701931.620478:0:3314:0:(rw.c:733:ras_update()) [0x2000120e5:0x44:0x0] pages at 488960 miss.
00000080:00400000:43.0:1496701931.620480:0:3314:0:(rw.c:794:ras_update()) lrp 488447 cr 0 cp 0 ws 488192 wl 7424 nra 488447 rpc 256 r 22 ri 1024 csr 128 sf 359936 sp 512 sl 1024
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When page 488960 was being read, the previous read requests to [488448, 488703] and [488704, 488959] were skipped, probably because cached pages were found in memory, which confused the readahead algorithm and caused it to detect a stride read.&lt;/p&gt;

&lt;p&gt;This pattern actually occurred all the time, and I think the skipped read extents always belong to stripes 0 &amp;amp; 1. Was the application doing a reread, and were the cached pages for stripes 2 &amp;amp; 3 somehow cleaned up?&lt;/p&gt;

&lt;p&gt;Anyway, I still found a problem in stride read, so please try patch set 2 of &lt;a href=&quot;https://review.whamcloud.com/27388&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/27388&lt;/a&gt; with the debug level set to &quot;vfstrace reada&quot;, and please post the debug log here.&lt;/p&gt;</comment>
                            <comment id="206059" author="dvicker" created="Tue, 22 Aug 2017 20:04:02 +0000"  >&lt;p&gt;I just applied that patch on top of 2.10.0 and tested. &#160;Our full suite of dd tests (test_ss_sn.sh) looked good - read and writes were fast across the board. &#160;The stock 2.10.0 client was slow for reads (as mentioned in the Aug 4 post) so I think this&#160;patch fixes our problem. &#160;I&apos;ve attached the debug log from this:&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@dvicker .test]# lctl set_param debug=&quot;reada vfstrace&quot;
debug=reada vfstrace
[root@dvicker .test]# dd if=/nobackup/.test/dvicker-1m-04nodes.keep of=/dev/null bs=25M 
2000+0 records in
2000+0 records out
52428800000 bytes (52 GB) copied, 54.5779 s, 961 MB/s
[root@dvicker .test]# lctl dk &amp;gt; debug.4node.2.10.0_patch27388_set2.log

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="208501" author="gerrit" created="Fri, 15 Sep 2017 15:19:22 +0000"  >&lt;p&gt;Minh Diep (minh.diep@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/29016&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29016&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9574&quot; title=&quot;Large file read performance degradation from multiple OST&amp;#39;s&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9574&quot;&gt;&lt;del&gt;LU-9574&lt;/del&gt;&lt;/a&gt; llite: pipeline readahead better with large I/O&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 83946d6f389a5c047cc958973c4b2cdf291a6429&lt;/p&gt;</comment>
                            <comment id="208999" author="gerrit" created="Thu, 21 Sep 2017 06:13:07 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/27388/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/27388/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9574&quot; title=&quot;Large file read performance degradation from multiple OST&amp;#39;s&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9574&quot;&gt;&lt;del&gt;LU-9574&lt;/del&gt;&lt;/a&gt; llite: pipeline readahead better with large I/O&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 011742134e3152f3e389ec30c08ccfc28d7a91a7&lt;/p&gt;</comment>
                            <comment id="209030" author="pjones" created="Thu, 21 Sep 2017 12:17:31 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                            <comment id="211876" author="gerrit" created="Tue, 24 Oct 2017 21:38:30 +0000"  >&lt;p&gt;John L. Hammond (john.hammond@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/29016/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29016/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9574&quot; title=&quot;Large file read performance degradation from multiple OST&amp;#39;s&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9574&quot;&gt;&lt;del&gt;LU-9574&lt;/del&gt;&lt;/a&gt; llite: pipeline readahead better with large I/O&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 73a7af47a01979d5dd5c7e8dcaf3f62d47f11b0b&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="26866" name="debug.1node.2.9.0_patch25996.log.gz" size="1889492" author="dvicker" created="Thu, 1 Jun 2017 20:59:31 +0000"/>
                            <attachment id="26865" name="debug.1node_bs_1m.2.9.0_patch25996.log.gz" size="1136403" author="dvicker" created="Thu, 1 Jun 2017 20:59:30 +0000"/>
                            <attachment id="28046" name="debug.4node.2.10.0_patch27388_set2.log" size="5730338" author="dvicker" created="Tue, 22 Aug 2017 19:56:23 +0000"/>
                            <attachment id="26868" name="debug.4node.2.9.0_patch25996.log.gz" size="3266554" author="dvicker" created="Thu, 1 Jun 2017 20:59:32 +0000"/>
                            <attachment id="26918" name="debug.4node.2.9.0_patch27388.log.gz" size="5269862" author="dvicker" created="Mon, 5 Jun 2017 22:39:38 +0000"/>
                            <attachment id="26867" name="debug.4node_bs_1m.2.9.0_patch25996.log.gz" size="3825392" author="dvicker" created="Thu, 1 Jun 2017 20:59:32 +0000"/>
                            <attachment id="26853" name="debug.aftermount.2.9.0_patch25996.log.gz" size="10349" author="dvicker" created="Wed, 31 May 2017 22:27:31 +0000"/>
                            <attachment id="26854" name="debug.aftermount.2.9.0_revert_d8467ab.log.gz" size="10247" author="dvicker" created="Wed, 31 May 2017 22:27:31 +0000"/>
                            <attachment id="26855" name="debug.aftertest.2.9.0_patch25996.log.gz" size="4250296" author="dvicker" created="Wed, 31 May 2017 22:27:35 +0000"/>
                            <attachment id="26856" name="debug.aftertest.2.9.0_revert_d8467ab.log.gz" size="4343066" author="dvicker" created="Wed, 31 May 2017 22:27:36 +0000"/>
                            <attachment id="26857" name="lustre_performance.pdf" size="19587" author="dvicker" created="Wed, 31 May 2017 22:27:31 +0000"/>
                            <attachment id="26864" name="test_ss_sn.sh" size="3028" author="dvicker" created="Thu, 1 Jun 2017 17:12:34 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Performance</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzdw7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>