<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:06:45 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
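
As a hedged illustration of the 'field' parameter described above, the URL for this issue's XML view can be assembled as follows (the si/jira.issueviews:issue-xml path is the standard JIRA XML-view endpoint; the helper name is our own):

```python
# Sketch: build a JIRA XML-view URL restricted to a few fields.
# Assumes the standard si/jira.issueviews XML endpoint; xml_view_url is
# a hypothetical helper, not part of any JIRA client library.
from urllib.parse import urlencode

def xml_view_url(base, issue_key, fields):
    """Return the XML-view URL for issue_key, requesting only 'fields'."""
    # urlencode on a list of pairs repeats the 'field' key once per value.
    query = urlencode([("field", f) for f in fields])
    return (f"{base}/si/jira.issueviews:issue-xml/"
            f"{issue_key}/{issue_key}.xml?{query}")

url = xml_view_url("https://jira.whamcloud.com", "LU-410", ["key", "summary"])
print(url)
```

Fetching that URL returns a document like this one, but with only the requested fields populated.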
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-410] Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 </title>
                <link>https://jira.whamcloud.com/browse/LU-410</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Running obdfilter-survey with &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; patches on a DDN SFA 10K, it appeared the IO from Lustre to disk was not aligned, because the observed sizes were 1020K and 4K.  As the file size exceeded the cache, the performance issue was very apparent.  Setting vm.min_free_kbytes did not help this performance issue at all. For example, using obdfilter-survey to write an 8GB file to each OST would show approximately 30% unaligned I/O.  The alignment issue was seen by observing cache statistics on the DDN SFA 10K controller.&lt;/p&gt;

&lt;p&gt;Once we remove the shrink file_max_cache patch &lt;span class=&quot;error&quot;&gt;&amp;#91;define FILTER_MAX_CACHE_SIZE (8 * 1024 * 1024)&amp;#93;&lt;/span&gt; the alignment issue goes away.  The many unaligned IOs seem to be caused by the change in this patch; once I changed cache_file_size to 18446744073709551615 (which is the 1.8.4 and 1.8.5 default), all IO was coming to the SFA10K as aligned I/O.&lt;/p&gt;

&lt;p&gt;Disabling the read cache (lctl set_param obdfilter.*.read_cache_enable=0) doesn&apos;t help, which is still very strange to me.&lt;/p&gt;

&lt;p&gt;The only workaround we have found is changing cache_file_size to a large value; that is the only way we know to avoid this issue on 1.8.6WC.  This could have other performance implications as well.&lt;/p&gt;

&lt;p&gt;We hope to post some numbers and statistics, but we need additional runs to gather that information.&lt;/p&gt;</description>
                <environment>Tested on DDN SFA 10K with InfiniBand and patches for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;strike&gt;LU-15&lt;/strike&gt;&lt;/a&gt; applied to 1.8.4.  This testing started before 1.8.6 was tagged.</environment>
        <key id="11158">LU-410</key>
            <summary>Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 </summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="jsalinas">John Salinas</reporter>
                        <labels>
                    </labels>
                <created>Mon, 13 Jun 2011 12:08:21 +0000</created>
                <updated>Tue, 5 Jan 2021 19:23:09 +0000</updated>
                            <resolved>Fri, 10 Aug 2012 11:30:23 +0000</resolved>
                                    <version>Lustre 1.8.6</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>14</watches>
                                                                            <comments>
                            <comment id="16126" author="di.wang" created="Mon, 13 Jun 2011 17:01:14 +0000"  >&lt;p&gt;Does brw_stats show similar information? Could you please post the parameters you used for obdfilter-survey?&lt;/p&gt;</comment>
                            <comment id="16130" author="di.wang" created="Mon, 13 Jun 2011 18:19:56 +0000"  >&lt;p&gt;If you did not set tests_str in obdfilter-survey, it should run in 3 phases, &quot;write, rewrite, read&quot;; in which phase did you see this unaligned IO? Or in all three phases? Thanks.&lt;/p&gt;</comment>
                            <comment id="16134" author="ihara" created="Mon, 13 Jun 2011 19:31:04 +0000"  >&lt;p&gt;I will send you brw_stats and more statistics later, but this problem doesn&apos;t only happen during obdfilter-survey; we also saw unaligned IO during writes when we ran IOR from the Lustre clients.&lt;/p&gt;</comment>
                            <comment id="16556" author="ihara" created="Sat, 18 Jun 2011 08:47:10 +0000"  >&lt;p&gt;Sorry for the late response. &lt;br/&gt;
Just tested this again. The test environment is very simple: 1 x SFA10K and 1 x OSS. I created a new OST on the SFA10K and started it on a single OSS, then ran obdfilter-survey. I used the latest 1.8.6WC.rc build. With readcache_max_filesize=8388608, we see many 4K I/Os in brw_stats compared with readcache_max_filesize=18446744073709551615.&lt;/p&gt;

&lt;p&gt;Also, with readcache_max_filesize=8388608, I saw much unaligned I/O (aligned I/O means 1MB x N) on the SFA10K (I/O sizes collected inside the SFA10K). It kills write performance when much unaligned I/O arrives instead of aligned I/O, because the SFA10K caches I/O whenever the I/O size is not aligned. That is why I see better performance with readcache_max_filesize=18446744073709551615 than with readcache_max_filesize=8388608.&lt;/p&gt;

&lt;p&gt;Please see the detailed results.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lctl get_param obdfilter.*.readcache_max_filesize
obdfilter.lustre-OST0000.readcache_max_filesize=8388608

# nobjlo=2 nobjhi=2 thrlo=128 thrhi=128 tests_str=&quot;write read&quot; /usr/bin/obdfilter-survey
Sat Jun 18 21:11:04 JST 2011 Obdfilter-survey for case=disk from r01
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write  478.35 [ 382.47, 638.33] read  761.16 [ 666.28, 956.03] 
done!

# cat /proc/fs/lustre/obdfilter/lustre-OST0000/brw_stats 
snapshot_time:         1308399155.300890 (secs.usecs)

                           read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs  % cum %
64:		         2   0   0   |    0   0   0
128:		         1   0   0   |    0   0   0
256:		     16367  99 100   | 16384 100 100

                           read      |     write
discontiguous pages    rpcs  % cum % |  rpcs  % cum %
0:		     16370 100 100   | 16384 100 100

                           read      |     write
discontiguous blocks   rpcs  % cum % |  rpcs  % cum %
0:		     16370 100 100   | 16384 100 100

                           read      |     write
disk fragmented I/Os   ios   % cum % |  ios   % cum %
1:		     14073  85  85   | 14431  88  88
2:		      2297  14 100   | 1953  11 100

                           read      |     write
disk I/Os in flight    ios   % cum % |  ios   % cum %
1:		         5   0   0   |   42   0   0
2:		         5   0   0   |   74   0   0
3:		         6   0   0   |   80   0   1
4:		         6   0   0   |  104   0   1
5:		         6   0   0   |  107   0   2
6:		         6   0   0   |  101   0   2
7:		         6   0   0   |   99   0   3
8:		         6   0   0   |  111   0   3
9:		         7   0   0   |  116   0   4
10:		         7   0   0   |  131   0   5
11:		         4   0   0   |  132   0   5
12:		         6   0   0   |  132   0   6
13:		         4   0   0   |  138   0   7
14:		         4   0   0   |  144   0   8
15:		         2   0   0   |  162   0   9
16:		         2   0   0   |  243   1  10
17:		         2   0   0   |  260   1  11
18:		         2   0   0   |  247   1  13
19:		         2   0   0   |  212   1  14
20:		         2   0   0   |  205   1  15
21:		         2   0   0   |  191   1  16
22:		         3   0   0   |  186   1  17
23:		         6   0   0   |  189   1  18
24:		         5   0   0   |  201   1  19
25:		         2   0   0   |  213   1  20
26:		         3   0   0   |  205   1  21
27:		         2   0   0   |  200   1  23
28:		         4   0   0   |  196   1  24
29:		         5   0   0   |  194   1  25
30:		         4   0   0   |  191   1  26
31:		     18541  99 100   | 13531  73 100

                           read      |     write
I/O time (1/1000s)     ios   % cum % |  ios   % cum %
1:		        17   0   0   |  139   0   0
2:		        24   0   0   |  106   0   1
4:		        16   0   0   |  102   0   2
8:		        23   0   0   |  120   0   2
16:		        29   0   0   |  216   1   4
32:		       123   0   1   |  791   4   8
64:		       912   5   6   | 3504  21  30
128:		      4002  24  31   | 5916  36  66
256:		      9809  59  91   | 4535  27  94
512:		      1339   8  99   |  927   5  99
1K:		        76   0 100   |   28   0 100

                           read      |     write
disk I/O size          ios   % cum % |  ios   % cum %
4K:		      2297  12  12   | 1953  10  10
8K:		         0   0  12   |    0   0  10
16K:		         0   0  12   |    0   0  10
32K:		         0   0  12   |    0   0  10
64K:		         0   0  12   |    0   0  10
128K:		         0   0  12   |    0   0  10
256K:		         2   0  12   |    0   0  10
512K:		         1   0  12   |    0   0  10
1M:		     16367  87 100   | 16384  89 100

Also, I collected SFA-side statistics. We can see what I/O sizes are coming from the host.

----------------------------------------------------
Length           Port 0                 Port 1                 
Kbytes      Reads      Writes      Reads      Writes
----------------------------------------------------
     4          0           0       1211        2775 &amp;lt;------ many 4K I/O
     8          0           0          0          10
    12          0           0          0          15
    16          0           0          0           7
    20          0           0          0          14
    24          0           0          0         183
    28          0           0          0         546
    32          0           0          0         325
    36          0           0          0          36
    40          0           0          0           1
    44          0           0          0           4
    52          0           0          0           1
   208          0           0          1           0
   240          0           0          1           0
   416          0           0          1           0
   604          0           0          0           1
   640          0           0          1           0
   900          0           0          1           0
  1020          0           0       2297        1953 &amp;lt;------ many not aligned IO
  1024          0           0       9528       10484
  1028          0           0       1021         431 &amp;lt;------
  2048          0           0       1149        1150
  2052          0           0         62          88 &amp;lt;------
  3072          0           0        231         230 &amp;lt;------
  3076          0           0         11          14
  4092          0           0         86          69
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
# lctl set_param obdfilter.*.readcache_max_filesize=18446744073709551615
obdfilter.lustre-OST0000.readcache_max_filesize=18446744073709551615

# nobjlo=2 nobjhi=2 thrlo=128 thrhi=128 tests_str=&quot;write read&quot; /usr/bin/obdfilter-survey
Sat Jun 18 21:17:25 JST 2011 Obdfilter-survey for case=disk from r01
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write  616.80 [ 469.54, 772.31] read  827.44 [ 767.28, 858.09] 
done!

# cat /proc/fs/lustre/obdfilter/lustre-OST0000/brw_stats 
snapshot_time:         1308399550.63654 (secs.usecs)

                           read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs  % cum %
256:		     16317 100 100   | 16384 100 100

                           read      |     write
discontiguous pages    rpcs  % cum % |  rpcs  % cum %
0:		     16317 100 100   | 16384 100 100

                           read      |     write
discontiguous blocks   rpcs  % cum % |  rpcs  % cum %
0:		     16317 100 100   | 16384 100 100

                           read      |     write
disk fragmented I/Os   ios   % cum % |  ios   % cum %
1:		     16203  99  99   | 16364  99  99
2:		       114   0 100   |   20   0 100

                           read      |     write
disk I/Os in flight    ios   % cum % |  ios   % cum %
1:		         4   0   0   |    4   0   0
2:		         2   0   0   |    5   0   0
3:		         1   0   0   |    4   0   0
4:		         1   0   0   |    6   0   0
5:		         1   0   0   |    3   0   0
6:		         1   0   0   |    5   0   0
7:		         3   0   0   |    2   0   0
8:		         7   0   0   |    6   0   0
9:		         4   0   0   |    4   0   0
10:		         3   0   0   |    3   0   0
11:		         2   0   0   |    7   0   0
12:		         3   0   0   |    5   0   0
13:		         4   0   0   |    6   0   0
14:		         3   0   0   |    8   0   0
15:		         3   0   0   |   11   0   0
16:		         6   0   0   |   19   0   0
17:		         6   0   0   |   26   0   0
18:		         3   0   0   |   30   0   0
19:		         3   0   0   |   43   0   1
20:		         3   0   0   |   52   0   1
21:		         4   0   0   |   54   0   1
22:		         2   0   0   |   51   0   2
23:		         2   0   0   |   53   0   2
24:		         4   0   0   |   69   0   2
25:		         2   0   0   |   75   0   3
26:		         4   0   0   |   65   0   3
27:		         8   0   0   |   74   0   4
28:		         3   0   0   |   87   0   4
29:		         3   0   0   |   94   0   5
30:		         8   0   0   |   94   0   5
31:		     16328  99 100   | 15439  94 100

                           read      |     write
I/O time (1/1000s)     ios   % cum % |  ios   % cum %
1:		         0   0   0   |    7   0   0
2:		         0   0   0   |    1   0   0
4:		         3   0   0   |    0   0   0
8:		         1   0   0   |    7   0   0
16:		        22   0   0   |   14   0   0
32:		        71   0   0   |  117   0   0
64:		       993   6   6   | 1362   8   9
128:		      4603  28  34   | 8467  51  60
256:		      9657  59  94   | 6227  38  98
512:		       927   5  99   |  182   1 100
1K:		        40   0 100   |    0   0 100

                           read      |     write
disk I/O size          ios   % cum % |  ios   % cum %
4K:		       114   0   0   |   20   0   0
8K:		         0   0   0   |    0   0   0
16K:		         0   0   0   |    0   0   0
32K:		         0   0   0   |    0   0   0
64K:		         0   0   0   |    0   0   0
128K:		         0   0   0   |    0   0   0
256K:		         0   0   0   |    0   0   0
512K:		         0   0   0   |    0   0   0
1M:		     16317  99 100   | 16384  99 100


SFA statistics

----------------------------------------------------
Length           Port 0                 Port 1                 
Kbytes      Reads      Writes      Reads      Writes
----------------------------------------------------
     4          0           0         42         823
     8          0           0          0           5
    12          0           0          0           1
    16          0           0          0           1
    20          0           0          0           9
    24          0           0          0          71
    28          0           0          0         309
    32          0           0          0         234
    36          0           0          0          25
    40          0           0          0           2
    44          0           0          0           1
    52          0           0          0           1
    56          0           0          0           1
    60          0           0          0           1
  1020          0           0        114          20 &amp;lt;------ only 20 times
  1024          0           0       2148        3246 
  1028          0           0         45           3
  2048          0           0       1062        1004
  2052          0           0         11           0
  3072          0           0        821         583
  3076          0           0          5           1
  4092          0           0       1686        1519
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="16566" author="di.wang" created="Sun, 19 Jun 2011 18:39:44 +0000"  >&lt;p&gt;Thanks for the information. It seems the IO is somehow fragmented with max_filesize=8388608. Usually that means the extent allocation is not contiguous on disk (mballoc does not perform well in this case).&lt;/p&gt;

&lt;p&gt;                           read      |     write&lt;br/&gt;
disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
1:		     14073  85  85   | 14431  88  88&lt;br/&gt;
2:		      2297  14 100   | 1953  11 100&lt;/p&gt;

&lt;p&gt;Hmm, I do not understand why it is related to max_filesize. I will investigate deeper to see what is going on here. Thanks.&lt;/p&gt;</comment>
                            <comment id="16651" author="di.wang" created="Mon, 20 Jun 2011 23:25:50 +0000"  >&lt;p&gt;It seems the problem is that MAX_HW_SEGMENTS of the DDN SFA 10K is &amp;lt; 256?  It is usually 128 for most devices; Lustre actually has a kernel patch to change that value, blkdev_tunables-2.6-rhel5.patch. Unfortunately, that patch does not work for all devices. Could you please check what your max_hw_segments setting was in your test?&lt;/p&gt;

&lt;p&gt;If it is &amp;lt; 256, then the Lustre server might create some fragmented IO here, especially when pages are fragmented, i.e. not physically contiguous. (For example, if max_hw_segments is 128, then to create a 1M IO (256 pages) you would need at least half of the 256 pages to be contiguous; otherwise the IO will be split into 2 smaller IOs.)&lt;/p&gt;

&lt;p&gt;This also explains why you see less fragmented IO with a big read cache: those pages are not created/released frequently, so there is less chance of fragmenting pages, i.e. pages are more often physically contiguous in this case.&lt;/p&gt;</comment>
                            <comment id="16655" author="di.wang" created="Tue, 21 Jun 2011 01:56:35 +0000"  >&lt;p&gt;To prove this idea I did a test on a SATA drive. Default hw_max_segments = 128.&lt;/p&gt;


&lt;p&gt;with read_cache_size = 18446744073709551615&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode obdfilter-survey&amp;#93;&lt;/span&gt;# cat /proc/fs/lustre/obdfilter/lustre-OST0001/readcache_max_filesize &lt;br/&gt;
18446744073709551615&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode obdfilter-survey&amp;#93;&lt;/span&gt;# nobjlo=2 nobjhi=2 thrlo=128 thrhi=128 tests_str=&quot;write read&quot; ./obdfilter-survey &lt;br/&gt;
Mon Jun 20 08:03:19 MST 2011 Obdfilter-survey for case=disk from testnode&lt;br/&gt;
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write   82.67 [  72.87,  90.77] read   80.89 [  73.86,  87.93] &lt;br/&gt;
done!&lt;/p&gt;

&lt;p&gt;root@testnode obdfilter-survey]# cat /proc/fs/lustre/obdfilter/lustre-OST0001/brw_stats&lt;/p&gt;

&lt;p&gt;.....&lt;/p&gt;

&lt;p&gt;disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
0:		        17   0   0   |    0   0   0&lt;br/&gt;
1:		     14950  91  91   | 15958  97  97&lt;br/&gt;
2:		      1385   8 100   |  426   2 100&lt;br/&gt;
......&lt;/p&gt;

&lt;p&gt;With read_cache_size = 8388608&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode obdfilter-survey&amp;#93;&lt;/span&gt;# cat /proc/fs/lustre/obdfilter/lustre-OST0001/readcache_max_filesize &lt;br/&gt;
8388608&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode obdfilter-survey&amp;#93;&lt;/span&gt;# nobjlo=2 nobjhi=2 thrlo=128 thrhi=128 tests_str=&quot;write read&quot; ./obdfilter-survey &lt;br/&gt;
Mon Jun 20 07:37:39 MST 2011 Obdfilter-survey for case=disk from testnode&lt;br/&gt;
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write   72.44             SHORT read   78.38 [  56.95,  83.93] &lt;br/&gt;
done!&lt;/p&gt;

&lt;p&gt;......&lt;br/&gt;
                           read      |     write&lt;br/&gt;
disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
0:		         3   0   0   |    0   0   0&lt;br/&gt;
1:		      7108  91  91   | 10630  64  64&lt;br/&gt;
2:		       663   8 100   | 5754  35 100&lt;/p&gt;

&lt;p&gt;......&lt;/p&gt;

&lt;p&gt;Then I applied a patch to change the default hw_max_segments to 256:&lt;/p&gt;

&lt;pre&gt;--- include/linux/ata.h.old     2011-06-21 06:44:26.000000000 -0700
+++ include/linux/ata.h 2011-06-21 05:40:11.000000000 -0700
@@ -38,7 +38,8 @@
 enum {
        /* various global constants */
        ATA_MAX_DEVICES         = 2,    /* per bus/port */
-       ATA_MAX_PRD             = 256,  /* we could make these 256/256 */
+       //ATA_MAX_PRD           = 256,  /* we could make these 256/256 */
+       ATA_MAX_PRD             = 512,  /* we could make these 256/256 */
        ATA_SECT_SIZE           = 512,
        ATA_MAX_SECTORS_128     = 128,
        ATA_MAX_SECTORS         = 256,&lt;/pre&gt;


&lt;p&gt;Then I redid the test: &lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode lustre&amp;#93;&lt;/span&gt;# cat /proc/fs/lustre/obdfilter/lustre-OST0000/readcache_max_filesize &lt;br/&gt;
8388608&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@testnode obdfilter-survey&amp;#93;&lt;/span&gt;# &lt;br/&gt;
Tue Jun 21 06:10:26 MST 2011 Obdfilter-survey for case=disk from testnode&lt;br/&gt;
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write &lt;br/&gt;
  81.05             SHORT read   81.21 [  68.88,  88.93] &lt;br/&gt;
done!&lt;/p&gt;

&lt;p&gt;root@testnode lustre]# cat /proc/fs/lustre/obdfilter/lustre-OST0000/brw_stats&lt;/p&gt;

&lt;p&gt;......&lt;br/&gt;
                           read      |     write&lt;br/&gt;
disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
0:		         3   0   0   |    0   0   0&lt;br/&gt;
1:		     16378  99 100   | 16384 100 100&lt;br/&gt;
.......&lt;/p&gt;





</comment>
                            <comment id="16668" author="ihara" created="Tue, 21 Jun 2011 10:46:18 +0000"  >&lt;p&gt;That&apos;s what I was also thinking. The current OSS-to-SFA connection is SRP (SCSI RDMA Protocol) over QDR. The maximum number of scatter/gather entries per I/O in SRP is 255 (the default is 12). We can set this parameter with srp_sg_tablesize in the ib_srp module, but it only goes up to 255 descriptors. This means that, in order to send a 1M I/O to the SFA10K, the OSS sends two requests to the SFA - one of 1020K (4K x 255 descriptors) and another of 4K.&lt;/p&gt;

&lt;p&gt;My understanding is that normally the SFA10K receives the two requests (1020K + 4K) as a single I/O request and handles it as a full-stripe I/O, but if the OSS&apos;s memory is heavily used, the two requests get fragmented on their way to the SFA10K and are handled as separate I/O requests. In this situation, we saw many 1020K and 4K requests on the SFA10K.&lt;/p&gt;

&lt;p&gt;To prevent this situation, we set vm.min_free_kbytes to keep free memory available and avoid fragmenting the two requests, but it&apos;s not perfect.&lt;/p&gt;

&lt;p&gt;However, it didn&apos;t help with the issue we see when readcache_max_filesize=8388608. Anyway, the two issues might be related, both caused by srp_sg_tablesize=255.&lt;/p&gt;

&lt;p&gt;I will try the same testing on an SFA10K with 8Gbps FC (Fibre Channel), which can support sending an actual 1M I/O (scatter/gather of 256 entries); then let&apos;s see what happens.&lt;/p&gt;
</comment>
                            <comment id="16702" author="ezell" created="Tue, 21 Jun 2011 14:38:31 +0000"  >&lt;p&gt;Was this on a RedHat 5 OSS? Linux kernel 2.6.24 merged SG list chaining, which could help the situation. Do you have access to RedHat 6 to test?&lt;/p&gt;</comment>
                            <comment id="16706" author="cliffw" created="Tue, 21 Jun 2011 17:12:34 +0000"  >&lt;p&gt;I think I can now confirm this on Hyperion.&lt;/p&gt;

&lt;p&gt;With # hyperion1154 /root &amp;gt; cat /proc/fs/lustre/obdfilter/lustre-OST0000/readcache_max_filesize  &lt;br/&gt;
18446744073709551615&lt;/p&gt;


&lt;p&gt;0000: Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)&lt;br/&gt;
0000: ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------   -------&lt;br/&gt;
0000: write        1488.54    1466.58     1479.54      9.39     1488.54    1466.58     1479.54      9.39  89.2852&lt;br/&gt;
0000: read         1221.81    1201.83     1210.44      8.39     1221.81    1201.83     1210.44      8.39 109.1356&lt;br/&gt;
0000: &lt;br/&gt;
0000: Max Write: 1488.54 MiB/sec (1560.85 MB/sec)&lt;br/&gt;
0000: Max Read:  1221.81 MiB/sec (1281.16 MB/sec)&lt;br/&gt;
0000: &lt;br/&gt;
0000: Run finished: Tue Jun 21 13:48:15 2011&lt;/p&gt;

&lt;p&gt;Previous with 8M readahead_max_filesize&lt;/p&gt;

&lt;p&gt;0000: Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)Op grep #Tasks tPN reps  fPP reord&lt;br/&gt;
0000: ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------   -------&lt;br/&gt;
0000: write         955.17     944.80      950.17      4.24      955.17     944.80      950.17      4.24 139.02702   1032 8 3 1 1 1 0 0 1 134217728&lt;br/&gt;
0000: read          893.92     876.00      881.99      8.44      893.92     876.00      881.99      8.44 149.78426   1032 8 3 1 1 1 0 0 1 134217728&lt;br/&gt;
0000: &lt;br/&gt;
0000: Max Write: 955.17 MiB/sec (1001.57 MB/sec)&lt;br/&gt;
0000: Max Read:  893.92 MiB/sec (937.34 MB/sec)&lt;br/&gt;
0000: &lt;br/&gt;
0000: Run finished: Sat Jun 18 21:41:28 2011&lt;/p&gt;</comment>
                            <comment id="16725" author="cliffw" created="Tue, 21 Jun 2011 20:09:41 +0000"  >&lt;p&gt;Ah, that is the case on Hyperion:&lt;/p&gt;

&lt;p&gt;options ib_srp srp_sg_tablesize=255&lt;/p&gt;

&lt;p&gt;is set.&lt;/p&gt;</comment>
                            <comment id="16756" author="ihara" created="Wed, 22 Jun 2011 12:16:30 +0000"  >&lt;p&gt;Just tested on an SFA10K (FC model); there was no fragmentation.&lt;br/&gt;
So, if we use SRP between the SFA10K and the OSS, we should at least keep a large readcache_max_filesize until the SFA and the SRP initiator support FMR to send/read large I/O in a single request. My understanding is that this development and improvement are in progress at DDN and ORNL, so this should be supported very soon.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lctl get_param obdfilter.*.readcache_max_filesize
obdfilter.lustre-OST0000.readcache_max_filesize=8388608

# nobjlo=2 nobjhi=2 thrlo=128 thrhi=128 tests_str=&quot;write read&quot; /usr/bin/obdfilter-survey
Thu Jun 23 00:54:32 JST 2011 Obdfilter-survey for case=disk from r13
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write  613.75 [ 534.49, 763.64] read  690.07 [ 630.40, 732.31] 
done!

# cat /proc/fs/lustre/obdfilter/lustre-OST0000/brw_stats 
snapshot_time:         1308758136.955857 (secs.usecs)

                           read      |     write
pages per bulk r/w     rpcs  % cum % |  rpcs  % cum %
16:		         1   0   0   |    0   0   0
32:		         3   0   0   |    0   0   0
64:		         3   0   0   |    0   0   0
128:		         2   0   0   |    0   0   0
256:		     16372  99 100   | 16384 100 100

                           read      |     write
discontiguous pages    rpcs  % cum % |  rpcs  % cum %
0:		     16381 100 100   | 16384 100 100

                           read      |     write
discontiguous blocks   rpcs  % cum % |  rpcs  % cum %
0:		     16381 100 100   | 16384 100 100

                           read      |     write
disk fragmented I/Os   ios   % cum % |  ios   % cum %
1:		     16381 100 100   | 16384 100 100

                           read      |     write
disk I/Os in flight    ios   % cum % |  ios   % cum %
1:		         2   0   0   |    2   0   0
2:		         2   0   0   |    2   0   0
3:		         1   0   0   |    2   0   0
4:		         1   0   0   |    1   0   0
5:		         3   0   0   |    3   0   0
6:		         2   0   0   |    3   0   0
7:		         1   0   0   |    3   0   0
8:		         1   0   0   |    3   0   0
9:		         1   0   0   |    5   0   0
10:		         2   0   0   |    9   0   0
11:		         1   0   0   |    7   0   0
12:		         1   0   0   |    7   0   0
13:		         1   0   0   |    5   0   0
14:		         2   0   0   |    4   0   0
15:		         1   0   0   |    7   0   0
16:		         1   0   0   |    9   0   0
17:		         2   0   0   |   10   0   0
18:		         1   0   0   |   29   0   0
19:		         1   0   0   |  296   1   2
20:		         3   0   0   |  175   1   3
21:		         2   0   0   |   10   0   3
22:		         1   0   0   |    7   0   3
23:		         1   0   0   |   14   0   3
24:		         2   0   0   |    5   0   3
25:		         1   0   0   |    8   0   3
26:		         2   0   0   |    7   0   3
27:		         1   0   0   |   14   0   3
28:		         3   0   0   |   10   0   4
29:		         5   0   0   |   11   0   4
30:		         1   0   0   |    5   0   4
31:		     16332  99 100   | 15711  95 100

                           read      |     write
I/O time (1/1000s)     ios   % cum % |  ios   % cum %
4:		         1   0   0   |    0   0   0
8:		         2   0   0   |    0   0   0
16:		         4   0   0   |   22   0   0
32:		        31   0   0   |  528   3   3
64:		        80   0   0   | 4734  28  32
128:		     11400  69  70   | 4779  29  61
256:		      1685  10  80   | 5893  35  97
512:		       794   4  85   |  225   1  98
1K:		      2384  14 100   |   25   0  98
2K:		         0   0 100   |    0   0  98
4K:		         0   0 100   |   83   0  99
8K:		         0   0 100   |   95   0 100

                           read      |     write
disk I/O size          ios   % cum % |  ios   % cum %
64K:		         1   0   0   |    0   0   0
128K:		         3   0   0   |    0   0   0
256K:		         3   0   0   |    0   0   0
512K:		         2   0   0   |    0   0   0
1M:		     16372  99 100   | 16384 100 100

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;WangDi, &lt;br/&gt;
is there no impact on &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; even if we change readcache_max_filesize from 1.8.6&apos;s default of 8388608 to 18446744073709551615?&lt;/p&gt;</comment>
                            <comment id="16773" author="di.wang" created="Wed, 22 Jun 2011 13:55:18 +0000"  >&lt;p&gt;There might be some impact. The reason we shrank readcache_max_filesize is that if the OSS caches too many pages, some metadata (for example group information) might be swapped out of memory frequently, which is very bad for new extent allocation, especially as the OST becomes full. But you can always tell customers to shrink this value if they see that issue (please check &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; for details). Otherwise, just keeping readcache_max_filesize large might be a temporary solution for this.&lt;/p&gt;</comment>
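The tradeoff described in this comment is tunable at runtime. A minimal sketch, assuming the 1.8.x obdfilter parameter namespace; the values are the ones quoted in this ticket:

```shell
# Inspect the per-file read-cache limit on every OST
lctl get_param obdfilter.*.readcache_max_filesize

# Shrunken LU-15 default: only cache files up to 8 MiB on the OSS
lctl set_param obdfilter.*.readcache_max_filesize=8388608

# 1.8.4/1.8.5 behaviour: effectively unlimited (2^64 - 1)
lctl set_param obdfilter.*.readcache_max_filesize=18446744073709551615
```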
                            <comment id="16776" author="hudson" created="Wed, 22 Jun 2011 14:09:49 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=i686,build_type=client,distro=el5,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; i686,client,el5,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16778" author="hudson" created="Wed, 22 Jun 2011 14:10:23 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=client,distro=el6,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,client,el6,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16780" author="hudson" created="Wed, 22 Jun 2011 14:11:33 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=server,distro=el5,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,server,el5,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16782" author="hudson" created="Wed, 22 Jun 2011 14:13:09 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=client,distro=ubuntu1004,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,client,ubuntu1004,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16784" author="hudson" created="Wed, 22 Jun 2011 14:13:49 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=i686,build_type=server,distro=el5,ib_stack=ofa/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; i686,server,el5,ofa #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16788" author="hudson" created="Wed, 22 Jun 2011 14:21:13 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=client,distro=el5,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,client,el5,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16790" author="hudson" created="Wed, 22 Jun 2011 14:21:39 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=i686,build_type=client,distro=el5,ib_stack=ofa/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; i686,client,el5,ofa #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16792" author="hudson" created="Wed, 22 Jun 2011 14:27:14 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=i686,build_type=client,distro=el6,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; i686,client,el6,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16794" author="hudson" created="Wed, 22 Jun 2011 14:28:55 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=i686,build_type=server,distro=el5,ib_stack=inkernel/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; i686,server,el5,inkernel #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16796" author="hudson" created="Wed, 22 Jun 2011 14:31:30 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=client,distro=el5,ib_stack=ofa/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,client,el5,ofa #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16799" author="hudson" created="Wed, 22 Jun 2011 14:52:34 +0000"  >&lt;p&gt;Integrated in &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;http://newbuild.whamcloud.com/images/16x16/blue.png&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;a href=&quot;http://newbuild.whamcloud.com/job/lustre-b1_8/./arch=x86_64,build_type=server,distro=el5,ib_stack=ofa/90/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lustre-b1_8 &#187; x86_64,server,el5,ofa #90&lt;/a&gt;&lt;br/&gt;
     &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-410&quot; title=&quot;Performance concern with Shrink file_max_cache_size to alleviate the memory pressure of OST patch for LU-15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-410&quot;&gt;&lt;del&gt;LU-410&lt;/del&gt;&lt;/a&gt; Revert &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; slow IO with read intense application&lt;/p&gt;

&lt;p&gt;Johann Lombardi : &lt;a href=&quot;http://git.whamcloud.com/gitweb?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b1_8&amp;amp;a=commit&amp;amp;h=ec54d726360ddd09f3fa7489535bdbf9875e4306&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ec54d726360ddd09f3fa7489535bdbf9875e4306&lt;/a&gt;&lt;br/&gt;
Files : &lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lustre/ChangeLog&lt;/li&gt;
	&lt;li&gt;lustre/obdfilter/filter_internal.h&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="16801" author="spitzcor" created="Wed, 22 Jun 2011 14:54:30 +0000"  >&lt;p&gt;Di Wang wrote:&lt;br/&gt;
&quot;This also explain why you see less fragmented IO with big readcache, because these pages are not being create/release frequently, so it will have less chance to fragment pages, i.e. pages are more physically contiguous in this case.&quot;&lt;/p&gt;

&lt;p&gt;So, is the theory that the call into the kernel to truncate_inode_pages_range(), which releases pages one at a time, causes memory to become quickly fragmented?  If true, then setting the readcache_max_filesize=1GiB and running the same testcase should hopefully result in more 1MiB I/Os from the SRP initiator.  Can we easily prove that the memory an OSS acquires for bulk read/write data is less physically fragmented when readcache_max_filesize=-1?&lt;/p&gt;</comment>
                            <comment id="16803" author="di.wang" created="Wed, 22 Jun 2011 15:51:07 +0000"  >&lt;p&gt;Yes, getting and releasing pages frequently (as truncate_inode_pages_range does) will fragment memory. Actually, I had hoped to find an API to allocate contiguous pages for bulk read/write, but there seems to be no such API.&lt;/p&gt;

&lt;p&gt;Yes, if you set readcache_max_filesize=1G, you should expect more 1MB IO (though the test will write 16G to each object). So yes, if you set readcache_max_filesize=-1, you should expect less fragmented pages, IMHO.&lt;/p&gt;</comment>
                            <comment id="16804" author="spitzcor" created="Wed, 22 Jun 2011 15:57:08 +0000"  >&lt;p&gt;From the description John Salinas wrote:&lt;br/&gt;
&quot;Disabling the read cache (lctl set_param=obdfilter.*.read_cache_enable=0) doesn&apos;t help which is still very strange to me.&quot;&lt;/p&gt;

&lt;p&gt;Did you disable the writethrough_cache as well?  If you kept the writethrough cache enabled and performed writes then they would still stay in the cache as filter_release_cache() wouldn&apos;t be called before returning from filter_commitrw_write().&lt;/p&gt;</comment>
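As this comment notes, the OSS read cache and the writethrough cache are independent switches, so disabling only the read cache still lets writes populate the page cache. A hedged sketch of disabling both, assuming the 1.8.x obdfilter parameter names:

```shell
# Check both cache switches on the obdfilter
lctl get_param obdfilter.*.read_cache_enable
lctl get_param obdfilter.*.writethrough_cache_enable

# Disable both so neither reads nor writes leave pages pinned in the OSS cache
lctl set_param obdfilter.*.read_cache_enable=0
lctl set_param obdfilter.*.writethrough_cache_enable=0
```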
                            <comment id="16808" author="cliffw" created="Wed, 22 Jun 2011 17:09:52 +0000"  >&lt;p&gt;Tested -rc3 on hyperion, looks better&lt;/p&gt;

&lt;p&gt;0000: Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize&lt;br/&gt;
0000: ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------   -------&lt;br/&gt;
0000: write        1372.40    1340.43     1359.45     13.74     1372.40    1340.43     1359.45     13.74  97.17835   1032 8 3 1 1 1 0 0 1 134217728 1048576 138512695296 -1 POSIX EXCEL&lt;br/&gt;
0000: read         1161.01    1055.91     1092.53     48.46     1161.01    1055.91     1092.53     48.46 121.13945   1032 8 3 1 1 1 0 0 1 134217728 1048576 138512695296 -1 POSIX EXCEL&lt;br/&gt;
0000:&lt;br/&gt;
0000: Max Write: 1372.40 MiB/sec (1439.07 MB/sec)&lt;br/&gt;
0000: Max Read:  1161.01 MiB/sec (1217.41 MB/sec)&lt;br/&gt;
0000:&lt;br/&gt;
0000: Run finished: Wed Jun 22 13:47:55 2011&lt;br/&gt;
Full system file per process MPIIO IOR&lt;br/&gt;
&amp;#8211;&lt;br/&gt;
0000: Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize&lt;br/&gt;
0000: ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------   -------&lt;br/&gt;
0000: write        1331.43    1317.12     1322.00      6.67     1331.43    1317.12     1322.00      6.67  99.92368   1032 8 3 0 1 1 0 0 1 134217728 1048576 138512695296 -1 POSIX EXCEL&lt;br/&gt;
0000: read         1029.69    1024.16     1026.19      2.48     1029.69    1024.16     1026.19      2.48 128.72528   1032 8 3 0 1 1 0 0 1 134217728 1048576 138512695296 -1 POSIX EXCEL&lt;br/&gt;
0000:&lt;br/&gt;
0000: Max Write: 1331.43 MiB/sec (1396.11 MB/sec)&lt;br/&gt;
0000: Max Read:  1029.69 MiB/sec (1079.70 MB/sec)&lt;/p&gt;</comment>
                            <comment id="20008" author="adilger" created="Wed, 7 Sep 2011 17:07:07 +0000"  >&lt;p&gt;Di, I had occasion to look at this bug again, and one idea I had was to try allocating order-1 pages (i.e. 8kB chunks) until that fails, and only then fall back to order-0 (4kB) allocations. Even getting a single 8kB allocation per IO would be enough to keep page fragmentation from overflowing the 255-segment limit for SRP.&lt;/p&gt;

&lt;p&gt;Also, it would be interesting to watch the page allocation statistics on a system that is suffering from this problem to see if there are many 8kB pages available, and the only reason that fragmented 4kB pages are being used is because they are no longer being pinned by the read cache for a long time.&lt;/p&gt;</comment>
                            <comment id="20029" author="di.wang" created="Wed, 7 Sep 2011 20:53:37 +0000"  >&lt;p&gt;Ah, this is a good idea, I will cook a patch then. From what I see, there are almost no contiguous pages at that time. I will try to get page allocation statistics with the patch.&lt;/p&gt;</comment>
                            <comment id="20203" author="di.wang" created="Wed, 14 Sep 2011 01:42:55 +0000"  >&lt;p&gt;Andreas, I just cooked a patch to use alloc_pages to allocate order-1 pages for niobuf, i.e. try to allocate 2 contiguous pages each time in filter_preprw_read/write. It indeed helped to avoid fragmented IO here.&lt;/p&gt;

&lt;p&gt;Here are two results from obdfilter_survey, in both cases the backend max_hw_segments = 128, (&amp;lt;256)&lt;/p&gt;

&lt;p&gt;1. Without the patch, &lt;br/&gt;
Wed Sep 14 02:20:12 MST 2011 Obdfilter-survey for case=disk from testnode&lt;br/&gt;
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write   89.10 [  74.92,  93.92] read   83.69 [  71.94,  92.92] &lt;/p&gt;


&lt;p&gt;brw_stats&lt;br/&gt;
....&lt;br/&gt;
                           read      |     write&lt;br/&gt;
disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
0:                       3   0   0   |    0   0   0&lt;br/&gt;
1:                    9557  58  58   | 11139  67  67&lt;br/&gt;
2:                    6817  41 100   | 5245  32 100&lt;br/&gt;
................&lt;/p&gt;

&lt;p&gt;2. with the patch&lt;/p&gt;

&lt;p&gt;Wed Sep 14 03:11:24 MST 2011 Obdfilter-survey for case=disk from testnode&lt;br/&gt;
ost  1 sz 16777216K rsz 1024K obj    2 thr  128 write   89.58 [  80.93,  93.83] read   86.26 [  76.94,  91.92] &lt;/p&gt;

&lt;p&gt;brw_stats&lt;br/&gt;
........&lt;br/&gt;
                           read      |     write&lt;br/&gt;
disk fragmented I/Os   ios   % cum % |  ios   % cum %&lt;br/&gt;
0:                       3   0   0   |    0   0   0&lt;br/&gt;
1:                   15739  96  96   | 15967  97  97&lt;br/&gt;
2:                     641   3 100   |  417   2 100&lt;br/&gt;
........&lt;/p&gt;


&lt;p&gt;Though the performance does not improve much, the patch did help to avoid fragmented IO.&lt;/p&gt;

&lt;p&gt;I posted the patch here &lt;a href=&quot;http://review.whamcloud.com/#change,1377&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,1377&lt;/a&gt;, but the implementation might be a little hacky, since the whole data stack on the server side works with single pages. So even though we allocate 2 contiguous pages (8k) each time (in filter_preprw_read/write), we still need to handle each page individually in other functions. Also, the kernel seems to fully initialize only the &quot;first&quot; page in alloc_pages(order &amp;gt;= 1), so we have to initialize the following pages ourselves (for example _count and flags) before we can add all pages to the cache. The patch also needs to export the kernel API add_to_page_cache_lru.&lt;/p&gt;
</comment>
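The order-1-first allocation strategy behind the patch above can be sketched as a userspace toy model. This is not the kernel code: a hypothetical try_alloc(order) stands in for alloc_pages(), returning page numbers or None on failure, and the niobuf allocator prefers order-1 chunks (two contiguous pages), falling back to order-0 singles once contiguous pairs run out.

```python
def alloc_niobuf_pages(npages, try_alloc):
    """Collect npages pages, preferring order-1 (contiguous-pair)
    allocations and falling back to order-0 single pages."""
    pages = []
    remaining = npages
    while remaining:
        chunk = None
        if remaining != 1:
            chunk = try_alloc(1)      # order-1: two contiguous pages
        if chunk is None:
            chunk = try_alloc(0)      # fall back to order-0: one page
        if chunk is None:
            raise MemoryError("page allocation failed")
        pages.extend(chunk)
        remaining -= len(chunk)
    return pages

def make_allocator(free_pairs, free_singles):
    """Toy buddy allocator: a fixed budget of contiguous pairs and of
    single pages, handing out page numbers sequentially."""
    state = {"pairs": free_pairs, "singles": free_singles, "next": 0}
    def try_alloc(order):
        if order == 1 and state["pairs"]:
            state["pairs"] -= 1
            base = state["next"]
            state["next"] += 2
            return [base, base + 1]
        if order == 0 and state["singles"]:
            state["singles"] -= 1
            base = state["next"]
            state["next"] += 1
            return [base]
        return None
    return try_alloc

# With only 3 free pairs, an 8-page niobuf gets 3 contiguous pairs + 2 singles
alloc = make_allocator(free_pairs=3, free_singles=10)
print(alloc_niobuf_pages(8, alloc))   # prints [0, 1, 2, 3, 4, 5, 6, 7]
```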
                            <comment id="22554" author="ihara" created="Sun, 6 Nov 2011 01:00:31 +0000"  >&lt;p&gt;Di,&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#change,1377&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,1377&lt;/a&gt;, would you please make a patch for 2.x for testing? I&apos;m seeing more non-aligned IOs with lustre-2.1 on RHEL6, even with readcache_max_filesize=18446744073709551615 and vm.min_free_kbytes=2097152, which was one of the workarounds on RHEL5.x.&lt;/p&gt;

&lt;p&gt;I&apos;m still having a look at the current behavior on RHEL6, but I just want to try your patch with lustre-2.1 to see if there are any differences.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;</comment>
                            <comment id="25031" author="di.wang" created="Tue, 20 Dec 2011 18:46:19 +0000"  >&lt;p&gt;Sorry, Ihara&lt;/p&gt;

&lt;p&gt;Just saw this message. Yes, I am working on the patch for 2.x now. I was wondering whether there is any difference between RHEL6 and RHEL5 in this area.&lt;/p&gt;</comment>
                            <comment id="25111" author="ihara" created="Wed, 21 Dec 2011 21:00:28 +0000"  >&lt;p&gt;WangDi,&lt;/p&gt;

&lt;p&gt;did you submit the patches? I wonder if I could test them with 2.x.&lt;/p&gt;</comment>
                            <comment id="25112" author="di.wang" created="Wed, 21 Dec 2011 21:03:10 +0000"  >&lt;p&gt;Oh, not yet. I am working on it now. I will let you know once the patch is ready.&lt;/p&gt;</comment>
                            <comment id="25302" author="di.wang" created="Fri, 30 Dec 2011 19:07:22 +0000"  >&lt;blockquote&gt;
&lt;p&gt;did you submit the patches? I wonder if I could test them with 2.x.&lt;/p&gt;&lt;/blockquote&gt; 

&lt;p&gt;Please try this &lt;a href=&quot;http://review.whamcloud.com/#change,1881&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,1881&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="43020" author="kitwestneat" created="Fri, 10 Aug 2012 11:22:14 +0000"  >&lt;p&gt;We haven&apos;t seen this issue again, and probably won&apos;t have time to do any testing, so this one can be closed. &lt;/p&gt;</comment>
                            <comment id="43021" author="pjones" created="Fri, 10 Aug 2012 11:30:23 +0000"  >&lt;p&gt;ok thanks Kit!&lt;/p&gt;</comment>
                            <comment id="43029" author="spitzcor" created="Fri, 10 Aug 2012 12:17:11 +0000"  >&lt;p&gt;Kit, is that because your SRP initiator can construct and send I/O w/256 fragments?&lt;/p&gt;</comment>
                            <comment id="43031" author="kitwestneat" created="Fri, 10 Aug 2012 12:32:56 +0000"  >&lt;p&gt;Cory, Ihara said that after reverting the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; patch, he hasn&apos;t seen it.&lt;/p&gt;</comment>
                            <comment id="43032" author="spitzcor" created="Fri, 10 Aug 2012 13:00:06 +0000"  >&lt;p&gt;Ah, then there really is still an issue, right?  At minimum, one cannot configure the cache&apos;s max file size lower to reduce cache waste without re-introducing fragmented I/O.  There were multiple fixes for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; though.  Maybe this is a question best suited as a comment to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt;, but is the ldiskfs metadata eviction still a concern if readcache_max_filesize is not reduced?  That is, were the other &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; changes sufficient to resolve that issue?  I would suppose that it is, because &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt; is closed, but I would like to make sure.&lt;/p&gt;</comment>
                            <comment id="43467" author="ihara" created="Sat, 18 Aug 2012 09:26:22 +0000"  >&lt;p&gt;No more fragmented I/O on new OFED-3.x and the RHEL6-based OFED, since ib_srp supports indirect_sg_entries.&lt;/p&gt;</comment>
                            <comment id="47552" author="mhanafi" created="Wed, 7 Nov 2012 20:41:34 +0000"  >&lt;p&gt;Shuichi Ihara, could you please post your srp module options?&lt;br/&gt;
What values do you have for indirect_sg_entries, cmd_sg_entries, and allow_ext_sg?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Mahmoud&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="12655">LU-918</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="55163">LU-12071</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10040" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic</customfieldname>
                        <customfieldvalues>
                                        <label>performance</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvsnz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8548</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>