<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:21:31 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2001] read operation is slow when mmap is enabled</title>
                <link>https://jira.whamcloud.com/browse/LU-2001</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;There is an application running on Lustre.&lt;/p&gt;

&lt;p&gt;When we run this application through the Lustre client directly, it is really slow and the application seems to stall.&lt;br/&gt;
However, if the Lustre client exports the Lustre filesystem via NFS and mounts it on itself, so that the application does its IO to Lustre through NFS, the performance seems reasonable.&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;mount -t lustre&lt;br/&gt;
192.168.100.131@tcp:/lustre on /lustre type lustre (rw)&lt;/li&gt;
	&lt;li&gt;exportfs&lt;br/&gt;
/lustre       	&amp;lt;world&amp;gt;&lt;/li&gt;
	&lt;li&gt;mount -t nfs&lt;br/&gt;
localhost:/lustre on /lustre-nfs type nfs (rw,addr=127.0.0.1)&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The application performs the following operations.&lt;br/&gt;
1. 4 jobs (instances of the same application) run on a single client, and these jobs read a single shared input file. The file is opened with open() and mapped with mmap().&lt;br/&gt;
2. Each job computes on the data and writes its output file separately. The block size is 8k for read() and write().&lt;/p&gt;

&lt;p&gt;Any ideas why the application is fast when it goes through the NFS layer to Lustre?&lt;/p&gt;</description>
                <environment>lustre-1.8.8 CentOS5.8</environment>
        <key id="16059">LU-2001</key>
            <summary>read operation is slow when mmap is enabled</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="jay">Jinshan Xiong</assignee>
                                    <reporter username="ihara">Shuichi Ihara</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 21 Sep 2012 07:11:00 +0000</created>
                <updated>Thu, 8 Feb 2018 18:26:39 +0000</updated>
                            <resolved>Thu, 8 Feb 2018 18:26:39 +0000</resolved>
                                    <version>Lustre 1.8.8</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="45335" author="pjones" created="Fri, 21 Sep 2012 08:36:51 +0000"  >&lt;p&gt;Ihara&lt;/p&gt;

&lt;p&gt;Is this different behaviour compared to earlier 1.8.x releases? If not, could this be a known issue?&lt;/p&gt;

&lt;p&gt;Jinshan&lt;/p&gt;

&lt;p&gt;You did the performance improvements for mmap that went into 2.2 (under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-884&quot; title=&quot;Client In-Memory Data Checksum&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-884&quot;&gt;&lt;del&gt;LU-884&lt;/del&gt;&lt;/a&gt;) so can you comment?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="45358" author="jay" created="Fri, 21 Sep 2012 13:07:55 +0000"  >&lt;p&gt;This seems like a readahead issue.&lt;/p&gt;

&lt;p&gt;Ihara, does the application use mmap to read the file and write it via the VFS interface? If so, we can try to reproduce this by issuing mmap reads only.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You did the performance improvements for mmap that went into 2.2 (under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-884&quot; title=&quot;Client In-Memory Data Checksum&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-884&quot;&gt;&lt;del&gt;LU-884&lt;/del&gt;&lt;/a&gt;) so can you comment?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;It smells unrelated. If my guess above is wrong, I will verify this.&lt;/p&gt;</comment>
                            <comment id="45374" author="ihara" created="Fri, 21 Sep 2012 21:32:43 +0000"  >&lt;p&gt;Jinshan,&lt;br/&gt;
I believe so.&lt;br/&gt;
Any good idea for collecting data to confirm what you said, i.e. that it writes via the VFS interface?&lt;/p&gt;</comment>
                            <comment id="45397" author="jay" created="Sat, 22 Sep 2012 12:23:58 +0000"  >&lt;p&gt;This could be an NFS or a Lustre problem. Please show me the stats of:&lt;br/&gt;
1. llite.*.read_ahead_stats&lt;br/&gt;
2. llite.&amp;#42;.extents_stats and llite.&amp;#42;.extents_stats_per_process&lt;br/&gt;
3. osc.*.rpc_stats&lt;br/&gt;
4. nfsiostat on the client node(nfs server)&lt;/p&gt;</comment>
                            <comment id="45400" author="ihara" created="Sun, 23 Sep 2012 19:53:04 +0000"  >&lt;p&gt;Jinshan,&lt;/p&gt;

&lt;p&gt;The job needs a long time to finish, so I started 8 jobs on a single client and collected statistics only for the first 1800 sec. Even in the first 1800 sec we can see the performance difference between NFS on Lustre and native Lustre.&lt;/p&gt;

&lt;p&gt;There are two test cases (1x OSS, 1x OST and 1x client).&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;1. Native lustre client
2. NFS on Lustre (the Lustre client mounts Lustre and exports it via NFS; the same client then mounts it via NFS as a loopback mount)

192.168.100.126@tcp:/lustre
                     15283944352  68657584 14450657920   1% /lustre
localhost:/lustre    15283944448  68657152 14450658304   1% /lustre-nfs

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here are the output files after 1800 sec. A 2.3GB file per job is created when the application runs through the native Lustre client, but a 7.4GB file per job is created when the application goes through NFS to Lustre.&lt;/p&gt;

&lt;p&gt;Native Lustre&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat lustre.8/output.log 
total 18161696
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18855.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18858.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18861.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18864.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18867.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18870.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18873.out
-rw-r--r-- 1 root root 2325741568 Sep 24 03:01 18876.out
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;NFS on Lustre&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat nfs_on_lustre.8/output.log 
total 58710112
-rw-r--r-- 1 root root 7584350208 Sep 24 02:28 17205.out
-rw-r--r-- 1 root root 7462715392 Sep 24 02:28 17208.out
-rw-r--r-- 1 root root 7462715392 Sep 24 02:28 17211.out
-rw-r--r-- 1 root root 7462715392 Sep 24 02:28 17214.out
-rw-r--r-- 1 root root 7523532800 Sep 24 02:28 17217.out
-rw-r--r-- 1 root root 7523532800 Sep 24 02:28 17220.out
-rw-r--r-- 1 root root 7643070464 Sep 24 02:28 17223.out
-rw-r--r-- 1 root root 7523532800 Sep 24 02:28 17226.out
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see the Lustre statistics as well. With native Lustre, it is mostly 1 page per RPC.&lt;/p&gt;

&lt;p&gt;Lustre&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;			read			write
pages per rpc         rpcs   % cum % |       rpcs   % cum %
1:		   4973077  99  99   |          0   0   0
2:		      9224   0  99   |          0   0   0
4:		      1216   0  99   |          1   0   0
8:		       949   0  99   |          0   0   0
16:		      1081   0  99   |          0   0   0
32:		      1255   0  99   |          0   0   0
64:		      1099   0  99   |          0   0   0
128:		       600   0  99   |          3   0   0
256:		     16625   0 100   |      17736  99 100
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;NFS on Lustre&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;			read			write
pages per rpc         rpcs   % cum % |       rpcs   % cum %
1:		      2394   7   7   |          0   0   0
2:		       812   2   9   |          0   0   0
4:		       935   2  12   |          1   0   0
8:		      1111   3  16   |          1   0   0
16:		      1624   5  21   |          1   0   0
32:		      2283   7  28   |          1   0   0
64:		      3083   9  37   |          0   0   0
128:		      4324  13  51   |          0   0   0
256:		     15755  48 100   |      57389  99 100
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
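The pages-per-RPC histograms above can be summarized numerically. A minimal shell sketch, which assumes the trimmed two-column `bucket: count` form shown above rather than the exact lctl output layout (the counts below are copied from the native-Lustre read column):

```shell
# Compute the share of read RPCs that carried only a single page, from the
# "pages per rpc" histogram of osc.*.rpc_stats (read column, native Lustre).
rpc_stats='1: 4973077
2: 9224
4: 1216
8: 949
16: 1081
32: 1255
64: 1099
128: 600
256: 16625'

# Sum all buckets; remember the 1-page bucket; print its percentage.
echo "$rpc_stats" | awk '
  { total += $2; if ($1 == "1:") single = $2 }
  END { printf "%.1f%% of read RPCs were single-page\n", 100 * single / total }'
```

With native Lustre roughly 99% of read RPCs carry a single page, versus about 7% in the NFS-on-Lustre run, which matches the performance gap described above.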
                            <comment id="45461" author="jay" created="Mon, 24 Sep 2012 15:41:51 +0000"  >&lt;p&gt;From the readahead stats in the log file, we can interpret what&apos;s going on:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;llite.lustre-ffff810744e39800.read_ahead_stats=
snapshot_time             1348423289.615683 secs.usecs
hits                      4345662 samples [pages]
misses                    5015426 samples [pages]
readpage not consecutive  5469845 samples [pages]
miss inside window        1812 samples [pages]
failed grab_cache_page    3187455 samples [pages]
read but discarded        15712 samples [pages]
zero size window          6238737 samples [pages]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and nfs over lustre:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;llite.lustre-ffff81054d3c6c00.read_ahead_stats=
snapshot_time             1348421290.346850 secs.usecs
hits                      4263548 samples [pages]
misses                    17453 samples [pages]
readpage not consecutive  335391 samples [pages]
failed grab_cache_page    5578837 samples [pages]
read but discarded        24 samples [pages]
zero size window          113 samples [pages]
hit max r-a issue         3 samples [pages]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Look at the miss rate: native Lustre&apos;s is much higher, which caused one-page RPCs to be issued. The application is accessing the file in a random pattern; see the zero-size-window stats in the native Lustre case.&lt;/p&gt;

&lt;p&gt;(I have not checked the NFS code.) I think there are two levels of readahead in the nfs-over-lustre case: 1. mmap on the NFS client, and 2. vfs_read in Lustre. This makes the readahead a little more aggressive; the side effect is the high rate of &apos;failed grab_cache_page&apos;. I guess the client was short of memory at that time.&lt;/p&gt;

&lt;p&gt;In general, this kind of issue is hard to fix, as what should be blamed is the application itself &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/wink.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;. Anyway, I&apos;ll think about how to improve the readahead algorithm in the mmap case when I have a free moment.&lt;/p&gt;</comment>
                            <comment id="45486" author="ihara" created="Tue, 25 Sep 2012 00:41:54 +0000"  >&lt;p&gt;Jinshan,&lt;/p&gt;

&lt;p&gt;I just tested it on lustre-2.1.3, and it performed as well as the numbers we are seeing over NFS on lustre-1.8.x.&lt;br/&gt;
Do you have any hints for what we can try on lustre-1.8.x to see improved performance?&lt;/p&gt;</comment>
                            <comment id="45489" author="jay" created="Tue, 25 Sep 2012 01:59:34 +0000"  >&lt;p&gt;Interesting.. both the rpc and readahead stats are much better in 2.1.3. I remember there was a readahead bug fixed by wangdi. I&apos;ll update it here after I find it.&lt;/p&gt;</comment>
                            <comment id="45490" author="jay" created="Tue, 25 Sep 2012 02:05:15 +0000"  >&lt;p&gt;Recently there are two readahead patches committed:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-983&quot; title=&quot;On a 1.8.6 client/host, tar is performing 10K reads, and the RPCs are typically one page in size for small files==tar is slow for small files using 1.8.6 vs 1.8.5&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-983&quot;&gt;&lt;del&gt;LU-983&lt;/del&gt;&lt;/a&gt;: f1a4b79e378407e4161c2a922478d625a38452b5&lt;br/&gt;
&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15&quot; title=&quot;strange slow IO messages and bad performance &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15&quot;&gt;&lt;del&gt;LU-15&lt;/del&gt;&lt;/a&gt;:  b7eb1d769cc80216e353c4e4217dfe9070927139&lt;/p&gt;

&lt;p&gt;Can you please check if they are in your branch?&lt;/p&gt;</comment>
                            <comment id="45493" author="ihara" created="Tue, 25 Sep 2012 02:09:56 +0000"  >&lt;p&gt;Hm, these already landed in 1.8.8 (or before 1.8.8). We are using lustre-1.8.8 here, so these patches should be included.&lt;/p&gt;</comment>
                            <comment id="45495" author="di.wang" created="Tue, 25 Sep 2012 02:46:01 +0000"  >&lt;p&gt;Ihara, could you please gather some &quot;+reada&quot; debug logs (maybe reada only: lctl set_param debug=reada) for me during the run? Thanks.&lt;/p&gt;</comment>
                            <comment id="45500" author="ihara" created="Tue, 25 Sep 2012 05:34:12 +0000"  >&lt;p&gt;Wangdi,&lt;br/&gt;
I collected a couple of debug logs while running the application. Please check them.&lt;br/&gt;
Thanks&lt;/p&gt;</comment>
                            <comment id="45541" author="di.wang" created="Tue, 25 Sep 2012 20:14:29 +0000"  >&lt;p&gt;Ihara, I just post a patch here, &lt;a href=&quot;http://review.whamcloud.com/4097&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4097&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Could you please try it? According to the debug log you posted, it seems readahead does not work well with mmap. The reason exporting over NFS works better is that NFS opens the file for every read, which refreshes the readahead state and happens to &quot;correct&quot; the behaviour. I also found a lot of random 4k reads in the debug log; is this expected for this job? And I do not understand why 2.1.3 works better here. So could you please:&lt;/p&gt;

&lt;p&gt;1. set the debug mask to &quot;warning error emerg vfstrace reada console&quot;&lt;/p&gt;

&lt;p&gt;2. run the test over 1.8.8 and 2.1.3.&lt;/p&gt;

&lt;p&gt;and post the debug log, llite.*.read_ahead_stats and osc.*.rpc_stats here. Thanks.&lt;/p&gt;</comment>
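The collection steps requested above might look like the following on the client. This is only a sketch: `run-job.sh` is a placeholder for the actual workload and the output paths are arbitrary; the lctl parameter names are the ones already quoted in this ticket.

```shell
# Sketch of the requested debug collection (run as root on the Lustre client).
lctl set_param debug="warning error emerg vfstrace reada console"
lctl clear                            # empty the kernel debug buffer first

./run-job.sh &                        # placeholder: start the mmap-heavy job
sleep 1800                            # sample the first 1800 s, as above

lctl dk > /tmp/debug_log.txt          # dump the kernel debug buffer
lctl get_param llite.*.read_ahead_stats > /tmp/read_ahead_stats.txt
lctl get_param osc.*.rpc_stats > /tmp/rpc_stats.txt
```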
                            <comment id="45610" author="ihara" created="Wed, 26 Sep 2012 18:37:16 +0000"  >&lt;p&gt;wangdi, yes, there are a few input files for a single job. The application reads these input files at certain points, but it is very random read IO, I believe.&lt;/p&gt;

&lt;p&gt;Anyway, I tested the patch, but it did not help very much. I collected debug logs and statistics for the following versions and uploaded the files under /uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2001&quot; title=&quot;read operation is slow when mmap is enabled&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2001&quot;&gt;&lt;del&gt;LU-2001&lt;/del&gt;&lt;/a&gt;/ on the ftp site. The job is very long, so for testing I killed it after the first 30 mins. I collected debug files at 3 points: 5 mins, 15 mins and 25 mins after the job started.&lt;/p&gt;

&lt;p&gt;centos5.8-lustre-1.8.8&lt;br/&gt;
centos5.8-lustre-1.8.8-patch1 (&lt;a href=&quot;http://review.whamcloud.com/4097&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4097&lt;/a&gt;)&lt;br/&gt;
centos6.3-lustre-1.8.8&lt;br/&gt;
centos6.3-lustre-2.1.3&lt;/p&gt;

&lt;p&gt;The Linux distro listed is the client side; the server side is running lustre-2.1.3 on centos6.3. I also tested lustre-1.8.8 on centos5.8 for the server, but saw little difference. For the 2.1.3 vs 1.8.x testing, I used 2.1.3 on centos6.3 for the server.&lt;/p&gt;

&lt;p&gt;Please check and analyze them.&lt;/p&gt;

&lt;p&gt;Thank you!&lt;/p&gt;</comment>
                            <comment id="45666" author="di.wang" created="Thu, 27 Sep 2012 17:57:44 +0000"  >&lt;p&gt;Ihara, I did not see the 2.1.3 debug log there:&lt;br/&gt;
wangdi@brent:/scratch/ftp/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2001&quot; title=&quot;read operation is slow when mmap is enabled&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2001&quot;&gt;&lt;del&gt;LU-2001&lt;/del&gt;&lt;/a&gt;$ ll -h&lt;br/&gt;
total 56M&lt;br/&gt;
drwxr-xr-x  2 ftp mlocate  4.0K 2012-09-26 15:10 ./&lt;br/&gt;
d-wxrwx--- 36 ftp ipausers 4.0K 2012-09-26 15:09 ../&lt;br/&gt;
-rw-r--r--  1 ftp mlocate   20M 2012-09-26 15:09 centos5.8-lustre-1.8.8-IB-patch1.tar.gz&lt;br/&gt;
-rw-r--r--  1 ftp mlocate   18M 2012-09-26 15:09 centos5.8-lustre-1.8.8-IB.tar.gz&lt;br/&gt;
-rw-r--r--  1 ftp mlocate   19M 2012-09-26 15:10 centos6.3-lustre-1.8.8-IB.tar.gz&lt;/p&gt;

&lt;p&gt;Could you please upload the 2.1.3 log for me, if you have it. Thanks.&lt;/p&gt;</comment>
                            <comment id="45679" author="ihara" created="Thu, 27 Sep 2012 18:36:28 +0000"  >&lt;p&gt;Di, sorry, just uploaded centos6.3-lustre-2.1.3-IB.tar.gz&lt;/p&gt;

&lt;p&gt;btw, the customer environment is 1.8.x; we want a fix that brings native lustre-1.8.x up to the speed we are seeing with NFS on lustre-1.8.x.&lt;/p&gt;

&lt;p&gt;thank you very much!&lt;/p&gt;</comment>
                            <comment id="45693" author="di.wang" created="Thu, 27 Sep 2012 21:17:51 +0000"  >&lt;p&gt;Ihara, I just updated the patch: &lt;a href=&quot;http://review.whamcloud.com/#change,4097&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,4097&lt;/a&gt;. Could you please try this one and collect that information for me? Thanks.&lt;/p&gt;</comment>
                            <comment id="45779" author="ihara" created="Sat, 29 Sep 2012 23:32:46 +0000"  >&lt;p&gt;Di, I tested the new patch (patch set 4), but still no luck; the original behavior was better. I have uploaded the debug information and results to the same place. The application is open source, so I uploaded it as well in case it helps the analysis.&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="46038" author="di.wang" created="Fri, 5 Oct 2012 02:00:09 +0000"  >&lt;p&gt;Ihara, unfortunately I am busy with another urgent project these days, so I have not had time to look at your log yet. I will also take vacation next week, so my colleague will probably continue working on this problem. Could you please tell me how you ran LAST (exact command line, please)? Thanks.&lt;/p&gt;</comment>
                            <comment id="46067" author="ihara" created="Fri, 5 Oct 2012 16:38:20 +0000"  >&lt;p&gt;Hi WangDi, &lt;br/&gt;
Let me check if we can give you sample input files so that you can reproduce the problem.&lt;/p&gt;</comment>
                            <comment id="46522" author="ihara" created="Sat, 13 Oct 2012 23:35:14 +0000"  >&lt;p&gt;Hi, I uploaded all components (including the input files) under uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2001&quot; title=&quot;read operation is slow when mmap is enabled&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2001&quot;&gt;&lt;del&gt;LU-2001&lt;/del&gt;&lt;/a&gt;.&lt;br/&gt;
If you mount Lustre at /lustre on the client, you can just extract last.tar.bz2 onto it, then copy run.sh to /lustre and run it; that is exactly what we are doing.&lt;/p&gt;

&lt;p&gt;Please let me know if you have any questions.&lt;/p&gt;</comment>
                            <comment id="47237" author="ihara" created="Thu, 1 Nov 2012 03:19:25 +0000"  >&lt;p&gt;WangDi, did you have a chance to test LAST in your environment?&lt;/p&gt;</comment>
                            <comment id="47367" author="di.wang" created="Sat, 3 Nov 2012 23:44:20 +0000"  >&lt;p&gt;Hi, ihara&lt;/p&gt;

&lt;p&gt;Sorry, not yet. I have been quite busy with DNE since I got back from vacation. I will try to set aside some time next week. Btw: did you try this test in a local environment? Does that mean I have to set up a 30GB Lustre environment to repeat what you have done?&lt;/p&gt;</comment>
                            <comment id="47382" author="di.wang" created="Mon, 5 Nov 2012 01:08:57 +0000"  >&lt;p&gt;Hi, ihara&lt;/p&gt;

&lt;p&gt;Hmm, I just checked the patch-4 (centos5.8-lustre-1.8.8-IB-patch4) result. The RPC quality seems better:&lt;/p&gt;

&lt;p&gt;osc.lustre-OST0000-osc-ffff88106a6ab000.rpc_stats=&lt;br/&gt;
snapshot_time:            1348974722.320932 (secs.usecs)&lt;br/&gt;
read RPCs in flight:      0&lt;br/&gt;
write RPCs in flight:     0&lt;br/&gt;
dio read RPCs in flight:  0&lt;br/&gt;
dio write RPCs in flight: 0&lt;br/&gt;
pending write pages:      0&lt;br/&gt;
pending read pages:       0&lt;/p&gt;

&lt;p&gt;                        read                    write&lt;br/&gt;
pages per rpc         rpcs   % cum % |       rpcs   % cum %&lt;br/&gt;
1:                   46178  12  12   |          0   0   0&lt;br/&gt;
2:                   21499   5  18   |          0   0   0&lt;br/&gt;
4:                   26737   7  25   |          0   0   0&lt;br/&gt;
8:                   32963   9  34   |          0   0   0&lt;br/&gt;
16:                  41309  11  46   |          0   0   0&lt;br/&gt;
32:                  51106  14  60   |          0   0   0&lt;br/&gt;
64:                  52684  14  74   |          0   0   0&lt;br/&gt;
128:                 45786  12  87   |          4   0   0&lt;br/&gt;
256:                 45763  12 100   |       4350  99 100&lt;/p&gt;

&lt;p&gt;Here is the original result&lt;/p&gt;

&lt;p&gt;osc.lustre-OST0000-osc-ffff81107910cc00.rpc_stats=&lt;br/&gt;
snapshot_time:            1348683325.358237 (secs.usecs)&lt;br/&gt;
read RPCs in flight:      0&lt;br/&gt;
write RPCs in flight:     0&lt;br/&gt;
dio read RPCs in flight:  0&lt;br/&gt;
dio write RPCs in flight: 0&lt;br/&gt;
pending write pages:      0&lt;br/&gt;
pending read pages:       0&lt;/p&gt;

&lt;p&gt;                        read                    write&lt;br/&gt;
pages per rpc         rpcs   % cum % |       rpcs   % cum %&lt;br/&gt;
1:                 6490484  99  99   |          0   0   0&lt;br/&gt;
2:                   21527   0  99   |          0   0   0&lt;br/&gt;
4:                    4072   0  99   |          0   0   0&lt;br/&gt;
8:                    3115   0  99   |          0   0   0&lt;br/&gt;
16:                   2570   0  99   |          0   0   0&lt;br/&gt;
32:                   1454   0  99   |          1   0   0&lt;br/&gt;
64:                    548   0  99   |          0   0   0&lt;br/&gt;
128:                   195   0  99   |          5   0   0&lt;br/&gt;
256:                 16623   0 100   |      27792  99 100&lt;/p&gt;


&lt;p&gt;So why did you say &quot;the original behaviour was better&quot;? Please correct me if I missed something.&lt;/p&gt;</comment>
                            <comment id="47386" author="ihara" created="Mon, 5 Nov 2012 02:41:43 +0000"  >&lt;p&gt;WangDi,&lt;/p&gt;

&lt;p&gt;Thanks for your testing. Yes, I also saw better behavior in rpc_stats with patch4, but read performance dropped compared to running without the patch.&lt;/p&gt;

&lt;p&gt;Also, how long after the job started did you collect these statistics? My understanding is that once the job starts it reads files with mmap(), then computes something (with read()) and writes out files with write(). My statistics were taken after the mmap().&lt;/p&gt;</comment>
                            <comment id="47400" author="di.wang" created="Mon, 5 Nov 2012 12:49:34 +0000"  >&lt;p&gt;Oh, I did not run the test; I just looked at the results you posted here. I do not have an rhel5 environment locally. I will see what I can do here. Thanks.&lt;/p&gt;</comment>
                            <comment id="47501" author="di.wang" created="Tue, 6 Nov 2012 23:09:35 +0000"  >&lt;p&gt;Hi, Ihara&lt;/p&gt;

&lt;p&gt;Sorry, I cannot find a node to run the test locally; all my nodes here are rhel6. I investigated those uploaded logs again: is centos6.3-lustre-2.1.3-IB for lustre 2.1 or for lustre 2.1 + nfs? I saw a lot of read(), instead of mmap, in this log. Could you please check? Do you have a debug log for lustre 1.8 + nfs?&lt;/p&gt;</comment>
                            <comment id="61536" author="ihara" created="Sat, 29 Jun 2013 09:51:49 +0000"  >&lt;p&gt;comparing miss rates of lustre-2.4, lustre-1.8 and nfs over lustre-1.8.&lt;/p&gt;</comment>
                            <comment id="61537" author="ihara" created="Sat, 29 Jun 2013 09:58:11 +0000"  >&lt;p&gt;lustre-1.8&apos;s peak miss rates were much higher than those of lustre-2.x and nfs over lustre-1.8 while the application was running.&lt;br/&gt;
llap_shrink_cache_internal() is shrinking memory even when the page is still active. This is related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1878&quot; title=&quot;NULL pointer in ll_readahead()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1878&quot;&gt;&lt;del&gt;LU-1878&lt;/del&gt;&lt;/a&gt; (&lt;a href=&quot;http://review.whamcloud.com/#/c/4026&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/4026&lt;/a&gt;), but it does not solve our problem yet. We will file new patches soon.&lt;/p&gt;</comment>
                            <comment id="61538" author="lixi" created="Sat, 29 Jun 2013 10:38:40 +0000"  >&lt;p&gt;Here is the patch that makes the application much faster.&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/6826/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/6826/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="62664" author="lixi" created="Mon, 22 Jul 2013 01:51:32 +0000"  >&lt;p&gt;Again, the following patch makes the application run much faster:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/7064/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/7064/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="220464" author="jay" created="Thu, 8 Feb 2018 18:26:39 +0000"  >&lt;p&gt;close old tickets&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11923" name="debug-1.txt.gz" size="3315983" author="ihara" created="Tue, 25 Sep 2012 05:34:12 +0000"/>
                            <attachment id="11924" name="debug-2.txt.gz" size="3093353" author="ihara" created="Tue, 25 Sep 2012 05:34:12 +0000"/>
                            <attachment id="11925" name="debug-3.txt.gz" size="6112843" author="ihara" created="Tue, 25 Sep 2012 05:34:12 +0000"/>
                            <attachment id="11922" name="lustre-2.1.3-TCP-8thr.zip" size="2241" author="ihara" created="Tue, 25 Sep 2012 00:41:54 +0000"/>
                            <attachment id="11902" name="lustre.8.zip" size="12977" author="ihara" created="Sun, 23 Sep 2012 19:53:04 +0000"/>
                            <attachment id="13098" name="miss-rates.xlsx" size="36884" author="ihara" created="Sat, 29 Jun 2013 09:51:49 +0000"/>
                            <attachment id="11903" name="nfs_on_lustre.8.zip" size="13025" author="ihara" created="Sun, 23 Sep 2012 19:53:04 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 22 Jul 2013 07:11:00 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvuc7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8891</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 21 Sep 2012 07:11:00 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>