<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:02:10 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6663] DNE2 directories has very very bad performance</title>
                <link>https://jira.whamcloud.com/browse/LU-6663</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;With my DNE2 setup I attempted to start some jobs on our test cluster and the job got stuck for hours attempting to run. So I did testing to see what was breaking. A simple md5sum on files showed the problem very easily. For normal directories md5sum on a file came back very fast.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@ninja06 johndoe&amp;#93;&lt;/span&gt;# date;md5sum ior;date&lt;br/&gt;
Fri May 29 10:08:24 EDT 2015&lt;br/&gt;
4ba1b26f0a4b71dccb237d3fd25f3b67 ior&lt;br/&gt;
Fri May 29 10:08:24 EDT 2015&lt;/p&gt;

&lt;p&gt;But for DNE2 directories I saw this:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@ninja06 jsimmons&amp;#93;&lt;/span&gt;# date;md5sum simul;date&lt;br/&gt;
Fri May 29 10:08:38 EDT 2015&lt;br/&gt;
9fef8669fb0e6669ac646d69062521d3 simul&lt;br/&gt;
Fri May 29 10:09:59 EDT 2015&lt;/p&gt;
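As an editorial aside, the wall-clock gap between the two date stamps above can be checked directly; a minimal sketch:

```python
# Wall-clock elapsed time between the two date stamps quoted above
import datetime as dt

t0 = dt.datetime(2015, 5, 29, 10, 8, 38)   # before md5sum simul
t1 = dt.datetime(2015, 5, 29, 10, 9, 59)   # after md5sum simul
print((t1 - t0).seconds)  # prints 81: 81 seconds for one md5sum in a DNE2 directory
```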

&lt;p&gt;This is not an issue for stats, so it only appears when accessing data from a file. An ls -al on the file simul comes back very fast. I did this test without the work from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3534&quot; title=&quot;async update cross-MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3534&quot;&gt;&lt;del&gt;LU-3534&lt;/del&gt;&lt;/a&gt; and the problem still exists.&lt;/p&gt;</description>
                <environment>The problem only happens with the contents of a DNE2 directory</environment>
        <key id="30424">LU-6663</key>
            <summary>DNE2 directories has very very bad performance</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="simmonsja">James A Simmons</reporter>
                        <labels>
                            <label>dne2</label>
                    </labels>
                <created>Fri, 29 May 2015 15:25:46 +0000</created>
                <updated>Mon, 15 Jun 2015 20:25:33 +0000</updated>
                            <resolved>Thu, 11 Jun 2015 16:16:54 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>11</watches>
                                                                            <comments>
                            <comment id="116868" author="di.wang" created="Fri, 29 May 2015 16:45:27 +0000"  >&lt;p&gt;Hmm, I tried this on my local test node. Unfortunately, I can not reproduce it with 4 MDTs and 4 stripe_count. James, could you please tell me more about your system, how many MDTs and what is the stripe_count of the striped directory. And what is the size of simul and ior file. Thanks.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@testnode tests]# date; md5sum /mnt/lustre/test2/dd; date
Wed May 27 19:56:37 PDT 2015
d8b61b2c0025919d5321461045c8226f  /mnt/lustre/test2/dd
Wed May 27 19:56:38 PDT 2015
[root@testnode tests]# date; md5sum /mnt/lustre/test1/dd; date
Wed May 27 19:56:44 PDT 2015
d8b61b2c0025919d5321461045c8226f  /mnt/lustre/test1/dd
Wed May 27 19:56:45 PDT 2015
[root@testnode tests]# ../utils/lfs getdirstripe /mnt/lustre/test1
/mnt/lustre/test1
lmv_stripe_count: 4 lmv_stripe_offset: 0
mdtidx		 FID[seq:oid:ver]
     0		 [0x200000400:0x2:0x0]		
     1		 [0x240000403:0x2:0x0]		
     2		 [0x280000403:0x2:0x0]		
     3		 [0x2c0000401:0x2:0x0]		
[root@testnode tests]# ../utils/lfs getdirstripe /mnt/lustre/test2
/mnt/lustre/test2
lmv_stripe_count: 0 lmv_stripe_offset: 0
[root@testnode tests]# ll /mnt/lustre/test2/
total 512004
-rw-r--r-- 1 root root 524288000 May 27 19:56 dd
[root@testnode tests]# ll /mnt/lustre/test1/
total 512004
-rw-r--r-- 1 root root 524288000 May 27 19:56 dd
[root@testnode tests]# 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="116882" author="adilger" created="Fri, 29 May 2015 17:25:13 +0000"  >&lt;p&gt;James, can you run an starve to see where md5sum is spending its time?  Maybe also with +rpctrace debugging logs of strace isn&apos;t clear. Also, what back-end filesystem are you using for the MDTs and OSTs?&lt;/p&gt;</comment>
                            <comment id="116883" author="di.wang" created="Fri, 29 May 2015 17:32:02 +0000"  >&lt;p&gt;Hmm, I also checked RPC count after md5sum&lt;/p&gt;

&lt;p&gt;1000 1k files in this striped directory&lt;br/&gt;
(umount/mount before the test to clear the cache, and also echo 0 &amp;gt; /proc/*..../stats)&lt;br/&gt;
The RPC stats after &quot;md5sum *&quot;:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@testnode test1]# cat /proc/fs/lustre/mdc/lustre-MDT000*-mdc-*/stats
snapshot_time             1432783774.817424 secs.usecs
req_waittime              759 samples [usec] 33 311 81604 9966800
req_active                759 samples [reqs] 1 1 759 759
mds_close                 251 samples [usec] 33 311 18595 1595875
obd_ping                  4 samples [usec] 80 170 527 74209
snapshot_time             1432783774.817449 secs.usecs
req_waittime              750 samples [usec] 34 279 81081 10035485
req_active                750 samples [reqs] 1 1 750 750
mds_close                 249 samples [usec] 34 180 18120 1492930
obd_ping                  3 samples [usec] 59 169 363 50267
snapshot_time             1432783774.817466 secs.usecs
req_waittime              751 samples [usec] 33 294 81646 10083800
req_active                751 samples [reqs] 1 1 751 751
mds_close                 249 samples [usec] 33 153 18560 1536386
obd_ping                  2 samples [usec] 109 128 237 28265
snapshot_time             1432783774.817483 secs.usecs
req_waittime              755 samples [usec] 32 369 81239 10030651
req_active                755 samples [reqs] 1 1 755 755
mds_close                 251 samples [usec] 32 199 18483 1542053
obd_ping                  2 samples [usec] 175 189 364 66346
[root@testnode test1]# cat /proc/fs/lustre/osc/lustre-OST0001-osc-ffff8801e8e49400/stats 
snapshot_time             1432784238.566254 secs.usecs
req_waittime              1012 samples [usec] 36 424 96854 10257148
req_active                1012 samples [reqs] 1 1 1012 1012
read_bytes                499 samples [bytes] 896 896 447104 400605184
ost_read                  499 samples [usec] 42 253 48718 5248004
obd_ping                  14 samples [usec] 54 160 1551 186601
[root@testnode test1]# cat /proc/fs/lustre/osc/lustre-OST0000-osc-ffff8801e8e49400/stats 
snapshot_time             1432784253.55197 secs.usecs
req_waittime              1022 samples [usec] 22 288 96836 10097582
req_active                1022 samples [reqs] 1 1 1022 1022
read_bytes                501 samples [bytes] 896 896 448896 402210816
ost_read                  501 samples [usec] 41 288 47596 4964502
obd_ping                  20 samples [usec] 22 177 2295 305467
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here is the result for the non-striped directory:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@testnode test2]# cat /proc/fs/lustre/mdc/lustre-MDT000*-mdc-*/stats
snapshot_time             1432783875.323394 secs.usecs
req_waittime              3006 samples [usec] 33 483 341808 43798954
req_active                3006 samples [reqs] 1 1 3006 3006
mds_close                 1000 samples [usec] 33 270 87747 8322307
mds_readpage              1 samples [usec] 483 483 483 233289
obd_ping                  4 samples [usec] 64 118 372 36198
snapshot_time             1432783875.323418 secs.usecs
req_waittime              5 samples [usec] 29 113 386 34914
req_active                5 samples [reqs] 1 1 5 5
obd_ping                  5 samples [usec] 29 113 386 34914
snapshot_time             1432783875.323433 secs.usecs
req_waittime              4 samples [usec] 84 115 379 36499
req_active                4 samples [reqs] 1 1 4 4
obd_ping                  4 samples [usec] 84 115 379 36499
snapshot_time             1432783875.323449 secs.usecs
req_waittime              4 samples [usec] 22 198 503 83013
req_active                4 samples [reqs] 1 1 4 4
obd_ping                  4 samples [usec] 22 198 503 83013
[root@testnode test2]# cat /proc/fs/lustre/osc/lustre-OST0000-osc-ffff88019a235000/stats
snapshot_time             1432784328.527276 secs.usecs
req_waittime              1011 samples [usec] 30 349 91417 9311277
req_active                1011 samples [reqs] 1 1 1011 1011
read_bytes                500 samples [bytes] 896 896 448000 401408000
ost_read                  500 samples [usec] 44 349 44299 4465957
ost_connect               1 samples [usec] 109 109 109 11881
obd_ping                  10 samples [usec] 30 209 1155 159765
[root@testnode test2]# cat /proc/fs/lustre/osc/lustre-OST0001-osc-ffff88019a235000/stats
snapshot_time             1432784335.389316 secs.usecs
req_waittime              1012 samples [usec] 29 284 91225 9297035
req_active                1012 samples [reqs] 1 1 1012 1012
read_bytes                500 samples [bytes] 896 896 448000 401408000
ost_read                  500 samples [usec] 42 284 44763 4518767
ost_connect               1 samples [usec] 110 110 110 12100
obd_ping                  11 samples [usec] 29 248 1178 163494
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Nothing seems unusual. James, do the ior and simul files have the same data striping in the striped_directory and non-striped_directory cases? Also, if you can repeat the problem on the testnode, could you please collect the debug log (-1 level) on the client side for me? Thanks!&lt;/p&gt;</comment>
                            <comment id="116887" author="simmonsja" created="Fri, 29 May 2015 17:42:07 +0000"  >&lt;p&gt;Andreas I get the following strafe of md5sum ./simul.&lt;/p&gt;

&lt;p&gt;open(&quot;simul&quot;, O_RDONLY)                 = 3&lt;br/&gt;
fstat(3, {st_mode=S_IFREG|0755, st_size=6862991, ...}) = 0&lt;br/&gt;
mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fccf6689000&lt;br/&gt;
read(3, &quot;\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\2\0&amp;gt;\0\1\0\0\0004\16@\0\0\0\0\0&quot;..., 4194304) = 4194304&lt;br/&gt;
read(3, &quot;\0\0\2# \22%\17\0\0\10\353\21\t\0\0\2#(\22d\37\0\0\10\354;\t\0\0\2#&quot;..., 4194304) = 2668687&lt;br/&gt;
read(3, &quot;&quot;, 4194304)                    = 0&lt;br/&gt;
close(3)                                = 0&lt;br/&gt;
munmap(0x7fccf6689000, 4194304)         = 0&lt;/p&gt;

&lt;p&gt;It&apos;s the first read that takes so long.&lt;/p&gt;
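As a quick editorial sanity check on the trace above: the two non-empty read() return values sum exactly to the st_size reported by fstat, so the whole file is transferred in just two reads.

```python
# read() return values from the strace above (the final 0 marks EOF)
reads = [4194304, 2668687, 0]
print(sum(reads))  # prints 6862991, matching st_size from the fstat line
```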

&lt;p&gt;BTW, the second time you run this command there is no lag. So client-side caching hides the problem.&lt;/p&gt;</comment>
                            <comment id="116890" author="simmonsja" created="Fri, 29 May 2015 17:54:14 +0000"  >&lt;p&gt;Everything is the test system is running the same Lustre version which is master + DNE patches and a few other patches as well. With no patches the problem still exist. The back end servers are running ldiskfs. Their are 16 MDS servers each with one MDT. I have 4 OSS servers with a total of 56 OSTs. I created myself a home directory using lfs setdirstripe -c 16 which contain the simul binaries which I run md5sum on that binary. I have a few other binaries as well for testing purposes. Here is RPC stats&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@ninja06 ~&amp;#93;&lt;/span&gt;# cat /proc/fs/lustre/mdc/sultan-MDT0000-mdc-ffff8808041f1800/stats;md5sum /lustre/sultan/stf008/scratch/jsimmons/simul;cat /proc/fs/lustre/mdc/sultan-MDT0000-mdc-ffff8808041f1800/stats&lt;br/&gt;
snapshot_time             1432921773.276703 secs.usecs&lt;br/&gt;
req_waittime              595 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 55 16742 115819 318800537&lt;br/&gt;
req_active                595 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;reqs&amp;#93;&lt;/span&gt; 1 3 821 1279&lt;br/&gt;
mds_getattr               1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 241 241 241 58081&lt;br/&gt;
mds_close                 4 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 143 4111 4704 17023942&lt;br/&gt;
mds_readpage              4 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 833 926 3558 3170490&lt;br/&gt;
mds_connect               1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 16742 16742 16742 280294564&lt;br/&gt;
mds_getstatus             1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 280 280 280 78400&lt;br/&gt;
mds_statfs                1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 347 347 347 120409&lt;br/&gt;
ldlm_cancel               275 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 63 386 35964 5427452&lt;br/&gt;
obd_ping                  2 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 255 363 618 196794&lt;br/&gt;
fld_query                 16 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 55 211 2082 304252&lt;br/&gt;
9fef8669fb0e6669ac646d69062521d3  /lustre/sultan/stf008/scratch/jsimmons/simul&lt;br/&gt;
snapshot_time             1432921837.998247 secs.usecs&lt;br/&gt;
req_waittime              597 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 55 16742 116370 318952518&lt;br/&gt;
req_active                597 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;reqs&amp;#93;&lt;/span&gt; 1 3 823 1281&lt;br/&gt;
mds_getattr               1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 241 241 241 58081&lt;br/&gt;
mds_close                 4 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 143 4111 4704 17023942&lt;br/&gt;
mds_readpage              4 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 833 926 3558 3170490&lt;br/&gt;
mds_connect               1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 16742 16742 16742 280294564&lt;br/&gt;
mds_getstatus             1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 280 280 280 78400&lt;br/&gt;
mds_statfs                1 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 347 347 347 120409&lt;br/&gt;
ldlm_cancel               276 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 63 386 36230 5498208&lt;br/&gt;
obd_ping                  2 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 255 363 618 196794&lt;br/&gt;
fld_query                 16 samples &lt;span class=&quot;error&quot;&gt;&amp;#91;usec&amp;#93;&lt;/span&gt; 55 211 2082 304252&lt;/p&gt;</comment>
                            <comment id="116892" author="simmonsja" created="Fri, 29 May 2015 18:09:21 +0000"  >&lt;p&gt;I uploaded full debug logs from the client to ftp.whamcloud.com//uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-log.&lt;/p&gt;</comment>
                            <comment id="116894" author="di.wang" created="Fri, 29 May 2015 18:44:20 +0000"  >&lt;p&gt;Thanks James, I just glanced debug log a bit, it seems to me all of the time costs is in CLIO stack. I do not understand why this only happens under striped directory. James do you have the debug logs for non-striped directory?  I wonder if this is related with the recent CLIO simplification changes. I will check the log carefully. &lt;/p&gt;</comment>
                            <comment id="116908" author="simmonsja" created="Fri, 29 May 2015 20:18:14 +0000"  >&lt;p&gt;Yes, I just uploaded &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-log-nonstripe to the same directory on ftp.whamcloud.com&lt;/p&gt;</comment>
                            <comment id="116918" author="di.wang" created="Fri, 29 May 2015 20:51:32 +0000"  >&lt;p&gt;Just checked the debug log, here is what happens.&lt;/p&gt;

&lt;p&gt;It is file_read that costs most of the time:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000080:00200000:0.0:1432922231.938967:0:14593:0:(file.c:1088:ll_file_io_generic()) file: simul, type: 0 ppos: 0, count: 4194304
...
00000080:00200000:0.0:1432922285.038890:0:14593:0:(file.c:1196:ll_file_io_generic()) iot: 0, result: 4194304
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;So it takes about 54 seconds to read this 4MB file.&lt;/p&gt;
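For reference, the elapsed time follows directly from the two ll_file_io_generic timestamps quoted above; a minimal check:

```python
# Timestamps (seconds.microseconds) from the two log lines above
start = 1432922231.938967  # "file: simul, type: 0 ppos: 0, count: 4194304"
end = 1432922285.038890    # "iot: 0, result: 4194304"
print(round(end - start, 1))  # prints 53.1: about 53 s for a 4 MB read
```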

&lt;p&gt;And inside this read, it sent out 7 read RPCs; let&apos;s look at the first 4 of them (the other 3 are read-ahead RPCs). The RPC size seems good.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000008:00000002:5.0:1432922231.992289:0:21013:0:(osc_request.c:1829:osc_build_rpc()) @@@ 256 pages, aa ffff8810520764e0. now 1r/0w in flight  req@ffff881052076380 x1502512486180268/t0(0) o3-&amp;gt;sultan-OST0035-osc-ffff8807f7983c00@10.37.248.70@o2ib1:6/4 lens 488/432 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1

00000008:00000002:2.0:1432922231.998420:0:21015:0:(osc_request.c:1829:osc_build_rpc()) @@@ 256 pages, aa ffff8807f24e9de0. now 1r/0w in flight  req@ffff8807f24e9c80 x1502512486180276/t0(0) o3-&amp;gt;sultan-OST0036-osc-ffff8807f7983c00@10.37.248.71@o2ib1:6/4 lens 488/432 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1

00000008:00000002:2.0:1432922232.004918:0:21015:0:(osc_request.c:1829:osc_build_rpc()) @@@ 256 pages, aa ffff8807f24e9ae0. now 1r/0w in flight  req@ffff8807f24e9980 x1502512486180284/t0(0) o3-&amp;gt;sultan-OST0037-osc-ffff8807f7983c00@10.37.248.72@o2ib1:6/4 lens 488/432 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1

00000008:00000002:6.0:1432922232.011237:0:21017:0:(osc_request.c:1829:osc_build_rpc()) @@@ 256 pages, aa ffff8810553fade0. now 1r/0w in flight  req@ffff8810553fac80 x1502512486180292/t0(0) o3-&amp;gt;sultan-OST0000-osc-ffff8807f7983c00@10.37.248.69@o2ib1:6/4 lens 488/432 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It seems only the 1st one gets a reply in time:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00000040:4.0:1432922231.994268:0:21016:0:(lustre_net.h:2439:ptlrpc_rqphase_move()) @@@ move req &quot;Bulk&quot; -&amp;gt; &quot;Interpret&quot;  req@ffff881052076380 x1502512486180268/t0(0) o3-&amp;gt;sultan-OST0035-osc-ffff8807f7983c00@10.37.248.70@o2ib1:6/4 lens 488/400 e 0 to 0 dl 1432922245 ref 2 fl Bulk:RM/0/0 rc 1048576/1048576
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;The other 3 RPCs all get an early reply and then get their pages after around 27 seconds. Here is one example:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00001000:5.0:1432922239.998446:0:21013:0:(client.c:468:ptlrpc_at_recv_early_reply()) @@@ Early reply #1, new deadline in 56s (50s)  req@ffff8807f24e9c80 x1502512486180276/t0(0) o3-&amp;gt;sultan-OST0036-osc-ffff8807f7983c00@10.37.248.71@o2ib1:6/4 lens 488/432 e 1 to 0 dl 1432922295 ref 2 fl Rpc:/0/ffffffff rc 0/-1

00000100:00000040:4.0:1432922258.003100:0:21013:0:(lustre_net.h:2439:ptlrpc_rqphase_move()) @@@ move req &quot;Bulk&quot; -&amp;gt; &quot;Interpret&quot;  req@ffff8807f24e9c80 x1502512486180276/t0(0) o3-&amp;gt;sultan-OST0036-osc-ffff8807f7983c00@10.37.248.71@o2ib1:6/4 lens 488/400 e 1 to 0 dl 1432922295 ref 2 fl Bulk:RM/0/0 rc 1048576/1048576
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
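The stall is visible directly in the timestamps for xid x1502512486180276: the request was built at 1432922231.998420 (osc_build_rpc, earlier excerpt) and only moved from Bulk to Interpret at 1432922258.003100. A minimal check:

```python
# Build vs. completion timestamps for request x1502512486180276
sent = 1432922231.998420  # osc_build_rpc
done = 1432922258.003100  # move req "Bulk" to "Interpret"
print(round(done - sent, 1))  # prints 26.0: roughly the 27 s noted above
```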

&lt;p&gt;I guess something is going wrong either on the OST side or in the network.&lt;/p&gt;

&lt;p&gt;James, is there anything special about these nodes: 10.37.248.71, 10.37.248.72, 10.37.248.69? Thanks&lt;/p&gt;</comment>
                            <comment id="116923" author="di.wang" created="Fri, 29 May 2015 21:16:57 +0000"  >&lt;p&gt;Interesting, I checked the debug log for non_striped directory.  And the RPC size and stats are quite similar as striped dir case, but every RPC gets reply immediately&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000008:00000002:1.0:1432929764.067930:0:21017:0:(osc_request.c:1829:osc_build_rpc()) @@@ 256 pages, aa ffff880871aef1e0. now 1r/0w in flight  req@ffff880871aef080 x1502512486987692/t0(0) o3-&amp;gt;sultan-OST001a-osc-ffff880804b97800@10.37.248.71@o2ib1:6/4 lens 488/432 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1

....

00000100:00000040:0.0:1432929764.082631:0:21011:0:(lustre_net.h:2439:ptlrpc_rqphase_move()) @@@ move req &quot;Bulk&quot; -&amp;gt; &quot;Interpret&quot;  req@ffff880871aef080 x1502512486987708/t0(0) o3-&amp;gt;sultan-OST001c-osc-ffff880804b97800@10.37.248.69@o2ib1:6/4 lens 488/400 e 0 to 0 dl 1432929778 ref 2 fl Bulk:RM/0/0 rc 1048576/1048576
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;James, is this issue easily reproduced on your testnode? Could you please send me both the client and OSS debug logs? Thanks&lt;/p&gt;</comment>
                            <comment id="116925" author="adilger" created="Fri, 29 May 2015 21:25:35 +0000"  >&lt;p&gt;Di, James, one thing to consider is if the FID for the OST object is potentially causing some problem?  That would be something unique to DNE - that the OST objects for files created on MDT000x, x != 0, use FID-on-OST and objects created on MDT0000 use old-style IDIF FIDs.  &lt;/p&gt;

&lt;p&gt;That said, it is strange to see this only on James&apos; system, but possibly it relates to the number of MDTs * OSTs causing the FLDB to be very large or something?&lt;/p&gt;</comment>
                            <comment id="116936" author="di.wang" created="Fri, 29 May 2015 22:06:46 +0000"  >&lt;p&gt;Hmm, that is true, for striped directory case, the FID on OST object  are  real FIDs.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;res: [0x138000040b:0xe2:0x0].0
res: [0x1340000404:0xe2:0x0].0
.......
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the non-striped directory case, the FIDs are IDIFs:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[0x2:0x0:0x0].0
.....
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But there are FLD lookups in the debug log. It might be interesting to see what happens on the OST for real FIDs.&lt;/p&gt;

&lt;p&gt;James: An easy way to verify this would be to create a remote directory on MDTn (n &amp;#33;= 0), then create the file under the remote directory, which will allocate regular FIDs for the file, and see if the problem can be reproduced. Also, if you can get an OST debug log for me, that would be helpful. Thanks!&lt;/p&gt;

</comment>
                            <comment id="116938" author="di.wang" created="Fri, 29 May 2015 22:21:30 +0000"  >&lt;p&gt;Andreas: I just check the debug log, it seems only the first read RPC send to the each OST are slow. So maybe you are right, it is because of locating the real FID object is  slow on OST. Hmm.&lt;/p&gt;</comment>
                            <comment id="116943" author="adilger" created="Fri, 29 May 2015 23:01:28 +0000"  >&lt;p&gt;Di, maybe something about loading the O/&lt;/p&gt;
{seq}
&lt;p&gt; object directories is doing too much work?&lt;/p&gt;</comment>
                            <comment id="116947" author="di.wang" created="Fri, 29 May 2015 23:34:16 +0000"  >&lt;p&gt;Andreas: hmm, loading a new sequence does need extra work, which includes about 33 directory lookup and 32 xatt_set. But I assume OST is not busy during the test (James, please confirm ?), so it may not cost about 28 seconds. Let&apos;s see what debug log say. &lt;/p&gt;</comment>
                            <comment id="116950" author="simmonsja" created="Sat, 30 May 2015 00:04:49 +0000"  >&lt;p&gt;The only activating on the OSTs is the retrieve of the simul executable file raw data to calculate the md5sum. So it is quiet system at the time.&lt;/p&gt;</comment>
                            <comment id="117063" author="di.wang" created="Mon, 1 Jun 2015 16:12:07 +0000"  >&lt;p&gt;James: Any news on the OST side debug log? Thanks!&lt;/p&gt;</comment>
                            <comment id="117064" author="simmonsja" created="Mon, 1 Jun 2015 16:28:54 +0000"  >&lt;p&gt;I just finished the DNE1 directory test. I created a empty file in the DNE1 directory using MDS 4 and then remounted lustre on the client. Then I attempted an md5sum on the new empty file which after 3 minutes never finished so I stop the md5sum command and did a log dump. I pushed the log to ftp.whamcloud.com/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-dne1.log. Now I&apos;m moving on to getting the OSS logs.&lt;/p&gt;

&lt;p&gt;Also, as a side note, I found that unmounting the Lustre client did not finish either. I had to reboot the node.&lt;/p&gt;</comment>
                            <comment id="117069" author="simmonsja" created="Mon, 1 Jun 2015 17:17:33 +0000"  >&lt;p&gt;I&apos;m seeing this error on the OSS.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@sultan-oss1 ~&amp;#93;&lt;/span&gt;# dmesg&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;246246.570179&amp;#93;&lt;/span&gt; LustreError: 23969:0:(events.c:447:server_bulk_callback()) event type 5, status -38, desc ffff880f43554000&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;246289.576691&amp;#93;&lt;/span&gt; LNet: 23977:0:(o2iblnd_cb.c:410:kiblnd_handle_rx()) PUT_NACK from 10.37.202.61@o2ib1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;246289.585754&amp;#93;&lt;/span&gt; LNet: 23977:0:(o2iblnd_cb.c:410:kiblnd_handle_rx()) Skipped 10 previous similar messages&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;246289.605663&amp;#93;&lt;/span&gt; LustreError: 23977:0:(events.c:447:server_bulk_callback()) event type 5, status -38, desc ffff880f4a522000&lt;/p&gt;

&lt;p&gt;I uploaded the OSS logs to ftp.whamcloud.com/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-OSS.log&lt;/p&gt;</comment>
                            <comment id="117107" author="di.wang" created="Mon, 1 Jun 2015 20:56:21 +0000"  >&lt;p&gt;Hmm, I found this in the debug log&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00000001:10.0:1433178825.894370:0:28309:0:(ldlm_lib.c:2683:target_bulk_timeout()) Process entered
00010000:00000001:10.0:1433178825.894375:0:28309:0:(ldlm_lib.c:2687:target_bulk_timeout()) Process leaving (rc=1 : 1 : 1)
00010000:00020000:10.0:1433178825.894381:0:28309:0:(ldlm_lib.c:2771:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff880ff94283c0 x1502435549823124/t0(0) o3-&amp;gt;b9cf5051-0ff9-6cf9-cd67-9364a2516176@30@gni1:86/0 lens 488/432 e 0 to 0 dl 1433178836 ref 1 fl Interpret:/2/0 rc 0/0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Is there anything special about the connection between sultan-OST0034 and 30@gni1?&lt;/p&gt;</comment>
                            <comment id="117119" author="di.wang" created="Tue, 2 Jun 2015 01:15:09 +0000"  >&lt;p&gt;Unfortunately, except the timeout bulk IO failures, I can not see other problem from the debug log. So it is quite possible this is caused by some problem in LNET. Hmm, DNE2 patches did not touch some of LNET code in this patch&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#/c/12525&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12525&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So it may be related to this change. Unfortunately, there is no corresponding client debug log from when the bulk IO timeout happens on the OST side.&lt;br/&gt;
&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-dne1.log is too early. &lt;/p&gt;

&lt;p&gt;James: if this is repeatable in your environment, could you please capture the client-side log when the timeout happens on the server side?&lt;/p&gt;</comment>
                            <comment id="117125" author="simmonsja" created="Tue, 2 Jun 2015 03:46:59 +0000"  >&lt;p&gt;The &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-dne1.log is the client side log for when the OSS times out.&lt;/p&gt;</comment>
                            <comment id="117128" author="di.wang" created="Tue, 2 Jun 2015 07:10:39 +0000"  >&lt;p&gt; &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-dne1.log seems to be from too early. According to the debug log, this is what happens when the timeout occurs on the OSS:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:02000400:10.0:1433178825.928974:0:28309:0:(tgt_handler.c:1834:tgt_brw_read()) sultan-OST0034: Bulk IO read error with b9cf5051-0ff9-6cf9-cd67-9364a2516176 (at 30@gni1), client will retry: rc -110
00000020:00000001:10.0:1433178825.947041:0:28309:0:(tgt_handler.c:1851:tgt_brw_read()) Process leaving (rc=18446744073709551506 : -110 : ffffffffffffff92)
00010000:00000080:10.0:1433178825.947043:0:28309:0:(ldlm_lib.c:2427:target_committed_to_req()) @@@ not sending last_committed update (0/1)  req@ffff880ff94283c0 x1502435549823124/t0(0) o3-&amp;gt;b9cf5051-0ff9-6cf9-cd67-9364a2516176@30@gni1:86/0 lens 488/432 e 0 to 0 dl 1433178836 ref 1 fl Interpret:/2/ffffffff rc -110/-1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;And I cannot find this 1502435549823 req on the client side. Also, if the clocks on each node were synchronized before the test, don't the timestamps in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-dne1.log seem much earlier than those in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;-OSS.log?&lt;/p&gt;</comment>
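[Editor's note] The timestamp mismatch being discussed here can be checked mechanically rather than by eye. A minimal sketch, assuming the captures are available as lists of text lines (the helper name `time_range` is illustrative, not Lustre tooling): every Lustre debug-log line carries its wall-clock time as the fourth colon-separated field.

```python
import re

# A Lustre debug-log line begins with colon-separated fields; the fourth is
# the wall-clock timestamp, e.g.
#   00010000:00000001:10.0:1433178825.894370:0:28309:0:(ldlm_lib.c:...)
LINE_RE = re.compile(r"^[0-9a-f]+:[0-9a-f]+:[^:]+:(\d+\.\d+):")

def time_range(lines):
    """Return the (first, last) timestamps covered by a log capture,
    or None if no line matched the expected format."""
    stamps = [float(m.group(1))
              for line in lines
              if (m := LINE_RE.match(line)) is not None]
    return (min(stamps), max(stamps)) if stamps else None
```

If the range of the client capture ends before the range of the OSS capture begins, the two dumps do not overlap in time and the client log cannot contain the request seen on the OSS, which is exactly the situation suspected above.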
                            <comment id="117158" author="simmonsja" created="Tue, 2 Jun 2015 15:32:28 +0000"  >&lt;p&gt;Yes, I see what you mean. I ran the same test twice: I collected logs from the client first, then from the OSS. I guess you want me to collect data from both at the same time.&lt;/p&gt;</comment>
                            <comment id="117164" author="di.wang" created="Tue, 2 Jun 2015 16:55:08 +0000"  >&lt;p&gt;Yes, please. I want to see if this is related to the change in &lt;a href=&quot;http://review.whamcloud.com/#/c/12525&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12525&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="117227" author="doug" created="Tue, 2 Jun 2015 22:31:12 +0000"  >&lt;p&gt;James: Is this system running GNI?  I&apos;m wondering if there is something in patch &lt;a href=&quot;http://review.whamcloud.com/#/c/12525&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12525&lt;/a&gt; which is not compatible with GNI operation.  Is GNI using RDMA reads now for bulk transfers (IB only uses RDMA writes)?&lt;/p&gt;</comment>
                            <comment id="117236" author="simmonsja" created="Wed, 3 Jun 2015 00:13:37 +0000"  >&lt;p&gt;The test client and server are both InfiniBand. Originally I tested on both GNI and o2ib, but when I saw problems I moved to testing only InfiniBand to make sure it wasn&apos;t a GNI issue. All the logs I have posted here are from InfiniBand-only systems.&lt;/p&gt;</comment>
                            <comment id="117237" author="di.wang" created="Wed, 3 Jun 2015 01:01:22 +0000"  >&lt;p&gt;James: but the request that caused the OST timeout seems to come from a GNI interface:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:02000400:10.0:1433178825.928974:0:28309:0:(tgt_handler.c:1834:tgt_brw_read()) sultan-OST0034: Bulk IO read error with b9cf5051-0ff9-6cf9-cd67-9364a2516176 (at 30@gni1), client will retry: rc -110
00000020:00000001:10.0:1433178825.947041:0:28309:0:(tgt_handler.c:1851:tgt_brw_read()) Process leaving (rc=18446744073709551506 : -110 : ffffffffffffff92)
00010000:00000080:10.0:1433178825.947043:0:28309:0:(ldlm_lib.c:2427:target_committed_to_req()) @@@ not sending last_committed update (0/1)  req@ffff880ff94283c0 x1502435549823124/t0(0) o3-&amp;gt;b9cf5051-0ff9-6cf9-cd67-9364a2516176@30@gni1:86/0 lens 488/432 e 0 to 0 dl 1433178836 ref 1 fl Interpret:/2/ffffffff rc -110/-1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The NID is 30@gni1.&lt;/p&gt;</comment>
                            <comment id="117307" author="simmonsja" created="Wed, 3 Jun 2015 17:06:36 +0000"  >&lt;p&gt;Ah yes, one of the service nodes failed to unmount. This time we killed off that node so a clean run could be done, and the same problem still exists. I uploaded both the client and OSS logs to ftp.whamcloud.com/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;/&lt;span class=&quot;error&quot;&gt;&amp;#91;client|oss&amp;#93;&lt;/span&gt;-dump-june-3.log. No GNI noise in the logs this time.&lt;/p&gt;</comment>
                            <comment id="117318" author="simmonsja" created="Wed, 3 Jun 2015 17:50:03 +0000"  >&lt;p&gt;Your ftp site is down. The OSS log transfer is not complete.&lt;/p&gt;</comment>
                            <comment id="117346" author="di.wang" created="Wed, 3 Jun 2015 19:59:10 +0000"  >&lt;p&gt;I think it is working again. Hmm, I see two files there:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;-rw-r--r-- 1 nobody ftp  34M Jun  3 09:58 client-dump-june-3.log
-rw-r--r-- 1 nobody ftp 485M Jun  3 10:16 oss-dump-june-3.log
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let me download them to see what is in them. You can keep uploading the OSS log. Thanks.&lt;/p&gt;</comment>
                            <comment id="117352" author="simmonsja" created="Wed, 3 Jun 2015 20:36:49 +0000"  >&lt;p&gt;That looks about right for the logs. A few MB might be missing from the OSS log but that should be okay.&lt;/p&gt;</comment>
                            <comment id="117357" author="di.wang" created="Wed, 3 Jun 2015 21:22:30 +0000"  >&lt;p&gt;James: what are the IP addresses of the OSS and of the client where you collected the debug log? And what is the stripe layout of the file being read? Could you please post the stripe information here? Thanks.&lt;/p&gt;</comment>
                            <comment id="117362" author="simmonsja" created="Wed, 3 Jun 2015 21:58:48 +0000"  >&lt;p&gt;Client : 10.37.248.124@o2ib1&lt;br/&gt;
OSS : 10.37.248.69@o2ib1&lt;/p&gt;

&lt;p&gt;As you can tell, the file is using MDT 4:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@ninja06 ~&amp;#93;&lt;/span&gt;# lfs getstripe /lustre/sultan/stf008/scratch/jsimmons/test_mdt4/#test-dir.0/file.1 &lt;br/&gt;
/lustre/sultan/stf008/scratch/jsimmons/test_mdt4/#test-dir.0/file.1&lt;br/&gt;
lmm_stripe_count:   4&lt;br/&gt;
lmm_stripe_size:    1048576&lt;br/&gt;
lmm_pattern:        1&lt;br/&gt;
lmm_layout_gen:     0&lt;br/&gt;
lmm_stripe_offset:  53&lt;br/&gt;
        obdidx           objid           objid           group&lt;br/&gt;
            53             354          0x162     0x134000040e&lt;br/&gt;
            54             354          0x162     0x138000040a&lt;br/&gt;
            55             354          0x162     0x13c000040a&lt;br/&gt;
             0             354          0x162      0x600000407&lt;/p&gt;</comment>
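[Editor's note] For reference, the object placement implied by this layout (lmm_pattern 1 is plain RAID-0 striping over the listed obdidx values) can be sketched as follows; `stripe_of` is an illustrative helper, not Lustre code.

```python
def stripe_of(offset, stripe_size, obdidx_list):
    """Map a file byte offset to (OST index, byte offset inside that OST
    object) under plain RAID-0 striping (lmm_pattern 1)."""
    stripe_no = offset // stripe_size                # which stripe-sized chunk
    ost = obdidx_list[stripe_no % len(obdidx_list)]  # round-robin over OSTs
    obj_off = (stripe_no // len(obdidx_list)) * stripe_size \
        + offset % stripe_size
    return ost, obj_off

# Layout reported by lfs getstripe above: 1 MB stripes over obdidx 53, 54, 55, 0
layout = [53, 54, 55, 0]
stripe_of(0, 1048576, layout)            # the first 1 MB lives on obdidx 53
stripe_of(3 * 1048576, 1048576, layout)  # the fourth 1 MB lives on obdidx 0
```

This mapping is why a single-threaded read of the file touches four different OSTs in turn.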
                            <comment id="117370" author="di.wang" created="Thu, 4 Jun 2015 00:28:52 +0000"  >&lt;p&gt;According to the debug log, the file-read thread seems to be waiting to get some pages.&lt;/p&gt;

&lt;p&gt;The debug log starts at 1433350220.121884, but the read thread does not show up until 1433350251.100439:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:00000001:2.0:1433350251.100439:0:22229:0:(cl_io.c:866:cl_page_list_add()) Process entered
...
00000080:00000001:2.0:1433350251.128153:0:22229:0:(file.c:1353:ll_file_read()) Process leaving (rc=4194304 : 4194304 : 400000)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Before the read thread shows up (from 1433350220.121884 to 1433350251.100439), the ptlrpc threads seem to be reading pages from different stripes: they issued 4 x 1M RPCs to OST0000, 8 x 1M RPCs to OST0036, 8 x 1M RPCs to OST0037, and 7 x 1M RPCs to OST0035. So read-ahead probably went badly, and the&lt;br/&gt;
read thread did not get the page it needed, so it waited for a long time. Unfortunately, there is not enough log to find out the real reason. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; &lt;/p&gt;

&lt;p&gt;I also checked the debug log on the OSS side; the longest RPC handling time seems to be 43902us, which is still far less than 1 second:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00100000:9.0:1433350292.704085:0:24865:0:(service.c:2125:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io02_011:18eebf31-d253-c2e2-a426-9fd09c02ffc4+9:9454:x1502877014836248:12345-10.37.248.130@o2ib1:3 Request procesed in 43821us (43902us total) trans 0 rc 1048576/1048576
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 
&lt;p&gt;So the OSS seems fine this time.&lt;/p&gt;

&lt;p&gt;James: could you please run the test with full debug except the trace level on the client side, so we can get more log to see what happens there? Anyway, I still cannot figure out why this only happens under a striped directory. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;br/&gt;
lctl set_param debug=-1&lt;br/&gt;
lctl set_param debug=-trace&lt;/p&gt;</comment>
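[Editor's note] The "Handled RPC" lines quoted above carry the per-request service time, so the slowest RPCs in an OSS capture can be ranked instead of eyeballed. A small sketch, assuming the capture is a list of text lines (the helper name `slowest_rpcs` is illustrative; "procesed" is the actual spelling emitted by the log):

```python
import re

# ptlrpc "Handled RPC" lines end with "Request procesed in Nus (Mus total)";
# M is the total service time for that request in microseconds.
HANDLED_RE = re.compile(r"procesed in (\d+)us \((\d+)us total\)")

def slowest_rpcs(lines, n=5):
    """Return up to n (total_us, line) pairs, slowest first."""
    hits = [(int(m.group(2)), line)
            for line in lines
            if (m := HANDLED_RE.search(line)) is not None]
    return sorted(hits, reverse=True)[:n]
```

Applied to the OSS dump above, the top entry is the 43902us request, confirming that no single RPC on the OSS accounts for the minute-plus client-side stall.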
                            <comment id="117375" author="simmonsja" created="Thu, 4 Jun 2015 01:09:49 +0000"  >&lt;p&gt;I placed new log file client-dump-june-3-no-trace.log which has trace disabled in the usual spot.&lt;/p&gt;</comment>
                            <comment id="117387" author="di.wang" created="Thu, 4 Jun 2015 04:54:43 +0000"  >&lt;p&gt;James: thanks. But the read RPCs are missing from the debug log; or was it captured too late? I did not see any read RPC in the debug log.&lt;br/&gt;
Did you set the debug level to&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl set_param debug=all-trace
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;file.1 seems to be on MDT0007; is this the same file as the MDT4 one (stripe_count = 4)? And its size is more than 7G?&lt;/p&gt;

&lt;p&gt;Also, there seems to be a dead router in your system, and I am not sure how it will impact performance:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000800:00000100:3.0:1433379860.351507:0:2845:0:(o2iblnd_cb.c:2108:kiblnd_peer_connect_failed()) Deleting messages for 10.37.202.59@o2ib1: connection failed

00000400:00000200:2.0:1433379885.351487:0:20110:0:(router.c:1018:lnet_ping_router_locked()) rtr 10.37.202.59@o2ib1 50: deadline 4742051025 ping_notsent 1 alive 0 alive_count 1 lp_ping_timestamp 4742001025
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


</comment>
                            <comment id="117437" author="simmonsja" created="Thu, 4 Jun 2015 14:46:51 +0000"  >&lt;p&gt;That is not right; file.1 is an empty file in a DNE1 directory. Yes, its OST stripe count is 4. I have had that dead router for ages :-/ Yes, the debug level is set as you asked, and I ran the debug collection right after I started the md5sum on file.1.&lt;/p&gt;</comment>
                            <comment id="117439" author="simmonsja" created="Thu, 4 Jun 2015 14:57:13 +0000"  >&lt;p&gt;Okay, I have no idea where that data came from, but file.1 is supposed to be empty.&lt;/p&gt;</comment>
                            <comment id="117470" author="di.wang" created="Thu, 4 Jun 2015 17:48:32 +0000"  >&lt;p&gt;Hmm, interesting. Is the data still there, or is it just that the size of file.1 is 7G? I definitely see a lot of file reads:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000080:00200000:0.0:1433379889.075253:0:2403:0:(file.c:1088:ll_file_io_generic()) file: file.1, type: 0 ppos: 7876902912, count: 4194304
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt; 

&lt;p&gt;Also, on the OST side, I saw that there is real data for the object:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000001:00001000:13.0:1433350227.207891:0:3039:0:(osd_io.c:830:osd_ldiskfs_map_ext_inode_pages()) inode 4800: map 256 pages from 37376
00000001:00001000:13.0:1433350227.207893:0:3039:0:(osd_io.c:774:osd_ldiskfs_map_nblocks()) blocks 37376-37631 requested for inode 4800
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Inode 4800 should be the object (0 354 0x162 0x600000407) on OST0000. Hmm.&lt;/p&gt;

</comment>
                            <comment id="117484" author="di.wang" created="Thu, 4 Jun 2015 18:36:57 +0000"  >&lt;p&gt;James: could you please mount OST0000 as ldiskfs, look under /mnt/ost0/O/600000407 to see if there is a file whose inode number is 4800, and stat that file?&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;find /mnt/ost0/O/600000407 | xargs ls -i  | grep 4800

stat that file.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I want to know whether the OI mapping for the FID on the OST is correct. Thanks.&lt;/p&gt;
</comment>
                            <comment id="117488" author="simmonsja" created="Thu, 4 Jun 2015 19:30:12 +0000"  >&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@sultan-oss1 d2&amp;#93;&lt;/span&gt;# pwd&lt;br/&gt;
/mnt/O/600000407/d2&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@sultan-oss1 d2&amp;#93;&lt;/span&gt;# stat 162&lt;br/&gt;
  File: `162&apos;&lt;br/&gt;
  Size: 4125097984      Blocks: 8056840    IO Block: 4096   regular file&lt;br/&gt;
Device: fd07h/64775d    Inode: 4800        Links: 1&lt;br/&gt;
Access: (0666/-rw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)&lt;br/&gt;
Access: 1969-12-31 19:00:00.000000000 -0500&lt;br/&gt;
Modify: 2015-05-29 21:23:35.000000000 -0400&lt;br/&gt;
Change: 2015-05-29 21:23:35.000000000 -0400&lt;/p&gt;</comment>
                            <comment id="117491" author="di.wang" created="Thu, 4 Jun 2015 19:37:34 +0000"  >&lt;p&gt;James: thanks. I will check whether something in the DNE2 patches could fill an empty object on the OST; very strange. In the meantime, could you please check if this is repeatable in your environment? I.e. create an empty file in a remote directory, then run md5sum on it to see if the empty object gets filled. Is there anything else you did to file.1?&lt;/p&gt;</comment>
                            <comment id="117494" author="di.wang" created="Thu, 4 Jun 2015 20:44:34 +0000"  >&lt;p&gt;James: is this a newly formatted system, or upgraded (from 2.4)? By the way, is the data in file.1 garbage, or does it look like data from another file?&lt;/p&gt;</comment>
                            <comment id="117507" author="simmonsja" created="Thu, 4 Jun 2015 22:05:42 +0000"  >&lt;p&gt;It is a 2.5-formatted file system. file.1 looks like garbage. Now, I could have done something to file.1 :-/ I&apos;m going to try our Cray system tomorrow with a vanilla 2.7.54 to see if the problem still exists.&lt;/p&gt;</comment>
                            <comment id="117520" author="simmonsja" created="Fri, 5 Jun 2015 00:07:49 +0000"  >&lt;p&gt;Can you list the remaining patches for DNE2 and the order they need to be applied in? Thanks.&lt;/p&gt;</comment>
                            <comment id="117610" author="simmonsja" created="Fri, 5 Jun 2015 18:39:25 +0000"  >&lt;p&gt;This morning I updated to the latest vanilla master and ended up in a state where I could not mount the file system. So I tried migrating back to Lustre 2.5, and when I attempted to mount the file system I got these errors:&lt;/p&gt;

&lt;p&gt;[ 1025.018232] Lustre: Lustre: Build Version: 2.5.4--CHANGED-2.6.32-431.29.2.el6.atlas.x86_64&lt;br/&gt;
[ 1062.767104] LDISKFS-fs (dm-7): recovery complete&lt;br/&gt;
[ 1062.791599] LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:&lt;br/&gt;
[ 1063.389770] LustreError: 24686:0:(ofd_fs.c:594:ofd_server_data_init()) sultan-OST0000: unsupported read-only filesystem feature(s) 2&lt;br/&gt;
[ 1063.412582] LustreError: 24686:0:(obd_config.c:572:class_setup()) setup sultan-OST0000 failed (-22)&lt;br/&gt;
[ 1063.421818] LustreError: 24686:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.37.248.67@o2ib1: cfg command failed: rc = -22&lt;br/&gt;
[ 1063.433944] Lustre:    cmd=cf003 0:sultan-OST0000  1:dev  2:0  3:f&lt;br/&gt;
[ 1063.440506] LustreError: 15b-f: MGC10.37.248.67@o2ib1: The configuration from log &apos;sultan-OST0000&apos;failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.&lt;br/&gt;
[ 1063.458690] LustreError: 15c-8: MGC10.37.248.67@o2ib1: The configuration from log &apos;sultan-OST0000&apos; failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.&lt;br/&gt;
[ 1063.482398] LustreError: 24600:0:(obd_mount_server.c:1254:server_start_targets()) failed to start server sultan-OST0000: -22&lt;br/&gt;
[ 1063.493822] LustreError: 24600:0:(obd_mount_server.c:1737:server_fill_super()) Unable to start targets: -22&lt;br/&gt;
[ 1063.503768] LustreError: 24600:0:(obd_mount_server.c:847:lustre_disconnect_lwp()) sultan-MDT0000-lwp-OST0000: Can&apos;t end config log sultan-client.&lt;br/&gt;
[ 1063.516947] LustreError: 24600:0:(obd_mount_server.c:1422:server_put_super()) sultan-OST0000: failed to disconnect lwp. (rc=-2)&lt;br/&gt;
[ 1063.528574] LustreError: 24600:0:(obd_config.c:619:class_cleanup()) Device 3 not setup&lt;br/&gt;
[ 1063.539611] Lustre: server umount sultan-OST0000 complete&lt;br/&gt;
[ 1063.545093] LustreError: 24600:0:(obd_mount.c:1330:lustre_fill_super()) Unable to mount /dev/mapper/sultan-ddn-l0 (-22)&lt;br/&gt;
[ 1070.949382] LDISKFS-fs (dm-6): recovery complete&lt;br/&gt;
[ 1070.956045] LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:&lt;br/&gt;
[ 1071.472962] LustreError: 24982:0:(ofd_fs.c:594:ofd_server_data_init()) sultan-OST0004: unsupported read-only filesystem feature(s) 2&lt;br/&gt;
[ 1071.495949] LustreError: 24982:0:(obd_config.c:572:class_setup()) setup sultan-OST0004 failed (-22)&lt;br/&gt;
[ 1071.505140] LustreError: 24982:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.37.248.67@o2ib1: cfg command failed: rc = -22&lt;/p&gt;

&lt;p&gt;Something borked my file system, so I&apos;m going to have to reformat. We really need to increase the test scope to make sure that going from 2.5 to 2.8 works, as well as the reverse. This includes a 2.5 setup with remote directories.&lt;/p&gt;</comment>
                            <comment id="117616" author="di.wang" created="Fri, 5 Jun 2015 19:34:06 +0000"  >&lt;p&gt;James, I just updated the patch. Here is the list, based on master; you can apply these patches in this order:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#/c/14679/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/14679/&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/12825/36&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12825/36&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/14883/5&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/14883/5&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/12282/47&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12282/47&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/12450/45&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12450/45&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/13785/24&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/13785/24&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/13786/25&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/13786/25&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/15161/1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/15161/1&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/15162/1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/15162/1&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/15163/1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/15163/1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That will be the last of the DNE2 patches; there are a few more for fixing racer, but they are probably not needed for the current test.&lt;/p&gt;

&lt;p&gt;Sure, I will add this upgrade test (2.5 with remote directories, upgraded to 2.8) to conf-sanity 32c. And I will try the downgrade manually.&lt;/p&gt;</comment>
                            <comment id="117732" author="simmonsja" created="Mon, 8 Jun 2015 15:07:29 +0000"  >&lt;p&gt;I reformatted the file system and the problems went away, so it was file system corruption that caused this. I know how this happened &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; Originally my file system was built under Lustre 2.5, and I created a bunch of DNE1 directories for testing. When I booted into the Lustre pre-2.8 build, I created a new DNE2 scratch workspace and moved the contents of my old scratch space there. This included the batch of DNE1 directories I had created, and it totally hosed the file system. I bet the reverse is true as well, i.e. placing a DNE2 directory under a DNE1 directory will cause mayhem too.&lt;/p&gt;</comment>
                            <comment id="117758" author="di.wang" created="Mon, 8 Jun 2015 17:12:17 +0000"  >&lt;p&gt;James: thanks. I will add this process to conf-sanity.sh 32c and check.&lt;/p&gt;</comment>
                            <comment id="117762" author="simmonsja" created="Mon, 8 Jun 2015 17:34:49 +0000"  >&lt;p&gt;Before we declare victory: I moved my testing to our Cray system and the problem is still there. Now that the file system corruption is gone, we can get a better idea of what is going on. First I tried with the patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5385&quot; title=&quot;HSM: do not call the JSON log functions if no log is open&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5385&quot;&gt;&lt;del&gt;LU-5385&lt;/del&gt;&lt;/a&gt; and without it, with the same result either way. The Cray node hangs attempting an md5sum of simul. There are no errors client-side, but I do see errors on the OSS:&lt;/p&gt;

&lt;p&gt;[ 5064.214349] LNet: 21859:0:(o2iblnd_cb.c:411:kiblnd_handle_rx()) PUT_NACK from 10.37.202.60@o2ib1&lt;br/&gt;
[ 5064.223501] LustreError: 21859:0:(events.c:447:server_bulk_callback()) event type 5, status -38, desc ffff88103430e000&lt;br/&gt;
[ 5164.196241] LustreError: 22947:0:(ldlm_lib.c:3056:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff8810407fd080 x1503431681379072/t0(0) o3-&amp;gt;dc99049c-a5df-abb6-3de7-890b86319cdf@30@gni1:281/0 lens 608/432 e 4 to 0 dl 1433784541 ref 1 fl Interpret:/0/0 rc 0/0&lt;br/&gt;
[ 5164.220805] Lustre: sultan-OST0000: Bulk IO read error with dc99049c-a5df-abb6-3de7-890b86319cdf (at 30@gni1), client will retry: rc -110&lt;br/&gt;
[ 5175.564732] Lustre: sultan-OST0000: Client dc99049c-a5df-abb6-3de7-890b86319cdf (at 30@gni1) reconnecting&lt;br/&gt;
[ 5175.574451] Lustre: Skipped 1 previous similar message&lt;br/&gt;
[ 5175.579726] Lustre: sultan-OST0000: Connection restored to dc99049c-a5df-abb6-3de7-890b86319cdf (at 30@gni1)&lt;br/&gt;
[ 5175.589709] Lustre: Skipped 272 previous similar messages&lt;br/&gt;
[ 5175.595765] LNet: 21867:0:(o2iblnd_cb.c:411:kiblnd_handle_rx()) PUT_NACK from 10.37.202.61@o2ib1&lt;br/&gt;
[ 5175.605076] LustreError: 21867:0:(events.c:447:server_bulk_callback()) event type 5, status -38, desc ffff8810340cc000&lt;/p&gt;

&lt;p&gt;I pushed logs to ftp.whamcloud.com/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6663&quot; title=&quot;DNE2 directories has very very bad performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6663&quot;&gt;&lt;del&gt;LU-6663&lt;/del&gt;&lt;/a&gt;/june-8-&lt;span class=&quot;error&quot;&gt;&amp;#91;client|oss&amp;#93;&lt;/span&gt;.log for debugging. Perf shows it spinning on a spin lock on the OSS side. One last note of interest: the DNE2 directory doesn&apos;t have this problem, only ordinary directories.&lt;/p&gt;</comment>
                            <comment id="117776" author="di.wang" created="Mon, 8 Jun 2015 18:40:28 +0000"  >&lt;p&gt;I checked the debug log; it seems this slowness still happens between o2ib and gni.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000400:00000010:4.0:1433777204.396986:0:5507:0:(lib-lnet.h:240:lnet_md_free()) kfreed &apos;md&apos;: 128 at ffff8803ce5c81c0 (tot 381648128).
00000400:00000010:4.0:1433777204.396988:0:5507:0:(lib-lnet.h:273:lnet_msg_free()) kfreed &apos;msg&apos;: 352 at ffff8803cf242a00 (tot 381647776).
00000800:00004000:4.0:1433777204.396989:0:5507:0:(gnilnd_cb.c:3445:kgnilnd_check_fma_send_cq()) conn-&amp;gt;gnc_tx_in_use refcount 0
00000800:00004000:4.0:1433777204.396990:0:5507:0:(gnilnd_cb.c:3446:kgnilnd_check_fma_send_cq()) conn ffff8803ea1b3800-&amp;gt;27@gni1-- (3)
00000800:00000040:4.0:1433777204.396992:0:5507:0:(gnilnd_cb.c:3328:kgnilnd_check_fma_send_cq()) SMSG send CQ 0 not ready (data 0x140002253101005) processed 1
00000800:00000040:4.0:1433777204.396993:0:5507:0:(gnilnd_cb.c:3481:kgnilnd_check_fma_rcv_cq()) SMSG RX CQ 0 empty data 0x0 processed 0
00000800:00000040:4.0:1433777204.396994:0:5507:0:(gnilnd_cb.c:3170:kgnilnd_check_rdma_cq()) SEND RDMA CQ 0 empty processed 0
00000800:00000040:4.0:1433777204.396995:0:5507:0:(gnilnd_cb.c:3328:kgnilnd_check_fma_send_cq()) SMSG send CQ 0 not ready (data 0xffffffffa03ce1f9) processed 0
00000800:00000040:4.0:1433777204.396996:0:5507:0:(gnilnd_cb.c:3481:kgnilnd_check_fma_rcv_cq()) SMSG RX CQ 0 empty data 0xffffffffa040d230 processed 0
00000800:00000040:4.0:1433777204.396997:0:5507:0:(gnilnd_cb.c:3170:kgnilnd_check_rdma_cq()) SEND RDMA CQ 0 empty processed 0
00000800:00000040:4.0:1433777204.396999:0:5507:0:(gnilnd_cb.c:5080:kgnilnd_scheduler()) scheduling: found_work 0 busy_loops 75
00000800:00000040:4.0:1433777204.397000:0:5506:0:(gnilnd_cb.c:5084:kgnilnd_scheduler()) awake after schedule
00000800:00000040:4.0:1433777204.397001:0:5506:0:(gnilnd_cb.c:3328:kgnilnd_check_fma_send_cq()) SMSG send CQ 0 not ready (data 0x0) processed 0
00000800:00000040:4.0:1433777204.397002:0:5506:0:(gnilnd_cb.c:3481:kgnilnd_check_fma_rcv_cq()) SMSG RX CQ 0 empty data 0x0 processed 0
00000800:00000040:4.0:1433777204.397003:0:5506:0:(gnilnd_cb.c:3170:kgnilnd_check_rdma_cq()) SEND RDMA CQ 0 empty processed 0
00000800:00000040:4.0:1433777204.397004:0:5506:0:(gnilnd_cb.c:5080:kgnilnd_scheduler()) scheduling: found_work 0 busy_loops 76
00000100:00000400:3.0:1433777204.401048:0:5521:0:(client.c:2003:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1433777092/real 1433777092]  req@ffff8803e9ad99c0 x1503423118714204/t0(0) o3-&amp;gt;sultan-OST0000-osc-ffff8803f36ce800@10.37.248.69@o2ib1:6/4 lens 608/432 e 4 to 1 dl 1433777204 ref 2 fl Rpc:X/2/ffffffff rc 0/-1
00000100:00000200:3.0:1433777204.401055:0:5521:0:(events.c:97:reply_in_callback()) @@@ type 6, status 0  req@ffff8803e9ad99c0 x1503423118714204/t0(0) o3-&amp;gt;sultan-OST0000-osc-ffff8803f36ce800@10.37.248.69@o2ib1:6/4 lens 608/432 e 4 to 1 dl 1433777204 ref 2 fl Rpc:X/2/ffffffff rc 0/-1
00000100:00000200:3.0:1433777204.401060:0:5521:0:(events.c:118:reply_in_callback()) @@@ unlink  req@ffff8803e9ad99c0 x1503423118714204/t0(0) o3-&amp;gt;sultan-OST0000-osc-ffff8803f36ce800@10.37.248.69@o2ib1:6/4 lens 608/432 e 4 to 1 dl 1433777204 ref 2 fl Rpc:X/2/ffffffff rc 0/-1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So the client is using the gni interface, and OST0000 is using the o2ib interface.&lt;/p&gt;

&lt;p&gt;Also, it seems there are network partitions on the connection to OST0000, though I do not know whether the real cause is the network or something else:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:00080000:5.0:1433777204.401753:0:5520:0:(import.c:1170:ptlrpc_connect_interpret()) reconnected to sultan-OST0000_UUID@10.37.248.69@o2ib1 after partition
00000100:00080000:5.0:1433777204.401754:0:5520:0:(import.c:1188:ptlrpc_connect_interpret()) ffff8803ced85000 sultan-OST0000_UUID: changing import state from CONNECTING to RECOVER
00000100:00080000:5.0:1433777204.401756:0:5520:0:(import.c:1488:ptlrpc_import_recovery_state_machine()) ffff8803ced85000 sultan-OST0000_UUID: changing import state from RECOVER to FULL
00000100:02000000:5.0:1433777204.401764:0:5520:0:(import.c:1494:ptlrpc_import_recovery_state_machine()) sultan-OST0000-osc-ffff8803f36ce800: Connection restored to 10.37.248.69@o2ib1 (at 10.37.248.69@o2ib1)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="117803" author="simmonsja" created="Mon, 8 Jun 2015 21:45:45 +0000"  >&lt;p&gt;If the DNE2 directory gives no problem, then it is an issue with small packets.&lt;/p&gt;</comment>
                            <comment id="118136" author="simmonsja" created="Wed, 10 Jun 2015 20:54:49 +0000"  >&lt;p&gt;I tracked down what is causing the bug. Setting map_on_demand=256 for the ko2iblnd driver triggers this bug, which for some reason can only be triggered on a Cray router. Peter, should I open another ticket that properly defines the problem and close this one?&lt;/p&gt;</comment>
                            <comment id="118154" author="yujian" created="Wed, 10 Jun 2015 23:10:47 +0000"  >&lt;p&gt;Yes, James. Peter suggested closing this one and opening a new ticket to track the issue you found. Thank you.&lt;/p&gt;</comment>
                            <comment id="118215" author="pjones" created="Thu, 11 Jun 2015 16:16:54 +0000"  >&lt;p&gt;Yes please! &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="118329" author="adilger" created="Fri, 12 Jun 2015 07:59:02 +0000"  >&lt;p&gt;James, Di, &lt;br/&gt;
did you ever open a bug on the 2.5-&amp;gt;2.8 upgrade/downgrade problem from:&lt;br/&gt;
&lt;a href=&quot;https://jira.hpdd.intel.com/browse/LU-6663?focusedCommentId=117610&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jira.hpdd.intel.com/browse/LU-6663?focusedCommentId=117610&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 1062.791599] LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
[ 1063.389770] LustreError: 24686:0:(ofd_fs.c:594:ofd_server_data_init()) sultan-OST0000: unsupported read-only filesystem feature(s) 2
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The unknown read-only feature appears to be &lt;tt&gt;OBD_ROCOMPAT_IDX_IN_IDIF&lt;/tt&gt;.  That feature shouldn&apos;t automatically be enabled, and needs active participation from the administrator:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; ssize_t
ldiskfs_osd_index_in_idif_seq_write(struct file *file, &lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;char&lt;/span&gt; *buffer,
                                    size_t count, loff_t *off)
{
                LCONSOLE_WARN(&lt;span class=&quot;code-quote&quot;&gt;&quot;%s: OST-index in IDIF has been enabled, &quot;&lt;/span&gt;
                              &lt;span class=&quot;code-quote&quot;&gt;&quot;it cannot be reverted back.\n&quot;&lt;/span&gt;, osd_name(dev));
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; -EPERM;
 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;James, did you set the &lt;tt&gt;index_in_idif&lt;/tt&gt; feature in /proc, or is there a bug here that needs to be filed?  Looking at the code it doesn&apos;t appear that this flag could have been set automatically.&lt;/p&gt;
</comment>
                            <comment id="118393" author="di.wang" created="Fri, 12 Jun 2015 17:00:56 +0000"  >&lt;p&gt;Andreas: I have already added a 2.5-to-2.8 DNE upgrade test to the DNE patch series. I will try these downgrade steps locally to see whether the problem is easy to reproduce.&lt;/p&gt;</comment>
                            <comment id="118494" author="di.wang" created="Sun, 14 Jun 2015 17:03:16 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#/c/15275/1&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/15275/1&lt;/a&gt;  DNE upgrade test cases&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="30670">LU-6724</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="19630">LU-3534</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxehr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>