<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:30:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3090] client fails to restore connection to OSTs</title>
                <link>https://jira.whamcloud.com/browse/LU-3090</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Three 2.1.4-3chaos OSS nodes rebooted this morning.  A client login node has failed to reconnect to most of the OSTs on those nodes.  I will attach complete console logs.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# aztec3 /root &amp;gt; lfs check servers | grep -v active
lsc-OST0004-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0008-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST000c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0010-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0014-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0018-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST001c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0020-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0024-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0028-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST002c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0030-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0034-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0038-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0000-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST007f-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0083-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0087-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST008b-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST008f-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0093-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0097-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST009b-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST009f-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST00a3-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST00a7-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST00ab-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST00af-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST00b3-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST007b-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0130-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0134-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0138-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST013c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0140-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0144-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0148-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST014c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0150-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0154-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0158-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST015c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0160-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST0164-osc-ffff880c3043a000: check error: Resource temporarily unavailable
lsc-OST012c-osc-ffff880c3043a000: check error: Resource temporarily unavailable
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Example dmesg from client:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# aztec3 /root &amp;gt; dmesg | grep OST012c
Lustre: 7485:0:(client.c:1820:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1364918372/real 1364918372] req@ffff88011258ec00 x1430682336999865/t0(0) o400-&amp;gt;lsc-OST012c-osc-ffff880c3043a000@172.19.1.121@o2ib100:28/4 lens 192/192 e 0 to 1 dl 1364918478 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: lsc-OST012c-osc-ffff880c3043a000: Connection to lsc-OST012c (at 172.19.1.121@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
Lustre: 7485:0:(client.c:1820:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1364918397/real 1364918397] req@ffff8804cf6c4800 x1430682337000483/t0(0) o400-&amp;gt;lsc-OST012c-osc-ffff880c3043a000@172.19.1.121@o2ib100:28/4 lens 192/192 e 0 to 1 dl 1364918503 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 7485:0:(client.c:1820:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1364918447/real 1364918447] req@ffff880633f37c00 x1430682337002326/t0(0) o400-&amp;gt;lsc-OST012c-osc-ffff880c3043a000@172.19.1.121@o2ib100:28/4 lens 192/192 e 0 to 1 dl 1364918553 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Example dmesg for one server:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# sumom23 /root &amp;gt; dmesg | grep 172.16.66.53
Lustre: lsc-OST0156: Client ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp) refused reconnection, still busy with 5 active RPCs
Lustre: lsc-OST0156: Bulk IO read error with ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp), client will retry: rc -110
Lustre: lsc-OST0156: Bulk IO read error with ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp), client will retry: rc -110
Lustre: lsc-OST0156: Bulk IO read error with ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp), client will retry: rc -110
Lustre: lsc-OST013e: Bulk IO read error with ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp), client will retry: rc -110
Lustre: lsc-OST0156: Client ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp) refused reconnection, still busy with 1 active RPCs
LustreError: 23397:0:(ldlm_lib.c:2715:target_bulk_io()) @@@ Reconnect on bulk PUT  req@ffff8802fb0af000 x1430682219726723/t0(0) o3-&amp;gt;ea981a59-5970-b37c-8c49-7b6886344dc2@172.16.66.53@tcp:0/0 lens 456/400 e 1 to 0 dl 1364611876 ref 1 fl Interpret:/2/0 rc 0/0
Lustre: lsc-OST0156: Bulk IO read error with ea981a59-5970-b37c-8c49-7b6886344dc2 (at 172.16.66.53@tcp), client will retry: rc -110
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;LLNL-bug-id: TOSS-2006&lt;/p&gt;</description>
                <environment>&lt;a href=&quot;https://github.com/chaos/lustre/commits/2.1.4-3chaos&quot;&gt;https://github.com/chaos/lustre/commits/2.1.4-3chaos&lt;/a&gt;</environment>
        <key id="18202">LU-3090</key>
            <summary>client fails to restore connection to OSTs</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="nedbass">Ned Bass</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Tue, 2 Apr 2013 18:33:44 +0000</created>
                <updated>Wed, 13 Oct 2021 03:17:49 +0000</updated>
                            <resolved>Wed, 13 Oct 2021 03:17:49 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="55323" author="nedbass" created="Tue, 2 Apr 2013 18:48:02 +0000"  >&lt;p&gt;Attached console logs from OSS nodes sumom&lt;span class=&quot;error&quot;&gt;&amp;#91;1,12,23&amp;#93;&lt;/span&gt; and client aztec3.&lt;/p&gt;</comment>
                            <comment id="55326" author="nedbass" created="Tue, 2 Apr 2013 19:48:05 +0000"  >&lt;p&gt;From what I can gather, the &quot;Resource temporarily unavailable&quot; error from lfs check servers originates at line 1073 in ptlrpc_import_delay_req():&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;1067         } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (req-&amp;gt;rq_send_state != imp-&amp;gt;imp_state) {                     
1068                 &lt;span class=&quot;code-comment&quot;&gt;/* invalidate in progress - any requests should be drop */&lt;/span&gt;     
1069                 &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (cfs_atomic_read(&amp;amp;imp-&amp;gt;imp_inval_count) != 0) {             
1070                         DEBUG_REQ(D_ERROR, req, &lt;span class=&quot;code-quote&quot;&gt;&quot;invalidate in flight&quot;&lt;/span&gt;);       
1071                         *status = -EIO;                                        
1072                 } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (imp-&amp;gt;imp_dlm_fake || req-&amp;gt;rq_no_delay) {            
1073                         *status = -EWOULDBLOCK;                                
1074                 } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; {                                                       
1075                         delay = 1;                                             
1076                 }                                                              
1077         }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="55327" author="cliffw" created="Tue, 2 Apr 2013 19:48:07 +0000"  >&lt;p&gt;Can you get stack traces from the servers and client? (echo t &amp;gt; /proc/sysrq-trigger)&lt;/p&gt;</comment>
                            <comment id="55328" author="nedbass" created="Tue, 2 Apr 2013 19:54:40 +0000"  >&lt;p&gt;/proc/fs/lustre/osc/ data for one of the disconnected OSTs.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/active
1                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/blocksize
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/blocksize: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/checksum_type
crc32 [adler]                                                                   
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/checksums
1                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/connect_flags
flags=0x8d0a3440478                                                             
write_grant                                                                     
server_lock                                                                     
version                                                                         
request_portal                                                                  
truncate_lock                                                                   
max_byte_per_rpc                                                                
early_lock_cancel                                                               
adaptive_timeouts                                                               
lru_resize                                                                      
alt_checksum_algorithm                                                          
version_recovery                                                                
full20                                                                          
64bithash                                                                       
object_max_bytes                                                                
einprogress                                                                     
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/contention_seconds
0                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/create_count
32                                                                              
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/cur_dirty_bytes
0                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/cur_grant_bytes
2097152                                                                         
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/cur_lost_grant_bytes
0                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/destroys_in_flight
0                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/filesfree
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/filesfree: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/filestotal
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/filestotal: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/grant_shrink_interval
1200                                                                            
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/import
import:                                                                         
    name: lsc-OST0004-osc-ffff880c3043a000                                      
    target: lsc-OST0004_UUID                                                    
    state: REPLAY_WAIT                                                          
    connect_flags: [write_grant, server_lock, version, request_portal, truncate_lock, max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, alt_checksum_algorithm, version_recovery, full20, 64bithash, object_max_bytes, einprogress]
    import_flags: [replayable, pingable]                                        
    connection:                                                                 
       failover_nids: [172.19.1.101@o2ib100, 172.19.1.103@o2ib100]              
       current_connection: 172.19.1.101@o2ib100                                 
       connection_attempts: 40                                                  
       generation: 1                                                            
       in-progress_invalidations: 0                                             
    rpcs:                                                                       
       inflight: 2                                                              
       unregistering: 0                                                         
       timeouts: 40                                                             
       avg_waittime: 133771 usec                                                
    service_estimates:                                                          
       services: 45 sec                                                         
       network: 45 sec                                                          
    transactions:                                                               
       last_replay: 0                                                           
       peer_committed: 133225247867                                             
       last_checked: 133225247867                                               
    read_data_averages:                                                         
       bytes_per_rpc: 811357                                                    
       usec_per_rpc: 347201                                                     
       MB_per_sec: 2.33                                                         
    write_data_averages:                                                        
       bytes_per_rpc: 248472                                                    
       usec_per_rpc: 161136                                                     
       MB_per_sec: 1.54                                                         
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytesavail
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytesavail: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytesfree
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytesfree: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytestotal
cat: /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/kbytestotal: Resource temporarily unavailable
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/lockless_truncate
0                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/max_create_count
20000                                                                           
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/max_dirty_mb
32                                                                              
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/max_pages_per_rpc
256                                                                             
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/max_rpcs_in_flight
8                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/osc_stats
snapshot_time:         1364932311.305338 (secs.usecs)                           
lockless_write_bytes            0                                               
lockless_read_bytes             0                                               
lockless_truncate               0                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/ost_conn_uuid
172.19.1.101@o2ib100                                                            
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/ost_server_uuid
lsc-OST0004_UUID        REPLAY_WAIT                                             
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/ping
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/prealloc_last_id
1                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/prealloc_next_id
2                                                                               
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/resend_count
10                                                                              
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/rpc_stats
snapshot_time:         1364932311.309855 (secs.usecs)                           
read RPCs in flight:  0                                                         
write RPCs in flight: 0                                                         
pending write pages:  0                                                         
pending read pages:   0                                                         
                                                                                
                        read                    write                           
pages per rpc         rpcs   % cum % |       rpcs   % cum %                     
1:                    2323   2   2   |         26  74  74                       
2:                     268   0   3   |          0   0  74                       
4:                     402   0   3   |          0   0  74                       
8:                     551   0   4   |          0   0  74                       
16:                   1097   1   5   |          0   0  74                       
32:                   2104   2   7   |          0   0  74                       
64:                   4201   4  12   |          0   0  74                       
128:                  8408   9  22   |          1   2  77                       
256:                 65943  77 100   |          8  22 100                       
                                                                                
                        read                    write                           
rpcs in flight        rpcs   % cum % |       rpcs   % cum %                     
0:                    4407   5   5   |         30  85  85                       
1:                    8018   9  14   |          5  14 100                       
2:                   10761  12  27   |          0   0 100                       
3:                   11572  13  40   |          0   0 100                       
4:                   11097  13  53   |          0   0 100                       
5:                    9525  11  64   |          0   0 100                       
6:                    7767   9  74   |          0   0 100                       
7:                   21011  24  98   |          0   0 100                       
8:                    1108   1  99   |          0   0 100                       
9:                      31   0 100   |          0   0 100                       
                                                                                
                        read                    write                           
offset                rpcs   % cum % |       rpcs   % cum %                     
0:                   66045  77  77   |         31  88  88                       
1:                     724   0  78   |          3   8  97                       
2:                     664   0  79   |          1   2 100                       
4:                     463   0  79   |          0   0 100                       
8:                     578   0  80   |          0   0 100                       
16:                   1132   1  81   |          0   0 100                       
32:                   2184   2  84   |          0   0 100                       
64:                   4456   5  89   |          0   0 100                       
128:                  9051  10 100   |          0   0 100                       
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/srpc_contexts
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/srpc_info
rpc flavor:    null                                                             
bulk flavor:   null                                                             
flags:         -                                                                
id:            -1                                                               
refcount:      1280                                                             
nctx:          1                                                                
gc internal    0                                                                
gc next        0                                                                
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/state
current_state: REPLAY_WAIT                                                      
state_history:                                                                  
 - [1364613239, CONNECTING]                                                     
 - [1364613239, REPLAY]                                                         
 - [1364613239, REPLAY_LOCKS]                                                   
 - [1364613239, REPLAY_WAIT]                                                    
 - [1364613520, CONNECTING]                                                     
 - [1364613520, DISCONN]                                                        
 - [1364613564, CONNECTING]                                                     
 - [1364613564, RECOVER]                                                        
 - [1364613564, FULL]                                                           
 - [1364916472, DISCONN]                                                        
 - [1364916472, CONNECTING]                                                     
 - [1364916472, DISCONN]                                                        
 - [1364916522, CONNECTING]                                                     
 - [1364916522, REPLAY]                                                         
 - [1364916522, REPLAY_LOCKS]                                                   
 - [1364916522, REPLAY_WAIT]                                                    
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/stats
snapshot_time             1364932311.312475 secs.usecs                          
req_waittime              231719 samples [usec] 186 25613640 30997468460 20968321741396648
req_active                231760 samples [reqs] 1 10 589912 2877292             
read_bytes                85297 samples [bytes] 4096 1048576 69206327296 66376346221150208
write_bytes               35 samples [bytes] 180 1048576 8696534 8863953230766  
ost_setattr               3 samples [usec] 407 586 1461 728069                  
ost_read                  85297 samples [usec] 378 3008009 29615267854 19288318281959918
ost_write                 35 samples [usec] 9483 942569 5639773 3086328640697   
ost_destroy               63051 samples [usec] 254 5167025 621982709 541938684985359
ost_connect               7 samples [usec] 338 166069 232175 28686879269        
ost_punch                 121 samples [usec] 260 1072093 5498095 2397196392713  
ost_statfs                1194 samples [usec] 2250 744005 15761933 3826878647205
ldlm_cancel               238 samples [usec] 186 769820 6624432 2268731713178   
obd_ping                  17144 samples [usec] 315 25613640 613286727 1108431628095511
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/timeouts
last reply : 1364916601, 0d4h21m50s ago                                         
network    : cur  45  worst  50 (at 1364611097, 3d17h13m34s ago)   1   1   1   1
portal 28  : cur  45  worst  45 (at 1364404853, 6d2h30m58s ago)  45  45  45  45 
portal 7   : cur  45  worst  45 (at 1364406302, 6d2h06m49s ago)  45   0   0   0 
portal 17  : cur  45  worst  45 (at 1364422952, 5d21h29m19s ago)  45   0   0   0
portal 6   : cur  45  worst 600 (at 1364539917, 4d12h59m54s ago)  45  45  45  45
------------------------------------- /proc/fs/lustre/osc/lsc-OST0004-osc-ffff880c3043a000/uuid
ea981a59-5970-b37c-8c49-7b6886344dc2        
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="55333" author="nedbass" created="Tue, 2 Apr 2013 21:04:46 +0000"  >&lt;p&gt;Attached process backtraces for the client (sysrq-t and sysrq-l).  One server crashed during sysrq-l, so I&apos;m not going to try the others.  The client is probably more relevant anyway, since it seems to be stuck in REPLAY_WAIT, blissfully unaware that it was evicted.&lt;/p&gt;</comment>
                            <comment id="55338" author="nedbass" created="Tue, 2 Apr 2013 21:29:42 +0000"  >&lt;p&gt;Attaching the &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/12475/12475_sysrq-t.sumom1&quot; title=&quot;sysrq-t.sumom1 attached to LU-3090&quot;&gt;sysrq-t.sumom1&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt; log for one of the servers.  This is what was captured before the server crashed.&lt;/p&gt;</comment>
                            <comment id="55339" author="pjones" created="Tue, 2 Apr 2013 21:36:21 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="55740" author="laisiyao" created="Mon, 8 Apr 2013 14:41:13 +0000"  >&lt;p&gt;The analysis looks reasonable: the client seems to be stuck in REPLAY_WAIT. I&apos;m wondering whether it is truly stuck there or is looping trying to reconnect; could you confirm this?&lt;/p&gt;

&lt;p&gt;I haven&apos;t yet found the code that would cause this hang; I will review the related code.&lt;/p&gt;</comment>
                            <comment id="58927" author="nedbass" created="Mon, 20 May 2013 20:11:22 +0000"  >&lt;p&gt;Lai, sorry but the client node crashed before I was able to confirm whether it was stuck or in a loop.&lt;/p&gt;</comment>
                            <comment id="67506" author="mdiep" created="Wed, 25 Sep 2013 05:19:09 +0000"  >&lt;p&gt;This is similar to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3998&quot; title=&quot;ll_ost_io process hung after mounting 1.8.9 client on 2.4.1 server (after upgraded)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3998&quot;&gt;&lt;del&gt;LU-3998&lt;/del&gt;&lt;/a&gt;. Please let me know how I can assist in debugging this.&lt;/p&gt;</comment>
                            <comment id="67507" author="mdiep" created="Wed, 25 Sep 2013 06:18:32 +0000"  >&lt;p&gt;Hi Ned,&lt;/p&gt;

&lt;p&gt;Does your cluster have any Lustre routers?&lt;/p&gt;</comment>
                            <comment id="87646" author="morrone" created="Thu, 26 Jun 2014 23:42:41 +0000"  >&lt;p&gt;Yes, we have lots of Lustre routers.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="12470" name="console.aztec3" size="68576" author="nedbass" created="Tue, 2 Apr 2013 18:48:02 +0000"/>
                            <attachment id="12471" name="console.sumom1" size="232565" author="nedbass" created="Tue, 2 Apr 2013 18:48:02 +0000"/>
                            <attachment id="12472" name="console.sumom12" size="38513" author="nedbass" created="Tue, 2 Apr 2013 18:48:02 +0000"/>
                            <attachment id="12473" name="console.sumom23" size="20412" author="nedbass" created="Tue, 2 Apr 2013 18:48:02 +0000"/>
                            <attachment id="12474" name="sysrq-t.aztec3" size="1307360" author="nedbass" created="Tue, 2 Apr 2013 21:04:46 +0000"/>
                            <attachment id="12475" name="sysrq-t.sumom1" size="1712245" author="nedbass" created="Tue, 2 Apr 2013 21:29:42 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 26 Jun 2014 18:33:44 +0000</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvmwn:</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7507</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 2 Apr 2013 18:33:44 +0000</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>