[LU-953] OST connection lost Created: 29/Dec/11 Updated: 12/Sep/13 Resolved: 12/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0, Lustre 1.8.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Liang Zhen (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
lustre-1.8.6.81 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 7037 |
| Description |
|
After upgrading to Lustre 1.8.6 and OFED 1.5.3.1 we have started to see OST<->MDT connection issues. === ERROR ON MDS === |
| Comments |
| Comment by Peter Jones [ 29/Dec/11 ] |
|
Lsi Can you please comment on this issue? Thanks Peter |
| Comment by Cliff White (Inactive) [ 03/Jan/12 ] |
|
We need to have more information. Can you please attach the MDS system log for the 5 hours prior to this event, and the system logs from the node providing nid 10.151.25.157@o2ib for the same time period. |
| Comment by James A Simmons [ 23/Mar/12 ] |
|
I also saw this problem with Lustre pre-2.2 and OFED 1.5.3 on RHEL5. Upgrading to OFED 1.5.4 made the problem go away for me. Lustre pre-2.2 without OFED on RHEL6 also shows this error, but there it appears to be a minor problem. Have you seen severe problems with this? |
| Comment by James A Simmons [ 27/Mar/12 ] |
|
Okay I just moved to OFED 1.5.4.1 on rhel6 and I still see this issue. |
| Comment by James A Simmons [ 27/Mar/12 ] |
|
Also I want to comment that this is affecting stripe placement on our OSSs. For example, on our test file system I set the stripe count to 28, which is the total number of OSTs I have (each OSS has 7 OSTs). Doing an lfs getstripe on a file (in this case testfile.out.00000009) shows that all the object creates happen on one OSS. This is the case for all files. Mahmoud, can you verify that you are seeing this behavior as well? Because of this the best performance I get for writing out a file is 250 MB/s versus the 2.5 GB/s I got before. This is a blocker for deploying in a production environment. |
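A minimal sketch of how the striping behavior above can be reproduced and inspected; the mount point, directory, and file names are hypothetical:

    # set a stripe count of 28 on a test directory
    lfs setstripe -c 28 /mnt/testfs/stripe_test
    # write a file and check which OSTs received its objects
    dd if=/dev/zero of=/mnt/testfs/stripe_test/testfile.out bs=1M count=1024
    lfs getstripe /mnt/testfs/stripe_test/testfile.out
    # with the bug, the obdidx column lists only the 7 OSTs of a single OSS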
| Comment by Ian Colle (Inactive) [ 05/Apr/12 ] |
|
Also observed in Lustre 2.2 at ORNL. |
| Comment by Peter Jones [ 05/Apr/12 ] |
|
Liang Could you please help with this one? Thanks Peter |
| Comment by Liang Zhen (Inactive) [ 06/Apr/12 ] |
|
Did OFED complain about anything while you saw this issue? Also, could you please turn on neterror printing so we can check whether there is any LNet/o2iblnd problem (echo +1 > /proc/sys/lnet/printk). Liang |
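For reference, a minimal sketch of the debug setting requested above; checking the kernel log with dmesg is just one way to see the resulting messages:

    # enable network error console messages
    echo +1 > /proc/sys/lnet/printk
    # verify the current console log mask
    cat /proc/sys/lnet/printk
    # LNet/o2iblnd errors will then show up in the kernel log
    dmesg | grep -i -e lnet -e o2ib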
| Comment by James A Simmons [ 10/Apr/12 ] |
|
Mahmoud, can you try the following patch against your 1.8 source? http://review.whamcloud.com/#change,1797 For me it seems to have helped. Let me know if it helps with your problem as well. I will be doing more testing on my side. I still have the strange striping pattern, though. |
| Comment by James A Simmons [ 11/Apr/12 ] |
|
Managed to collect logs for this problem and hand them off to Oleg. |
| Comment by Oleg Drokin [ 11/Apr/12 ] |
|
Looking at the ORNL logs from yesterday I see that the MDS is constantly trying to connect to OST0001 at the address of oss1 (I assume, because that's where the connections end up, reported as "no such OST here"). |
| Comment by James A Simmons [ 12/Apr/12 ] |
|
I attached my build scripts to see if it is indeed a config error. I will test with the llmount.sh script as well. |
| Comment by James A Simmons [ 12/Apr/12 ] |
|
Mahmoud, do you format your OSTs with --index="some number"? We do that at the lab to allow parallel mounting of the OSTs, and it appears to be causing problems. I'm going to do a test format without using the index to see if we still have the problems. |
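A minimal sketch of the two formatting styles being compared; the filesystem name, MGS NID, and device paths are hypothetical:

    # format an OST with an explicit index (the style used at the lab)
    mkfs.lustre --ost --fsname=testfs --mgsnode=10.0.0.1@o2ib --index=5 /dev/sdb
    # format an OST without --index and let the MGS assign one at first mount
    mkfs.lustre --ost --fsname=testfs --mgsnode=10.0.0.1@o2ib /dev/sdc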
| Comment by Mahmoud Hanafi [ 18/Apr/12 ] |
|
Sorry for the late reply. |
| Comment by James A Simmons [ 19/Apr/12 ] |
|
No problem about the delay. I have done some tracking down of the problem and discovered how to replicate this issue. The problem only shows up when formatting the OSTs with --index="number". What causes the problem is the mounting order. If you mount MGS > MDS > OSS(s), no problems will show up. If you mount MGS > OSS(s) > MDS, then you will experience this problem. Here is an extra bit of info: if you format with an OST index, mount in the MGS > MDS > OSS order, then unmount the file system and remount in the order MGS > OSS(s) > MDS, you will not run into the connection problem. This tells you the problem is wrong data being written to the llog on the MDS. For some reason the data sent by the OSS to the OSC layer on the MDS is different when the MDS signals the OSS to send its configuration data versus the OSS sending its configuration data to an MDS that is already available. |
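A minimal sketch of the two mount orders described above, assuming separate MGS, MDT, and OST devices; all device and mount paths are hypothetical:

    # order that triggers the problem
    mgs# mount -t lustre /dev/mgs_dev  /mnt/mgs
    oss# mount -t lustre /dev/ost0_dev /mnt/ost0   # OSTs mounted before the MDT
    mds# mount -t lustre /dev/mdt_dev  /mnt/mdt    # connection errors appear here
    # order that works
    mgs# mount -t lustre /dev/mgs_dev  /mnt/mgs
    mds# mount -t lustre /dev/mdt_dev  /mnt/mdt
    oss# mount -t lustre /dev/ost0_dev /mnt/ost0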
| Comment by James A Simmons [ 19/Apr/12 ] |
|
Another interesting clue: on the MDS, if you do for i in $(ls /proc/fs/lustre/osc/*-OST*/ost_conn_uuid); do cat $i; done you will see that all the NIDs are exactly the same. |
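The same check written out more readably; the glob pattern is an assumption about the intended path, and the interpretation follows James's observation above:

    # on the MDS, print the OSS NID each OSC import is connected to
    for i in /proc/fs/lustre/osc/*-OST*/ost_conn_uuid; do
        echo "$i: $(cat $i)"
    done
    # expected: each OST import reports the NID of its own OSS
    # with this bug: every import reports the same NID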
| Comment by James A Simmons [ 15/Nov/12 ] |
|
Okay, I just tested this again on Lustre 2.3.54 and it still exists. |
| Comment by James A Simmons [ 28/Feb/13 ] |
|
Tested this bug on Lustre 2.3.61 and the problem seems to have been fixed. |