[LU-1543] Lustre Servers - MDS / OSS Died & fail over took over Created: 20/Jun/12 Updated: 10/Sep/12 Resolved: 10/Sep/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Fabio Verzelloni | Assignee: | Cliff White (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
MDS HW MDT LSI 5480 Pikes Peak OSS HW OST LSI 7900 Router nodes Clients 1 MDS + 1 fail over |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 6378 |
| Description |
|
Dear Support, As I said, the file system remained up and running because the failover servers took over. However, with our old Lustre configuration (version 1.8.7, 1 MDS + 1 MDS failover, 4 OSS), even under heavy stress and with many slowdown messages logged due to heavy I/O load, the MDS and OSS never died. If you need access to our cluster, please let me know (fverzell@cscs.ch) so that we can arrange to create an account. We also currently have a list of tickets that might be related to each other in some respects: http://jira.whamcloud.com/browse/LU-1447 Regards |
| Comments |
| Comment by Peter Jones [ 20/Jun/12 ] |
|
Fabio, we will definitely take an overall view of all your issues when deciding the best approach. Getting remote access to the cluster in question will undoubtedly be useful. I will contact you directly to make those arrangements. Peter |
| Comment by Cliff White (Inactive) [ 20/Jun/12 ] |
|
Can we get a list of the addresses for the Lustre servers? |
| Comment by Liang Zhen (Inactive) [ 21/Jun/12 ] |
|
I think it could be a dup of |
| Comment by Fabio Verzelloni [ 21/Jun/12 ] |
|
The list of the Lustre servers is the following: MDS + Failover OSS + each pair is a failover pair (weisshorn03-04, 05-06, etc.) |
| Comment by Cliff White (Inactive) [ 05/Jul/12 ] |
|
Has there been any word from Cray on access to gnilnd source? Should we close this issue and revisit after the software version change planned ( |
| Comment by Fabio Verzelloni [ 06/Jul/12 ] |
|
Cliff, Thanks |
| Comment by Cory Spitz [ 06/Jul/12 ] |
|
FYI, Cray is working on pushing up the gnilnd into the Lustre tree. The tracking ticket is |
| Comment by James A Simmons [ 29/Aug/12 ] |
|
Any updates? |
| Comment by Cory Spitz [ 29/Aug/12 ] |
|
Well, the Cray LND code has been pushed to |
| Comment by James A Simmons [ 29/Aug/12 ] |
|
I mean, does Fabio still see the problem? |
| Comment by Cliff White (Inactive) [ 04/Sep/12 ] |
|
What is the current state? Is there anything more we can do on this issue? |
| Comment by Cliff White (Inactive) [ 10/Sep/12 ] |
|
I am going to close this issue. Please re-open if you have more information or questions. |