[LU-12754] ALL OST lost contact with OSS after disk failures Created: 12/Sep/19  Updated: 12/Sep/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Question/Request Priority: Critical
Reporter: Helen Wang Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

CentOS 7.2


Attachments: File dmesg.save     HTML File messages    
Epic/Theme: 2.7, lustre
Rank (Obsolete): 9223372036854775807

 Description   

I am writing to ask whether you can help our group with an emergency issue on our Lustre system. The system is running CentOS 7.2 with Lustre 2.7.

There are two OSSes (oss1 and oss2) and two MDSes (mds1 and mds2) running as failover servers. After disk errors were reported on both oss1 and oss2, I managed to reboot them, and both have lost contact with all OSTs.

I'd like to ask your advice on how to recover it. We have over 400 TB of data and desperately need it back.

