LFSCK 3: MDT-MDT consistency verification (LU-4788)

[LU-5506] LFSCK 3: Skip orphan objects handling for failed servers Created: 20/Aug/14  Updated: 22/Oct/14  Resolved: 22/Oct/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Technical task Priority: Major
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 15363

 Description   

Lustre is distributed: the components that belong to the same file can reside on several servers. For example, a file's MDT-object and its name entry can reside on different MDTs, and a file's OST-objects reside on OSTs while its metadata is stored on an MDT.

Because of this distribution, if the LFSCK cannot verify some component during the first-stage scanning, then when it handles orphans during the second-stage scanning it is difficult to tell whether a missing component is really corrupted or was simply not verified because of the earlier LFSCK failure.

To avoid improper repairs in such cases, the LFSCK will skip handling some orphans. The safest approach is to skip all orphan handling if the LFSCK hit any failure during the first-stage scanning. But that approach is too conservative: the LFSCK might never complete fully, because it is normal for some servers (MDS or OSS) to fail during the LFSCK scanning.

As an improvement, the LFSCK can record server failure events in the LFSCK tracing file during the first-stage scanning, and then, during the second-stage scanning, skip only the orphans that are related to the failed servers.
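
The difference between the two skipping policies can be illustrated with a small model. This is only a sketch, not the LFSCK implementation: the names TraceFile, first_stage, second_stage and Policy, the server labels, and the representation of an orphan as an object ID plus the set of servers it relates to are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Policy(Enum):
    SKIP_ALL_ON_ANY_FAILURE = 1   # the safest, most conservative approach
    SKIP_FAILED_SERVERS_ONLY = 2  # skip only orphans tied to failed servers


@dataclass
class TraceFile:
    """Hypothetical stand-in for the LFSCK tracing file."""
    failed_servers: set = field(default_factory=set)

    def record_failure(self, server: str) -> None:
        self.failed_servers.add(server)


def first_stage(servers, is_reachable, trace):
    """Scan each server and record those that could not be verified."""
    for server in servers:
        if not is_reachable(server):
            trace.record_failure(server)


def second_stage(orphans, trace, policy):
    """Split orphans into (handled, skipped) lists of object IDs.

    Each orphan is a pair (object_id, related_servers), where
    related_servers is the set of servers holding related components.
    """
    handled, skipped = [], []
    for obj_id, related in orphans:
        if policy is Policy.SKIP_ALL_ON_ANY_FAILURE:
            # Any recorded failure blocks all orphan handling.
            skip = bool(trace.failed_servers)
        else:
            # Only orphans that touch a failed server are skipped.
            skip = bool(related & trace.failed_servers)
        (skipped if skip else handled).append(obj_id)
    return handled, skipped


# Example: OST0001 fails during the first-stage scanning.
trace = TraceFile()
first_stage(["MDT0000", "OST0000", "OST0001"], lambda s: s != "OST0001", trace)
orphans = [
    ("obj-a", {"MDT0000", "OST0000"}),
    ("obj-b", {"MDT0000", "OST0001"}),
]
print(second_stage(orphans, trace, Policy.SKIP_FAILED_SERVERS_ONLY))
print(second_stage(orphans, trace, Policy.SKIP_ALL_ON_ANY_FAILURE))
```

In this model, the per-server policy still handles the orphan whose related servers were all verified, while the conservative policy skips both orphans because a single server failed.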



 Comments   
Comment by nasf (Inactive) [ 20/Aug/14 ]

The patch to skip orphan OST-object handling only for failed OSTs:
http://review.whamcloud.com/10996

Comment by nasf (Inactive) [ 20/Aug/14 ]

The patch to skip orphan MDT-object handling only for failed MDTs:
http://review.whamcloud.com/#/c/11444/

Comment by nasf (Inactive) [ 24/Sep/14 ]

The patch http://review.whamcloud.com/#/c/10996/ has landed.

Comment by nasf (Inactive) [ 22/Oct/14 ]

The patch http://review.whamcloud.com/#/c/11444/ has landed.

Comment by nasf (Inactive) [ 22/Oct/14 ]

The remaining issue will be resolved in LU-5786.

Generated at Sat Feb 10 01:52:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.