[LU-1267] LFSCK II: MDT-OST consistency check/repair Created: 29/Mar/12  Updated: 17/Mar/14  Due: 31/Jul/13  Resolved: 17/Mar/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.6.0

Type: New Feature Priority: Critical
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: LFSCK

Issue Links:
Blocker
is blocked by LU-3335 LFSCK II: MDT-OST OST local consisten... Resolved
is blocked by LU-3336 LFSCK II: MDT-OST OST orphan handling Resolved
is blocked by LU-3995 CLONE - LFSCK II: MDT-OST OST local c... Resolved
Related
is related to LUDOC-195 Complete Lustre Manual updates for LF... Resolved
is related to LUDOC-155 LFSCK Phase II Doc Changes Resolved
is related to LU-4553 LFSCK 5: LFSCK behaviour if an OST is... Open
is related to LU-14 live replacement of OST Resolved
Sub-Tasks:
Key  Summary  Type  Status  Assignee
LU-3423 Create LFSCK II Test Plan and attach ... Technical task Resolved nasf  
LU-3588 use lctl --device option to specify d... Technical task Resolved nasf  
LU-3590 Repair the file which MDT-object has ... Technical task Resolved nasf  
LU-3591 Repair unmatched referenced MDT-objec... Technical task Resolved nasf  
LU-3592 Repair multiple referenced OST-object Technical task Resolved nasf  
LU-3593 Fix inconsistent layout EA Technical task Resolved nasf  
LU-3594 Repair inconsistent file owner Technical task Resolved nasf  
LU-3595 Repair unreferenced OST-object Technical task Resolved nasf  
Story Points: 89
Severity: 3
Epic: layout
Rank (Obsolete): 4024

 Description   

In Lustre, the layout of a striped file (one with a non-zero stripe count) is recorded as an extended attribute (XATTR_NAME_LOV) on its MDT-object on the MDT. This EA contains, for each OST-object, the OST index, the OID or FID of the OST-object, and so on. On the OST side, each OST-object records the MDT-object FID, which indicates the file that the OST-object belongs to. Over the lifetime of an active filesystem, the layout information in the MDT-object's EA may become inconsistent with the information on the OST. The inconsistent cases are as follows:

1. OST-object1 is marked as part of file1 on MDT, but OST-object1 is unassigned or uninitialized on OST.
2. OST-object1 is marked as part of file1 on OST, but there is no record for OST-object1 on MDT.
3. OST-object1 is marked as part of file1 on OST, but MDT records that it belongs to file2.
4. OST-object1 is marked as part of file1 on OST, but both file1 and file2 on MDT claim to own OST-object1.

These inconsistencies can mislead the client or MDS when they access the affected OST-objects, waste space, lose data, or even damage the whole system. In LFSCK phase II, we will implement an online tool to check and repair file layout inconsistencies. The tool will use the inode iterator implemented in LFSCK phase I to scan the whole system, and can be driven together with other LFSCK components.
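
The four cases listed above can be illustrated with a small cross-check between the two views. This is only a sketch under simplifying assumptions, not the LFSCK implementation: the MDT view is modelled as a dict from file to the OST-objects its layout EA references, the OST view as a dict from OST-object to the parent file it records (or None if unassigned), and the function name and case labels are hypothetical.

```python
from collections import defaultdict


def classify_layout(mdt_layouts, ost_parents):
    """Cross-check the MDT view against the OST view (illustrative model only).

    mdt_layouts: {file_id: [ost_object_id, ...]}  objects named in each layout EA
    ost_parents: {ost_object_id: file_id or None} parent recorded on each OST-object
    Returns {ost_object_id: case_label} for inconsistent objects only.
    """
    # Invert the MDT view: which files claim each OST-object?
    claimed_by = defaultdict(set)
    for file_id, objects in mdt_layouts.items():
        for obj in objects:
            claimed_by[obj].add(file_id)

    findings = {}
    for obj in set(claimed_by) | set(ost_parents):
        claimants = claimed_by.get(obj, set())
        parent = ost_parents.get(obj)
        if not claimants:
            # Case 2: the OST-object names a parent, but no MDT layout records it.
            # An unassigned object that nobody claims is simply free.
            if parent is not None:
                findings[obj] = "case 2"
        elif parent is None:
            # Case 1: MDT references the object, OST has it unassigned.
            findings[obj] = "case 1"
        elif len(claimants) > 1:
            # Case 4: more than one file on the MDT claims the same object.
            findings[obj] = "case 4"
        elif parent not in claimants:
            # Case 3: OST says file1, MDT says file2.
            findings[obj] = "case 3"
    return findings


mdt = {"file1": ["o1", "o4"], "file2": ["o3", "o4", "o6"]}
ost = {"o1": None, "o2": "file1", "o3": "file1",
       "o4": "file1", "o5": None, "o6": "file2"}
result = classify_layout(mdt, ost)
```

In this example `o5` (free and unclaimed) and `o6` (both views agree) produce no finding; the other four objects each land in one of the four cases.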

In addition, the owner information (UID and GID) of each OST-object will be verified against that of the MDT-object, and inconsistent values will be fixed to ensure correct quota allocation.
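
The owner check can be sketched in the same simplified model. The function name and data shapes below are hypothetical, and the sketch assumes the MDT-object's UID/GID are treated as the reference values, as the paragraph above suggests.

```python
def find_owner_repairs(mdt_owner, ost_owners):
    """Return the owner each mismatched OST-object should be reset to.

    mdt_owner:  (uid, gid) stored on the MDT-object (assumed authoritative)
    ost_owners: {ost_object_id: (uid, gid)} stored on each OST-object
    """
    return {obj: mdt_owner
            for obj, owner in ost_owners.items()
            if owner != mdt_owner}


repairs = find_owner_repairs(
    (1000, 100),
    {"o1": (1000, 100), "o2": (0, 0), "o3": (1000, 0)},
)
```

Here `o2` and `o3` would be reset to `(1000, 100)`, so the space they use is charged to the same user and group as the file itself.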



 Comments   
Comment by nasf (Inactive) [ 29/Dec/12 ]

This task will be postponed until DNE I and LFSCK 1.5 are completed.

Comment by nasf (Inactive) [ 27/Apr/13 ]

The task is restarted.

Comment by nasf (Inactive) [ 02/Sep/13 ]

All the LFSCK phase II patches have landed on master, as follows:

1) http://review.whamcloud.com/#/c/7145/
2) http://review.whamcloud.com/#/c/7053/
3) http://review.whamcloud.com/#/c/8002/
4) http://review.whamcloud.com/#/c/7146/
5) http://review.whamcloud.com/#/c/6997/
6) http://review.whamcloud.com/#/c/8302/
7) http://review.whamcloud.com/#/c/7666/
8) http://review.whamcloud.com/#/c/7062/
9) http://review.whamcloud.com/#/c/8623/
10) http://review.whamcloud.com/#/c/7087/
11) http://review.whamcloud.com/#/c/7108/
12) http://review.whamcloud.com/#/c/7665/
13) http://review.whamcloud.com/#/c/7156/
14) http://review.whamcloud.com/#/c/9186/
15) http://review.whamcloud.com/#/c/7456/
16) http://review.whamcloud.com/#/c/7517/
17) http://review.whamcloud.com/#/c/7519/
18) http://review.whamcloud.com/#/c/7524/
19) http://review.whamcloud.com/#/c/7743/
20) http://review.whamcloud.com/#/c/9257/
21) http://review.whamcloud.com/#/c/8303/
22) http://review.whamcloud.com/#/c/7810/
23) http://review.whamcloud.com/#/c/8305/
24) http://review.whamcloud.com/#/c/7811/
25) http://review.whamcloud.com/#/c/8694/
26) http://review.whamcloud.com/#/c/7667/

Comment by Lai Siyao [ 30/Dec/13 ]

I have several concerns about the design of LWP and the LFSCK repair threads:
1. IMHO LWP should be a fully functional device, rather than only a place to store the connection. The difference between LWP and OSP should be that LWP doesn't support recovery, so using this device will be simpler. There should also be a device LWD, similar to LOD.
2. In the current code, master and assistant LFSCK threads cooperate on LFSCK. However, as more components are added, e.g. DNE consistency, the master thread needs to communicate with several assistant threads. This is complicated and error-prone. Instead we should avoid inter-process communication and do it like the ptlrpcd/ldlm threads: spawn a pool of worker threads (new threads can be created at runtime if necessary), and let these worker threads pick up and finish their assignments separately.
3. Statahead has proven to work on the client side, and I don't see why LFSCK can't do the same; if it can support bulk statahead, this should be quite efficient.

Comment by Jodi Levi (Inactive) [ 17/Mar/14 ]

All patches landed to Master.
