[LU-5802] LFSCK 5: avoid the (direct) interaction between MDD and LFSCK under the case of insufficient space to hold all linkEA entries - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.7.0
Labels:
None

Severity:
3
Rank (Obsolete):
16269

Description

For a multiple-linked MDT-object, the extended attribute space is limited. As more hard links are made to the MDT-object, more linkEA entries are added to its linkEA until there is not enough space to hold new linkEA entries. In that case, only the "nlink" attribute is increased when new hard links are made to the MDT-object.
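The overflow behaviour described above can be illustrated with a small toy model. This is only a sketch: the size limit, the per-entry overhead, and the names `MdtObject`, `LINKEA_MAX_BYTES` and `ENTRY_OVERHEAD` are assumptions made for illustration and do not reflect the real on-disk linkEA format.

```python
from dataclasses import dataclass, field

# Hypothetical limits, chosen small so the overflow is easy to see.
LINKEA_MAX_BYTES = 64
ENTRY_OVERHEAD = 18


@dataclass
class MdtObject:
    nlink: int = 0
    linkea: list = field(default_factory=list)  # (parent_fid, name) pairs

    def linkea_bytes(self) -> int:
        return sum(ENTRY_OVERHEAD + len(name.encode()) for _, name in self.linkea)

    def add_hard_link(self, parent_fid: str, name: str) -> bool:
        """Add a name entry; return True if it was also recorded in the linkEA."""
        self.nlink += 1  # "nlink" is always increased
        needed = ENTRY_OVERHEAD + len(name.encode())
        if self.linkea_bytes() + needed > LINKEA_MAX_BYTES:
            return False  # no room left: the linkEA falls behind "nlink"
        self.linkea.append((parent_fid, name))
        return True


obj = MdtObject()
results = [obj.add_hard_link("dir-fid", f"name{i}") for i in range(4)]
```

With these assumed sizes, the first two links fit in the linkEA and the last two do not, so the object ends with `nlink == 4` but only two linkEA entries.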

Namespace LFSCK checks whether the linkEA matches the "nlink" attribute, and may update the "nlink" attribute according to the linkEA if the LFSCK is sure that the linkEA is trustworthy. So making the linkEA trustworthy is very important. Generally, multiple-linked MDT-objects are rare, and cases where the linkEA entries exceed the linkEA space limit are rarer still. So during the namespace LFSCK first-stage scanning, all the known name entries are recorded in the object's linkEA; then, during the second-stage scanning, the LFSCK can update the "nlink" attribute according to the linkEA entry count, which is equal to the known name entry count.
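As a rough sketch of the two-stage idea, assuming the linkEA never overflows: the first stage walks the directories and records every name entry against its target, and the second stage resets "nlink" to the recorded entry count. The function names and the dictionary-based namespace are hypothetical and are not the LFSCK implementation.

```python
from collections import defaultdict


def first_stage(directories, linkea):
    """Walk every directory and record each name entry in the target's linkEA."""
    for parent_fid, entries in directories.items():
        for name, child_fid in entries.items():
            linkea[child_fid].add((parent_fid, name))


def second_stage(linkea, nlink):
    """Set nlink to the linkEA entry count; return the FIDs that were repaired."""
    repaired = []
    for fid, links in linkea.items():
        if nlink.get(fid) != len(links):
            nlink[fid] = len(links)
            repaired.append(fid)
    return repaired


# Toy namespace: fid1 has two names, but its stored nlink is wrong.
directories = {"dirA": {"a": "fid1", "b": "fid2"}, "dirB": {"c": "fid1"}}
nlink = {"fid1": 5, "fid2": 1}
linkea = defaultdict(set)

first_stage(directories, linkea)
repaired = second_stage(linkea, nlink)
```

Here only `fid1` is repaired, and its "nlink" becomes 2. The repair is only safe because every known name entry fit in the linkEA, which is exactly the assumption the corner cases below break.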

Generally, the above mechanism works well. But there are two corner cases to consider:
1) If the target MDT-object is a multiple-linked one, and during the first-stage scanning the namespace LFSCK finds that it cannot record all the known name entries in the linkEA, then it should use some flag to skip the "nlink" verification of this MDT-object in the second-stage scanning. There are two possible ways to do that:
1.1) Record such MDT-object's FID in the namespace LFSCK trace file.
1.2) Add some flag (such as SKIP_NLINK) in the MDT-object's linkEA header.
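The two options can be contrasted in a minimal sketch. The names below (`Lfsck`, `LinkEA`, the `flags` field and the bit value of `SKIP_NLINK`) are hypothetical; the ticket only proposes the flag name and does not define a layout.

```python
from dataclasses import dataclass, field

SKIP_NLINK = 0x1  # hypothetical bit value for the proposed header flag


@dataclass
class LinkEA:
    flags: int = 0                               # stands in for the linkEA header
    entries: list = field(default_factory=list)  # (parent_fid, name) pairs


@dataclass
class Lfsck:
    trace_file: set = field(default_factory=set)  # option 1.1: FIDs to skip

    def mark_by_trace(self, fid: str) -> None:
        self.trace_file.add(fid)

    def should_skip_by_trace(self, fid: str) -> bool:
        return fid in self.trace_file


def mark_by_flag(linkea: LinkEA) -> None:
    """Option 1.2: the marker lives in the object's own linkEA header."""
    linkea.flags |= SKIP_NLINK


def should_skip_by_flag(linkea: LinkEA) -> bool:
    return bool(linkea.flags & SKIP_NLINK)


lfsck = Lfsck()
lfsck.mark_by_trace("fid1")

ea = LinkEA()
mark_by_flag(ea)
```

The difference that matters is where the state lives: with 1.1) it belongs to the LFSCK run, while with 1.2) it is stored on the object and persists beyond the scan.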

2) If the multiple-linked MDT-object has enough space to hold all the known name entries when the LFSCK scans it during the first-stage scanning, but its linkEA overflows after that, then the LFSCK also needs to be told to skip the "nlink" verification of this MDT-object in the second-stage scanning.

Currently, the LFSCK uses 1.1) as the solution. The shortcoming of 1.1) is that the MDD needs to talk to the LFSCK to record the FID when the linkEA overflows, and this interaction blurs the layering of the stack. Solution 1.2) does not seem suitable for case 2), because once the "SKIP_NLINK" flag is set in the linkEA header, it cannot be removed. The LFSCK cannot know whether more hard links will be made to the target MDT-object after it scans the object, so even if the namespace LFSCK believes that it knows all the name entries of the MDT-object, it may have missed some. Therefore, when it finds the "SKIP_NLINK" flag in the linkEA header during the second-stage scanning, it can neither remove the flag nor update the "nlink" attribute according to the linkEA.
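The cross-layer call that this ticket wants to avoid can be sketched as follows. This is an illustrative model only: the classes `Mdd` and `Lfsck`, the method `note_linkea_overflow`, and the entry-count cap are assumptions, not the actual Lustre interfaces.

```python
class Lfsck:
    """Toy namespace LFSCK keeping the option 1.1 trace file as a set of FIDs."""

    def __init__(self):
        self.skip_nlink_fids = set()

    def note_linkea_overflow(self, fid):
        # Called from the MDD link path: this is the cross-layer call at issue.
        self.skip_nlink_fids.add(fid)

    def second_stage_nlink(self, fid, nlink, linkea_count):
        """Return the nlink value the second-stage scan would leave in place."""
        if fid in self.skip_nlink_fids:
            return nlink  # linkEA is known to be incomplete, so keep nlink
        return linkea_count


class Mdd:
    """Toy MDD layer with a hypothetical cap on linkEA entries per object."""

    def __init__(self, lfsck, max_entries):
        self.lfsck = lfsck
        self.max_entries = max_entries
        self.nlink = {}
        self.linkea = {}

    def link(self, fid, parent_fid, name):
        self.nlink[fid] = self.nlink.get(fid, 0) + 1
        entries = self.linkea.setdefault(fid, [])
        if len(entries) >= self.max_entries:
            self.lfsck.note_linkea_overflow(fid)  # MDD talks to LFSCK directly
        else:
            entries.append((parent_fid, name))


lfsck = Lfsck()
mdd = Mdd(lfsck, max_entries=2)
mdd.link("fid1", "dirA", "a")
mdd.link("fid1", "dirA", "b")  # a first-stage scan here sees a complete linkEA
mdd.link("fid1", "dirB", "c")  # the overflow happens after the object was scanned
final_nlink = lfsck.second_stage_nlink(
    "fid1", mdd.nlink["fid1"], len(mdd.linkea["fid1"])
)
```

Because the overflow was reported, the second stage keeps "nlink" at 3 instead of resetting it to the two entries held in the linkEA. The cost is that `Mdd.link` has to call into `Lfsck`, which is the layering concern raised above.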

Please refer to the following link for more discussion:
http://review.whamcloud.com/#/c/11516/21

Issue Links

is related to

LU-8569 Sharded DNE directory full of files that don't exist

Resolved

People

Assignee:: nasf (Inactive)

Reporter:: nasf (Inactive)

Votes:: 0

Watchers:: 6

Dates

Created:: 24/Oct/14 12:25 AM

Updated:: 30/Jan/22 9:50 AM

Resolved:: 30/Jan/22 9:50 AM