[LU-5802] LFSCK 5: avoid the (direct) interaction between MDD and LFSCK under the case of insufficient space to hold all linkEA entries Created: 24/Oct/14  Updated: 30/Jan/22  Resolved: 30/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Related
is related to LU-8569 Sharded DNE directory full of files t... Resolved
Severity: 3
Rank (Obsolete): 16269

 Description   

For a multiple-linked MDT-object, the extended attribute space is limited. As more hard links are made to the MDT-object, more linkEA entries are added to its linkEA until there is not enough space to hold new entries. In that case, only the "nlink" attribute is increased when new hard links are made to the MDT-object.

Namespace LFSCK checks whether the linkEA matches the "nlink" attribute, and may update the "nlink" attribute according to the linkEA if the LFSCK is sure that the linkEA is trustworthy. So how to make the linkEA trustworthy is very important. Generally, multiple-linked MDT-objects are rare, and cases where the linkEA entries exceed the linkEA space limit are rarer still. So during the namespace LFSCK first-stage scanning, all the known name entries are recorded in the linkEA; then, during the second-stage scanning, the LFSCK can update the "nlink" attribute according to the linkEA entry count, which equals the count of known name entries.
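
The two-stage mechanism described above can be sketched as a toy model. This is only an illustration, not LFSCK code: the class LinkedObject and the functions first_stage_scan and second_stage_scan are hypothetical names, and the linkEA is modelled as a set with unlimited space.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedObject:
    """Toy stand-in for a multiple-linked MDT-object (hypothetical model)."""
    nlink: int
    # Each linkEA entry is modelled as a (parent_fid, name) pair.
    link_ea: set = field(default_factory=set)


def first_stage_scan(obj, name_entries):
    """Record every name entry found by the namespace scan in the linkEA."""
    for parent_fid, name in name_entries:
        obj.link_ea.add((parent_fid, name))


def second_stage_scan(obj):
    """Set nlink to the linkEA entry count; return True if nlink changed."""
    expected = len(obj.link_ea)
    if obj.nlink == expected:
        return False
    obj.nlink = expected
    return True


# An object whose nlink (5) disagrees with the three names actually found.
obj = LinkedObject(nlink=5)
first_stage_scan(obj, [("fid-A", "a"), ("fid-B", "b"), ("fid-A", "c")])
repaired = second_stage_scan(obj)
```

The sketch only works because the linkEA here never runs out of space, which is exactly the assumption the corner cases below break.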

Generally, the above mechanism works well. But there are two corner cases to consider:
1) If the target MDT-object is a multiple-linked one, and during the first-stage scanning the namespace LFSCK finds that it cannot record all the known name entries in the linkEA, then it should use some flag to skip the "nlink" verification for this MDT-object in the second-stage scanning. There are two possible ways to do that:
1.1) Record such MDT-object's FID in the namespace LFSCK trace file.
1.2) Add some flag (such as SKIP_NLINK) in the MDT-object's linkEA header.

2) If the multiple-linked MDT-object has enough space to hold all the known name entries when the LFSCK scans them during the first-stage scanning, but its linkEA overflows after that, then the LFSCK must also be told to skip the "nlink" verification for this MDT-object in the second-stage scanning.

Currently, the LFSCK uses 1.1) as the solution. The shortcoming of 1.1) is that the MDD needs to talk with the LFSCK to record the FID when the linkEA overflows; such interaction makes the stack layering unclear. Solution 1.2) seems unsuitable for case 2), because once the "SKIP_NLINK" flag is set in the linkEA header, it cannot be removed. The LFSCK cannot know whether more hard links will be made to the target MDT-object after it scans it, so even if the namespace LFSCK believes that it knows all the name entries for the MDT-object, it may have missed some. So during the second-stage scanning, when it finds the "SKIP_NLINK" flag in the linkEA header, it can neither remove the flag nor update the "nlink" attribute according to the linkEA.
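
A minimal sketch of solution 1.1) covering both corner cases, under simplifying assumptions: the linkEA capacity is counted in entries rather than bytes, the trace file is modelled as an in-memory set of FIDs, and all names (CappedObject, ToyLfsck, note_overflow, mdd_link) are hypothetical rather than taken from the Lustre source.

```python
class CappedObject:
    """Toy MDT-object whose linkEA holds at most `capacity` entries."""

    def __init__(self, fid, nlink, capacity):
        self.fid = fid
        self.nlink = nlink
        self.capacity = capacity
        self.link_ea = []

    def add_link_entry(self, parent_fid, name):
        """Append an entry; return False when the linkEA is full."""
        if len(self.link_ea) >= self.capacity:
            return False
        self.link_ea.append((parent_fid, name))
        return True


class ToyLfsck:
    def __init__(self):
        # Stands in for the namespace LFSCK trace file of solution 1.1).
        self.skip_nlink = set()

    def first_stage(self, obj, name_entries):
        # Case 1: the LFSCK itself cannot record every known name entry.
        for entry in name_entries:
            if entry not in obj.link_ea and not obj.add_link_entry(*entry):
                self.skip_nlink.add(obj.fid)

    def note_overflow(self, fid):
        # Case 2: the MDD reports an overflow that happened after the scan.
        self.skip_nlink.add(fid)

    def second_stage(self, obj):
        """Repair nlink unless the object is marked; return True if checked."""
        if obj.fid in self.skip_nlink:
            return False
        obj.nlink = len(obj.link_ea)
        return True


def mdd_link(obj, parent_fid, name, lfsck):
    """Toy hard-link path: nlink always grows, the linkEA only if it fits."""
    obj.nlink += 1
    if not obj.add_link_entry(parent_fid, name):
        # The direct MDD-to-LFSCK call that this ticket wants to avoid.
        lfsck.note_overflow(obj.fid)


lfsck = ToyLfsck()
obj = CappedObject(fid="fid-1", nlink=2, capacity=2)
lfsck.first_stage(obj, [("dir-A", "x"), ("dir-B", "y")])  # everything fits
mdd_link(obj, "dir-C", "z", lfsck)                        # overflows later
checked = lfsck.second_stage(obj)                         # verification skipped
```

The call from mdd_link into ToyLfsck is the layering problem described above: the MDD has to know about the LFSCK to keep the trace file accurate.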

Please refer to the following link for more discussion:
http://review.whamcloud.com/#/c/11516/21



 Comments   
Comment by Alex Zhuravlev [ 24/Oct/14 ]

A lot of space is consumed by names, I think. So when the LinkEA overflows, we could reformat it to store only the parent FIDs. The path2fid code would then need to scan the directories, of course, but this is supposed to be a very rare case.

Comment by Andreas Dilger [ 25/Oct/14 ]

Storing only the FID would allow about 2x or 3x more entries to be stored in the LinkEA, but won't solve the problem. If there are a lot of links it would be better to enable the large_xattr feature and store all of the link names. ZFS also doesn't have this limitation, so I'd prefer not to change the format for only marginal gains.
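
A rough capacity estimate shows where a 2x to 3x figure could come from. The sizes below are assumptions for illustration only: a 2-byte record length and a 16-byte parent FID per entry, and a 4096-byte attribute with a 24-byte header. The helper max_entries is hypothetical.

```python
RECLEN_BYTES = 2   # assumed per-entry record-length field
FID_BYTES = 16     # assumed size of a parent FID


def max_entries(budget_bytes, name_len, keep_names=True):
    """Number of entries that fit in budget_bytes for a given name length."""
    per_entry = RECLEN_BYTES + FID_BYTES + (name_len if keep_names else 0)
    return budget_bytes // per_entry


budget = 4096 - 24  # assumed attribute size minus assumed header size

for name_len in (18, 36):
    with_names = max_entries(budget, name_len)
    fid_only = max_entries(budget, name_len, keep_names=False)
    print(name_len, with_names, fid_only, round(fid_only / with_names, 2))
```

Under these assumptions the gain depends only on the average name length: names about as long as the fixed per-entry part give roughly 2x, and names twice that length give roughly 3x. Either way the limit is raised, not removed.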

Comment by Alex Zhuravlev [ 28/Oct/14 ]

Another idea we've been discussing in Gerrit is that the LinkEA header might have a timestamp (or transno) specifying the time when MDD/LFSCK couldn't add a name due to lack of space. If the header hasn't been updated during the current LFSCK run, then all the found names are listed in the LinkEA, and it is now safe to drop the no-space flag from the header and check nlink.
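
An illustrative sketch of the timestamp idea in the comment above (not part of the original comment and not Lustre code). The field overflow_time and the functions record_overflow and can_check_nlink are hypothetical names, and times are plain integers.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLinkEA:
    entries: list = field(default_factory=list)
    # Hypothetical header field: 0 means no overflow has been recorded.
    overflow_time: int = 0


def record_overflow(ea, now):
    """Called by MDD or LFSCK when a name could not be added for lack of space."""
    ea.overflow_time = now


def can_check_nlink(ea, lfsck_start_time):
    """Decide whether the second stage may verify nlink against this linkEA."""
    if ea.overflow_time == 0:
        return True
    if ea.overflow_time < lfsck_start_time:
        # No add has failed during this run, so every name found by the
        # scan is in the linkEA; the stale no-space mark can be dropped.
        ea.overflow_time = 0
        return True
    # An overflow happened during this run; the linkEA may be incomplete.
    return False


old = ToyLinkEA()
record_overflow(old, now=100)                       # overflow before this run
ok_old = can_check_nlink(old, lfsck_start_time=200)

recent = ToyLinkEA()
record_overflow(recent, now=250)                    # overflow during this run
ok_recent = can_check_nlink(recent, lfsck_start_time=200)
```

With this scheme the MDD only writes to the linkEA header it already owns, so no direct call into the LFSCK is needed, and the mark can be cleared again, which a permanent SKIP_NLINK flag cannot.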

Comment by nasf (Inactive) [ 01/Nov/16 ]

The issue will be resolved via the patch:
http://review.whamcloud.com/#/c/23500/
