Details
Technical task
Resolution: Fixed
Critical
Lustre 2.7.0
7020
Description
Record linkEA verification history in RAM
To know which linkEA entries on object_A have been verified, LFSCK must pin object_A in RAM and record the verification history of its linkEA entries. To avoid exhausting available memory, not all objects are pinned in RAM. LFSCK permanently pins an object in RAM only when its first link has been verified (V == 1, where 'V' is the number of verified links) and either the number of hard links 'N' or the number of linkEA entries 'L' is more than one: (V == 1) && (N > 1 || L > 1). Consider the following cases:
L > 1 || N > 1
Without an in-RAM verification history, LFSCK has to treat every linkEA entry as unverified. In this case the object must therefore stay pinned in RAM until all of its linkEA entries are verified.
L == 1 && N == 1
Typically, this is a singly-linked object. If LFSCK finds a directory entry pointing to object_A that matches the unique linkEA entry, processing is complete. Otherwise, if a name entry pointing to object_A does not match the unique linkEA entry, a new linkEA entry is added and 'L' increases. 'N' does not increase, and the object now falls under case 1 (L > 1 || N > 1). object_A and its linkEA verification history are then pinned in RAM.
It is possible that the first name entry found matches the unique linkEA entry, so that L == V == N == 1. In that case LFSCK neither records the object in the lfsck_namespace file nor pins it in RAM. As the scan continues, however, more name entries pointing to the same object may be found. Once the corresponding new linkEA entries are added, the object is pinned in RAM, recorded in the lfsck_namespace file, and double scanned later. On a large system this kind of "upgrading" is very rare, so we prefer to double scan these objects rather than pin a large number of objects in RAM unnecessarily.
L == 0
This is usually an IGIF object. Once new linkEA entries are added, it becomes case 2 (L == 1 && N == 1) or case 1 (L > 1 || N > 1).
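The three cases and the pin condition can be sketched as a small Python model. This is only an illustration, not the LFSCK implementation: `LinkState`, `classify` and `should_pin` are hypothetical names, the counters mirror N, L and V from the description, and checking case 1 before L == 0 is an assumption for states where the cases overlap. The sketch evaluates the stated condition literally and does not model the later "upgrading" path.

```python
from dataclasses import dataclass


@dataclass
class LinkState:
    """Hypothetical per-object counters, named after the description."""
    n_links: int     # N: number of hard links on the object
    l_entries: int   # L: number of linkEA entries on the object
    v_verified: int  # V: number of links verified so far


def classify(state: LinkState) -> int:
    """Return the case number (1, 2 or 3) from the description.

    Assumption: case 1 is tested first, so an object with L == 0 but
    N > 1 is reported as case 1.
    """
    if state.l_entries > 1 or state.n_links > 1:
        return 1  # multiply-linked: verification history is needed
    if state.l_entries == 0:
        return 3  # no linkEA yet, usually an IGIF object
    return 2      # L == 1 and N == 1: typically singly-linked


def should_pin(state: LinkState) -> bool:
    """Evaluate (V == 1) && (N > 1 || L > 1) literally."""
    return state.v_verified == 1 and (
        state.n_links > 1 or state.l_entries > 1
    )


if __name__ == "__main__":
    # A singly-linked object whose only linkEA entry matched: not pinned.
    print(should_pin(LinkState(n_links=1, l_entries=1, v_verified=1)))
    # A mismatching name entry added a second linkEA entry: pinned.
    print(should_pin(LinkState(n_links=1, l_entries=2, v_verified=1)))
```

Keeping the classification separate from the pin decision makes it easy to see that case 2 objects never satisfy the pin condition until a new linkEA entry moves them into case 1.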
Pinning too many objects in RAM may cause memory pressure on the server. To avoid exhausting memory, LFSCK needs to unpin objects from RAM. An object is unpinned under the following conditions:
L == V
All the known linkEA entries on the object are valid, although other directory entries pointing to the object may still be found as the LFSCK scan continues. It is unnecessary to keep maintaining the linkEA verification history in RAM. Instead, LFSCK sets an on-disk VERIFIED flag for the object in the lfsck_namespace file. If more directory entries pointing to the object are found, LFSCK detects this flag and simply adds new linkEA entries without maintaining the verification history.
Memory pressure
All the objects with L == V have been unpinned from RAM, but there is still memory pressure. LFSCK then unpins some half-verified objects from RAM. Because these objects were recorded in the lfsck_namespace file when they were pinned, any invalid linkEA entries on these unpinned, half-processed objects can be handled during the double scan.
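The two unpin conditions can be sketched as a two-step pass. This is a minimal model under stated assumptions, not LFSCK code: `PinnedObject` and `unpin_pass` are hypothetical names, the on-disk VERIFIED flag is modelled as a Python set, memory pressure is approximated by a `max_pinned` count, and the choice of which half-verified object to unpin first is arbitrary.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class PinnedObject:
    """Hypothetical in-RAM record for one pinned object."""
    l_entries: int   # L: known linkEA entries
    v_verified: int  # V: linkEA entries verified so far


def unpin_pass(pinned: Dict[str, PinnedObject],
               verified_on_disk: Set[str],
               max_pinned: int) -> List[str]:
    """Unpin objects in place and return their ids in unpin order."""
    released: List[str] = []

    # Condition "L == V": every known linkEA entry is verified, so the
    # history is dropped and only a VERIFIED flag is kept (modelled here
    # as membership in verified_on_disk).
    done = [fid for fid, obj in pinned.items()
            if obj.l_entries == obj.v_verified]
    for fid in done:
        verified_on_disk.add(fid)
        del pinned[fid]
        released.append(fid)

    # Condition "memory pressure": still too many pinned objects, so
    # half-verified ones are dropped as well. They were recorded when
    # they were pinned, so the double scan can revisit them later.
    while len(pinned) > max_pinned:
        fid = next(iter(pinned))  # assumption: oldest insertion first
        del pinned[fid]
        released.append(fid)

    return released


if __name__ == "__main__":
    pinned = {
        "a": PinnedObject(l_entries=2, v_verified=2),
        "b": PinnedObject(l_entries=3, v_verified=1),
        "c": PinnedObject(l_entries=2, v_verified=1),
    }
    flags: Set[str] = set()
    print(unpin_pass(pinned, flags, max_pinned=1), sorted(flags))
```

Only fully verified objects receive the flag in this sketch. Half-verified objects unpinned under pressure are left unflagged, so that a later double scan still treats their linkEA entries as possibly invalid.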
Issue Links
- is related to LU-6321 Clean downgrade from 2.7.0 to 2.6.0 failed: fail to init namespace LFSCK component: rc = -5 (Resolved)