Lustre / LU-11379

HSM Copytool Performance Improvements

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor

    Description

      1. Flatten the archive hierarchy to a single directory level: /arc1/0001/0000/0401/0000/0002/0000/0x200000401:0x1:0x0 becomes /arc1/0001/0x200000401:0x1:0x0.
      2. Remove/deprecate shadow tree handling.
      3. Stop storing and loading _lov files. For restore we already get file attributes and striping from the MDT (see ct_md_getattr()), but we only use the stat portion of the lmd.
      4. Improve thread handling.
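The path mapping in item 1 can be sketched as follows. This is a minimal model of the six-level ct_path_archive() layout in lhsmtool_posix.c, in which each directory component is a 16-bit slice of the FID's OID and SEQ; path_archive_deep() and path_archive_flat() are illustrative names for this sketch, not the actual copytool functions:

```c
#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for struct lu_fid: sequence, object ID, version. */
struct fid {
        uint64_t seq;
        uint32_t oid;
        uint32_t ver;
};

/* Old six-level layout, modeled on ct_path_archive() in lhsmtool_posix.c:
 * each path component is a 16-bit slice of the OID and SEQ, so every FID
 * gets its own deep directory chain. */
static int path_archive_deep(char *buf, size_t sz, const char *root,
                             const struct fid *f)
{
        return snprintf(buf, sz,
                        "%s/%04x/%04x/%04x/%04x/%04x/%04x/0x%jx:0x%x:0x%x",
                        root,
                        (unsigned)(f->oid & 0xffff),
                        (unsigned)((f->oid >> 16) & 0xffff),
                        (unsigned)(f->seq & 0xffff),
                        (unsigned)((f->seq >> 16) & 0xffff),
                        (unsigned)((f->seq >> 32) & 0xffff),
                        (unsigned)((f->seq >> 48) & 0xffff),
                        (uintmax_t)f->seq, f->oid, f->ver);
}

/* Proposed flat layout: keep only the first (OID & 0xffff) shard. */
static int path_archive_flat(char *buf, size_t sz, const char *root,
                             const struct fid *f)
{
        return snprintf(buf, sz, "%s/%04x/0x%jx:0x%x:0x%x",
                        root, (unsigned)(f->oid & 0xffff),
                        (uintmax_t)f->seq, f->oid, f->ver);
}
```

For the FID 0x200000401:0x1:0x0 from the description, path_archive_deep() yields /arc1/0001/0000/0401/0000/0002/0000/0x200000401:0x1:0x0 and path_archive_flat() yields /arc1/0001/0x200000401:0x1:0x0.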


          Activity


            jhammond John Hammond added a comment -

            Partially fixed. Partially abandoned.

            jhammond John Hammond added a comment -

            Archive flattening is in progress under LU-14359.

            https://review.whamcloud.com/41312 LU-14359 hsm: support a flatter HSM archive format
            https://review.whamcloud.com/41366 LU-14359 hsm: support shadow tree in archive upgrade

            jhammond John Hammond added a comment -

            A partial patch for this was pushed for review under LU-11380. See https://review.whamcloud.com/#/c/33215/.

            LU-11380 hsm: streamline copytool restore handling

            Partial patch.

            Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
            Change-Id: I7435766f6f67ba60ac39bf02bc3232316a830387
            jhammond John Hammond added a comment -

            > One reason for storing the LOV EA in the archive is for disaster recovery/rehydrate. If we instead make full backups of the MDT(s) and/or subtrees, then we don't need the layouts or other xattrs in the archive. Conversely, storing xattrs (more than just lov) with the files makes them more "self contained" and usable even if the MDT backup is unavailable.

            HSM (alone) is not a disaster recovery solution. It is a storage tiering mechanism. If the LOV is important, then either keep full MDT images or store copies of the LOV in the policy tool DB.

            adilger Andreas Dilger added a comment -

            PS - you are unlikely to get an even OID distribution across [0x0000-0xffff]. It is much more likely to get low-numbered OIDs, since clients always start with a new SEQ after mount and will allocate an OID=1 file, but won't necessarily allocate OID=65535 or OID=131071 before restart. That will put more pressure on the low-numbered directories and may cause problems in some archives.
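The skew described in this comment can be made concrete with a toy count (illustrative only, not Lustre code): if each remount starts a fresh SEQ and restarts OID allocation at 1, the low (OID & 0xffff) shards accumulate entries from every mount session, while high shards are touched only by long-lived sessions.

```c
#include <stdint.h>

#define NSHARDS 65536

/* Toy model: each client "mount session" gets a fresh SEQ and allocates
 * OIDs 1..n before restarting.  Count how many archived files land in
 * each (OID & 0xffff) shard of a flat archive. */
static void count_shards(const unsigned *alloc_per_session, int nsessions,
                         unsigned counts[NSHARDS])
{
        for (int s = 0; s < nsessions; s++)
                for (uint32_t oid = 1; oid <= alloc_per_session[s]; oid++)
                        counts[oid & 0xffff]++;
}
```

With sessions allocating 100, 5000, and 60000 objects, shard 1 is hit by all three sessions, shard 3000 by two, and shard 50000 by only one: exactly the pressure imbalance the comment predicts.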

            adilger Andreas Dilger added a comment -

            > 1. Flatten archive hierarchy to 1 directory. /arc1/0001/0000/0401/0000/0002/0000/0x200000401:0x1:0x0 becomes /arc1/0001/0x200000401:0x1:0x0.

            Presumably the "0001" is based on OID % 0xffff? Is there a desire to have some temporal locality with objects in the archive, rather than all 65k directories being used continually?

            One drawback of using the same directories forever is that they get relatively large and fragmented in the hash space, and are updated totally randomly on disk. For MDS->OST object allocation, I've thought about using something like SEQ>>8/OID>>16, or just sticking with SEQ/d(OID % 32) but limiting OIDs-per-SEQ to 1M or so, which would put concurrently allocated objects relatively close together while slowly moving into new upper-level directories over time. The premise is that concurrently allocated objects are more likely to also be accessed and deleted together, so we can slowly drop those older directories from RAM, and eventually shrink them down as they become empty. Having a single huge directory with all ages of files means the whole directory needs to live in RAM, or be IOPS-bound during modification, since each insert/delete/lookup will modify a different leaf block.

            > 3. Stop storing and loading _lov files. For restore, we are already getting file attributes+striping from the MDT (see ct_md_getattr()). But we only use the stat portion of the lmd.

            One reason for storing the LOV EA in the archive is for disaster recovery/rehydrate. If we instead make full backups of the MDT(s) and/or subtrees, then we don't need the layouts or other xattrs in the archive. Conversely, storing xattrs (more than just lov) with the files makes them more "self contained" and usable even if the MDT backup is unavailable.
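The temporal-bucketing idea in this comment (SEQ>>8/OID>>16) could be sketched as below. This is a hypothetical layout, not an implemented scheme, and path_archive_temporal() is an invented name: objects allocated around the same time have nearby SEQs and OIDs, so they share a bucket directory, and new buckets open as allocation moves on, letting old directories go cold.

```c
#include <stdio.h>
#include <stdint.h>

struct fid {
        uint64_t seq;
        uint32_t oid;
        uint32_t ver;
};

/* Hypothetical temporal bucketing: group by (SEQ >> 8) then (OID >> 16),
 * so concurrently allocated objects land in the same directory while the
 * active bucket slowly rolls forward over time. */
static int path_archive_temporal(char *buf, size_t sz, const char *root,
                                 const struct fid *f)
{
        return snprintf(buf, sz, "%s/%jx/%x/0x%jx:0x%x:0x%x",
                        root,
                        (uintmax_t)(f->seq >> 8), f->oid >> 16,
                        (uintmax_t)f->seq, f->oid, f->ver);
}
```

Compared with the single flat shard, each insert/delete touches only the small, recently active bucket, addressing the "whole directory must live in RAM" concern raised above.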

            People

              wc-triage WC Triage
              jhammond John Hammond
              Votes: 0
              Watchers: 3
