[LU-11379] HSM Copytool Performance Improvements Created: 14/Sep/18 Updated: 14/Jul/21 Resolved: 14/Jul/21 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HSM |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
| Comments |
| Comment by Andreas Dilger [ 14/Sep/18 ] |
Presumably the "0001" is based on OID % 0xffff? Is there a desire to have some temporal locality with objects in the archive, rather than all 65k directories being used continually? One drawback of using the same directories forever is that they get relatively large and fragmented in the hash space, and are updated totally randomly on disk. For MDS->OST object allocation, I've thought about using something like SEQ>>8/OID>>16 or just stick with SEQ/d(OID % 32) but limit OIDs-per-SEQ to 1M or so, which will put concurrently allocated objects relatively close together, but slowly move into new upper-level directories over time. The premise is that concurrently allocated objects are more likely to also be accessed and deleted together, so we can slowly drop those older directories from RAM, and eventually shrink them down as they become empty. Having a single huge directory with all ages of files means the whole directory needs to live in RAM, or be IOPS bound during modification since each insert/delete/lookup will modify a different leaf block.
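The directory-mapping alternatives discussed above can be compared with a minimal sketch. The function names, the hex formatting, and the example SEQ value are assumptions made for illustration; this is not the copytool's or the OST's actual code, and the formulas are taken literally from the comment.

```python
def dir_oid_mod(oid: int) -> str:
    """Bucket by 'OID % 0xffff', read literally from the comment.

    Consecutive OIDs land in different buckets, and every bucket
    stays in use for the lifetime of the archive.
    """
    return f"{oid % 0xffff:04x}"


def dir_seq_oid_shift(seq: int, oid: int) -> str:
    """'SEQ>>8/OID>>16': objects allocated close together in time share
    a directory, and new directories appear as SEQ and OID grow."""
    return f"{seq >> 8:x}/{oid >> 16:x}"


def dir_seq_oid_mod32(seq: int, oid: int) -> str:
    """'SEQ/d(OID % 32)': one upper-level directory per SEQ with 32
    subdirectories (assumes OIDs-per-SEQ are capped elsewhere)."""
    return f"{seq:x}/d{oid % 32}"


if __name__ == "__main__":
    seq = 0x200000401  # hypothetical sequence number
    for oid in (1, 2, 3):
        print(dir_oid_mod(oid),
              dir_seq_oid_shift(seq, oid),
              dir_seq_oid_mod32(seq, oid))
```

In this sketch the shift-based scheme keeps OIDs 1, 2, and 3 of one SEQ in the same directory, while the modulo scheme spreads them over three buckets that never retire.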
One reason for storing the LOV EA in the archive is for disaster recovery/rehydrate. If we instead make full backups of the MDT(s) and/or subtrees, then we don't need the layouts or other xattrs in the archive. Conversely, storing xattrs (more than just lov) with the files makes them more "self contained" and usable even if the MDT backup is unavailable. |
| Comment by Andreas Dilger [ 14/Sep/18 ] |
|
PS - you are unlikely to get even OID distribution across [0x0000-0xffff]. It is much more likely to get low-numbered OIDs, since clients always start with a new SEQ after mount and will allocate an OID=1 file, but won't necessarily allocate OID=65535 or OID=131071 ever before restart. That will put more pressure on the low-numbered directories and may cause problems in some archives. |
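The skew described above can be illustrated with a small simulation. It is only a sketch under stated assumptions: each client mount gets a fresh SEQ and allocates OIDs sequentially from 1, the bucket is `OID % 0xffff`, and the per-mount file counts are invented for the example.

```python
from collections import Counter


def bucket_counts(files_per_mount: list[int]) -> Counter:
    """Count how many objects fall into each OID-based bucket.

    Each entry in files_per_mount is the number of files one client
    mount creates before restarting (hypothetical workload).
    """
    counts: Counter = Counter()
    for n in files_per_mount:
        # A new SEQ after mount means OIDs start again at 1.
        for oid in range(1, n + 1):
            counts[oid % 0xffff] += 1
    return counts


if __name__ == "__main__":
    counts = bucket_counts([10, 500, 3000, 3])
    for bucket in (1, 5, 400, 2000, 60000):
        print(f"{bucket:04x}: {counts[bucket]}")
```

With these invented counts, bucket 0001 receives an object from every mount, while high-numbered buckets receive none.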
| Comment by John Hammond [ 17/Sep/18 ] |
|
> One reason for storing the LOV EA in the archive is for disaster recovery/rehydrate. If we instead make full backups of the MDT(s) and/or subtrees, then we don't need the layouts or other xattrs in the archive. Conversely, storing xattrs (more than just lov) with the files makes them more "self contained" and usable even if the MDT backup is unavailable.
HSM (alone) is not a disaster recovery solution. It's a storage tiering mechanism. If LOV is important then either keep full MDT images or store copies of LOV in the policy tool DB. |
| Comment by John Hammond [ 14/Jul/21 ] |
|
A partial patch for this was pushed for review under
Partial patch. Signed-off-by: John L. Hammond <jhammond@whamcloud.com> |
| Comment by John Hammond [ 14/Jul/21 ] |
|
Archive flattening is in progress under LU-14359. https://review.whamcloud.com/41312 LU-14359 hsm: support a flatter HSM archive format |
| Comment by John Hammond [ 14/Jul/21 ] |
|
Partially fixed. Partially abandoned. |