1. Flatten archive hierarchy to 1 directory. /arc1/0001/0000/0401/0000/0002/0000/0x200000401:0x1:0x0 becomes /arc1/0001/0x200000401:0x1:0x0.
Presumably the "0001" component is derived from the OID, e.g. OID & 0xffff? Is there any desire for temporal locality among objects in the archive, rather than having all 65k directories in use continually?
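For concreteness, here is a minimal sketch of the flattened mapping in item 1. It assumes the single directory component is the low 16 bits of the OID printed as four hex digits; that is only a guess consistent with the one example path, and `flat_archive_path` is a hypothetical helper, not a copytool function.

```python
def flat_archive_path(root: str, seq: int, oid: int, ver: int) -> str:
    """Build <root>/<dir>/<FID> for a flattened, one-level archive layout.

    Assumption: <dir> is (OID & 0xffff) as 4 hex digits. The real copytool
    may derive it differently.
    """
    fid = f"{seq:#x}:{oid:#x}:{ver:#x}"   # e.g. 0x200000401:0x1:0x0
    subdir = f"{oid & 0xffff:04x}"        # one of 65536 possible directories
    return f"{root}/{subdir}/{fid}"


if __name__ == "__main__":
    print(flat_archive_path("/arc1", 0x200000401, 0x1, 0x0))
```

Under this assumption, OIDs that differ by a multiple of 0x10000 land in the same directory regardless of when they were allocated, which is the locality concern raised here.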
One drawback of using the same directories forever is that they become relatively large and fragmented in the hash space, and are updated completely randomly on disk. For MDS->OST object allocation, I've thought about using something like SEQ>>8/OID>>16, or just sticking with SEQ/d(OID % 32) but limiting OIDs-per-SEQ to 1M or so. Either would put concurrently allocated objects relatively close together, while slowly moving into new upper-level directories over time. The premise is that concurrently allocated objects are more likely to be accessed and deleted together, so the older directories can gradually be dropped from RAM and eventually shrink as they become empty. With a single huge directory holding files of all ages, the whole directory needs to live in RAM, or modifications become IOPS-bound, since each insert/delete/lookup will touch a different leaf block.
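A rough sketch of how these layouts spread a burst of sequentially allocated OIDs, assuming one literal reading of the "SEQ>>8/OID>>16" and "SEQ/d(OID % 32)" notation. All function names are hypothetical and the hex formatting is arbitrary; the point is only how many distinct directories a burst touches.

```python
def shifted_dir(seq: int, oid: int) -> str:
    """'SEQ>>8/OID>>16' reading: a new lower directory every 64k OIDs."""
    return f"{seq >> 8:x}/{oid >> 16:x}"


def seq_bucket_dir(seq: int, oid: int, buckets: int = 32) -> str:
    """'SEQ/d(OID % 32)' reading: 32 buckets under each SEQ directory."""
    return f"{seq:x}/d{oid % buckets}"


def hashed_dir(seq: int, oid: int) -> str:
    """Age-independent layout for comparison: low 16 bits of the OID."""
    return f"{oid & 0xffff:04x}"


def dirs_touched(layout, seq: int, oids) -> int:
    """Count the distinct directories used by a set of OIDs in one SEQ."""
    return len({layout(seq, oid) for oid in oids})


if __name__ == "__main__":
    seq = 0x200000401
    burst = range(1000)  # 1000 objects allocated back to back
    for layout in (shifted_dir, seq_bucket_dir, hashed_dir):
        print(layout.__name__, dirs_touched(layout, seq, burst))
```

In this sketch the burst touches 1 directory with the shifted layout, 32 with the per-SEQ buckets, and 1000 with the OID-hashed layout. The first two also stop receiving new entries once allocation moves on to a later OID range or SEQ, which is what would let old directories age out of RAM.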
3. Stop storing and loading _lov files. For restore, we are already getting file attributes+striping from the MDT (see ct_md_getattr()). But we only use the stat portion of the lmd.
One reason for storing the LOV EA in the archive is disaster recovery/rehydration. If we instead make full backups of the MDT(s) and/or subtrees, then we don't need the layouts or other xattrs in the archive. Conversely, storing xattrs (more than just the LOV EA) with the files makes them more "self contained" and usable even if the MDT backup is unavailable.
Partially fixed. Partially abandoned.