Details
Technical task
Resolution: Unresolved
Minor
Description
In the current PCC implementation, once a file is attached to PCC on one client, any access from another client, even a read that does not modify data, triggers a data restore.
We currently use the layout generation to determine whether the PCC copy on the client is still valid. However, a data restore increases the layout generation, so a read operation from a remote client invalidates the PCC copy.
This could be optimized not only for PCC but also for HSM:
- When archiving a file, the MDT stores the data version of the file in the HSM attrs;
- When archiving a file again, if the data version of the file in Lustre is the same as the version stored in the HSM attrs and the archive ID is also the same, we can return immediately because the HSM archive is already up to date. HS_DIRTY may also help make this determination, but it lags a little.
- Upon data restore of an HSM-archived or PCC-cached file, we could reset the data version in the HSM attrs to the data version of the file just restored onto the Lustre OSTs.
- So when re-attaching a file that was previously attached to PCC on the client, we could check whether the data version stored in the HSM attrs on the MDT is the same as the data version of the file in Lustre. If they are consistent, we don't need to do the data copy.
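The steps above can be illustrated with a small toy model. This is only a sketch of the proposed logic, not Lustre code: `HsmAttrs`, `LustreFile`, `archive`, `restore` and `reattach_needs_copy` are hypothetical names, and the model assumes that a restore gives the file a new data version (which is why the value in the HSM attrs would need to be reset).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HsmAttrs:
    """Hypothetical stand-in for the HSM attrs kept on the MDT."""
    archive_id: int
    data_version: int


@dataclass
class LustreFile:
    """Toy model of a file: data version, layout generation, HSM attrs."""
    data_version: int = 1
    layout_gen: int = 1
    hsm: Optional[HsmAttrs] = None

    def write(self) -> None:
        # A data modification changes the data version.
        self.data_version += 1


def archive_is_current(f: LustreFile, archive_id: int) -> bool:
    """True if the HSM attrs record the same archive ID and data version."""
    return (f.hsm is not None
            and f.hsm.archive_id == archive_id
            and f.hsm.data_version == f.data_version)


def archive(f: LustreFile, archive_id: int) -> bool:
    """Archive the file; return True if data was copied, False if skipped."""
    if archive_is_current(f, archive_id):
        return False  # archive already holds the latest version
    f.hsm = HsmAttrs(archive_id, f.data_version)
    return True


def restore(f: LustreFile) -> None:
    """Restore data to the OSTs, e.g. after a read from a remote client."""
    f.layout_gen += 1    # restore increases the layout generation
    f.data_version += 1  # assumption: restored objects get a new version
    if f.hsm is not None:
        # Proposed step: reset the stored version to the restored file's.
        f.hsm.data_version = f.data_version


def reattach_needs_copy(f: LustreFile, archive_id: int) -> bool:
    """Proposed re-attach check based on data version, not layout generation."""
    return not archive_is_current(f, archive_id)


f = LustreFile()
archive(f, archive_id=2)       # first archive (PCC attach) copies the data
gen_at_attach = f.layout_gen
restore(f)                     # remote read triggers a restore
layout_check_valid = (f.layout_gen == gen_at_attach)
version_check_needs_copy = reattach_needs_copy(f, archive_id=2)
```

In this model the layout generation check reports the PCC copy as invalid after the restore, while the data version check shows that no data copy is needed until the file is actually modified.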