PCC Phase 2 (LU-12714)

[LU-13032] Add lctl cleanup|uncache|revalidate commands for PCC Created: 29/Nov/19  Updated: 29/Nov/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Technical task Priority: Minor
Reporter: Qian Yingjin Assignee: Qian Yingjin
Resolution: Unresolved Votes: 0
Labels: None

Rank (Obsolete): 9223372036854775807

 Description   

If there is a "kept" file on PCC, but the file was modified in Lustre after its data was restored, then we need to ensure that the stale PCC copy is removed from the cache.

Usually a daemon runs on the PCC client that monitors the space usage of the PCC device, scans the device, and takes action accordingly; it can be used to remove this kind of stale PCC copy.
We could add some lctl pcc commands or llapi interfaces as follows:

  1. lctl pcc clean $MNTPT $PCCPATH
    This command removes stale, invalid PCC copies from PCC to free up space.
  2. lctl pcc uncache $MNTPT $PCCPATH
    This command restores all data back to the Lustre OSTs and then removes the PCC copies. It is similar to lctl pcc del, but does not delete the PCC backend from the client.
  3. lctl pcc revalidate $MNTPT $PCCPATH
    This command tries to attach the PCC copies again if they are still valid.
    First, if the layout generation is consistent, we can attach the copy directly;
    Otherwise, compare the data version stored in the HSM attrs with the data version of the file in Lustre; if they are the same, we can also revalidate the PCC cache.
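
The two-step check proposed for revalidate can be sketched as follows. This is only an illustration of the decision order described above, not the actual implementation: the names PccCopy, LustreFile, Decision and revalidate are hypothetical, and the sketch assumes that the layout generation and data version of both sides have already been read by some other means.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ATTACH = "attach"  # PCC copy is still valid and can be attached again
    STALE = "stale"    # PCC copy is out of date; a candidate for cleanup


@dataclass(frozen=True)
class PccCopy:
    """Hypothetical view of what is recorded for a PCC copy."""
    layout_gen: int        # layout generation seen when the copy was attached
    hsm_data_version: int  # data version stored in the HSM attrs


@dataclass(frozen=True)
class LustreFile:
    """Hypothetical view of the current state of the file in Lustre."""
    layout_gen: int
    data_version: int


def revalidate(copy: PccCopy, lustre: LustreFile) -> Decision:
    # Step 1: a consistent layout generation means the copy can be
    # attached directly.
    if copy.layout_gen == lustre.layout_gen:
        return Decision.ATTACH
    # Step 2: the layout generation differs, so fall back to comparing the
    # data version in the HSM attrs with the one of the file in Lustre.
    if copy.hsm_data_version == lustre.data_version:
        return Decision.ATTACH
    # Neither check passed: treat the copy as stale.
    return Decision.STALE
```

In this sketch, a copy for which revalidate returns Decision.STALE is the kind of copy that the proposed clean command would remove.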


 Comments   
Comment by Andreas Dilger [ 29/Nov/19 ]

Wasn't there already a patch in the PCC branch that did this, syncing the PCC cache with Lustre at unmount time? I think that adding commands to do this manually might help, but it is more important that this be handled as automatically as possible. Options include periodically resyncing idle PCC files to Lustre, so that they can be released from PCC quickly if needed and to reduce the delay at unmount time.

Comment by Qian Yingjin [ 29/Nov/19 ]

No. We do not have such a patch in the PCC branch to sync the PCC cache with Lustre at unmount time.
As the commands above are all implemented in user space, it is somewhat complex to sync at unmount time in the kernel.
I agree that we should add an option to determine whether to sync the PCC cache at unmount time, to better support disconnected operation with WBC on PCC, i.e. for a mobile device which may be taken offline manually.

Comment by Andreas Dilger [ 29/Nov/19 ]

I was thinking about patch https://review.whamcloud.com/35230 "LU-12373 pcc: uncache the pcc copies when remove a PCC backend" to prevent the PCC cache filesystem from holding dirty files.

I think we are still a long way away from disconnected client operations like CODA/Intermezzo. I'm not against that at some point in the future (I actually worked on Intermezzo to have disconnected clients at the same time I first worked on Lustre) but we have to have our cache file management/resync much better than it is today before this would be practical to deploy.
