Metadata writeback cache support (LU-10938)

[LU-13044] WBC3: remove the whole subtree on MDT already deleted in the client WBC cache Created: 04/Dec/19  Updated: 10/Jan/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Technical task Priority: Minor
Reporter: Qian Yingjin Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

 Description   

In LU-13021, we designed three flush modes for WBC.

In the WBC_FLUSH_AGE_LOCK_HOLD flush mode, we can also optimize unlink() by performing it through the background flush path, ->write_inode(). The design could be as follows.
When unlinking 'f' under the directory 'dir' (both 'f' and 'dir' are protected by a root WBC EX lock):

  • Each directory dentry keeps a list L of its child files or directories that have already been flushed to the MDT but were later removed from the WBC cache (MemFS);
  • If 'f' has not been flushed to the MDT (it is not in the Sync(S) state), remove it directly from the cache (currently MemFS);
  • Otherwise, add an item containing the name of the file being unlinked to the list L of 'dir', remove the file from the cache, and then mark the 'dir' inode as dirty so that it will be flushed later;
  • When the Linux kernel flushes an inode via ->write_inode() and finds that the directory 'dir' has child files or directories that are already synced to the MDT but unlinked locally in the cache, it must unlink these files or directories on the MDT;
  • When the MDT receives an unlink request and finds that the target is a non-empty directory (nlink > 0?), it must remove the whole subtree. This can be done asynchronously:
    • move the directory into lost+found?, and then reply to the client;
    • launch a daemon thread on the MDT dedicated to unlinking such directories under lost+found.
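
The steps above can be sketched as a small user-space simulation. This is only an illustration under stated assumptions, not Lustre code: ToyMDT, WbcDir, pending_unlinks and the dict-based namespace are all hypothetical names, a queue stands in for lost+found, and a Python thread stands in for the MDT daemon thread.

```python
import queue
import threading


class ToyMDT:
    """Hypothetical stand-in for the MDT: directories are dicts, files are None."""

    def __init__(self):
        self.root = {}
        self.lost_found = queue.Queue()  # detached subtrees awaiting removal
        self.removed = 0                 # entries destroyed by the background thread
        threading.Thread(target=self._reaper, daemon=True).start()

    def unlink(self, parent, name):
        """Detach the entry and return at once; a non-empty directory is
        handed to the background thread instead of being walked here."""
        node = parent.pop(name)
        if isinstance(node, dict) and node:
            self.lost_found.put(node)

    def _reaper(self):
        while True:
            subtree = self.lost_found.get()
            self.removed += self._destroy(subtree)
            self.lost_found.task_done()

    def _destroy(self, subtree):
        """Remove a detached subtree, emptying each directory before counting it."""
        count = 0
        for name in list(subtree):
            child = subtree.pop(name)
            if isinstance(child, dict):
                count += self._destroy(child)
            count += 1
        return count


class WbcDir:
    """Hypothetical client-side directory in the WBC cache (MemFS)."""

    def __init__(self, mdt_dir):
        self.mdt_dir = mdt_dir      # this directory's counterpart on the toy MDT
        self.children = {}          # name -> True if already flushed (Sync state)
        self.pending_unlinks = []   # the list L from the description
        self.dirty = False

    def unlink(self, name):
        synced = self.children.pop(name)
        if synced:                  # already on the MDT: defer the remote unlink
            self.pending_unlinks.append(name)
            self.dirty = True

    def write_inode(self, mdt):
        """Background flush: replay the deferred unlinks on the MDT."""
        for name in self.pending_unlinks:
            mdt.unlink(self.mdt_dir, name)
        self.pending_unlinks.clear()
        self.dirty = False


mdt = ToyMDT()
mdt.root["dir"] = {"f": None, "sub": {"a": None, "d": {"b": None}}}
d = WbcDir(mdt.root["dir"])
d.children = {"f": True, "sub": True, "tmp": False}  # 'tmp' was never flushed
for name in ("f", "sub", "tmp"):
    d.unlink(name)
d.write_inode(mdt)
mdt.lost_found.join()  # wait for the background removal to finish
```

In this sketch 'tmp' never reaches the MDT, 'f' costs one unlink, and 'sub' costs one unlink on the client side while its contents are reclaimed in the background.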


 Comments   
Comment by Andreas Dilger [ 04/Dec/19 ]

Another option would be to send one or more MDS_RMFID RPCs with the FIDs of the files in the tree, starting at the bottom. One thing to be careful of is that RMFID deletes all links to a file, so we would need a slight modification to allow removing only some of the links.
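
To make the hard-link concern concrete, here is a toy model of the difference between dropping every link to a FID and dropping only the links under one parent. It is a sketch under assumptions: ToyNamespace, rmfid_all and rmfid_under are hypothetical names and do not reflect the actual MDS_RMFID interface.

```python
class ToyNamespace:
    """Hypothetical model: each FID maps to its set of (parent, name) links."""

    def __init__(self):
        self.links = {}

    def link(self, fid, parent, name):
        self.links.setdefault(fid, set()).add((parent, name))

    def rmfid_all(self, fid):
        """Behaviour described in the comment: drop every link to the file."""
        self.links.pop(fid, None)

    def rmfid_under(self, fid, parent):
        """Assumed variant: drop only the links that live in `parent`."""
        remaining = {l for l in self.links.get(fid, set()) if l[0] != parent}
        if remaining:
            self.links[fid] = remaining
        else:
            self.links.pop(fid, None)  # last link gone, object can be destroyed


ns = ToyNamespace()
ns.link("fid-1", "batch/dir1", "a")
ns.link("fid-1", "other", "hardlink-to-a")
ns.rmfid_under("fid-1", "batch/dir1")  # the link outside the subtree survives
```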

Comment by Qian Yingjin [ 04/Dec/19 ]

Could MDS_RMFID remove a whole subtree on MDT?

For example, when flushing a dirty directory /mnt/lustre/wbc/batch:

/mnt/lustre/wbc/batch/dir1/a

/mnt/lustre/wbc/batch/dir1/b

/mnt/lustre/wbc/batch/dir1/c

/mnt/lustre/wbc/batch/dir2/a

/mnt/lustre/wbc/batch/dir1/dir3/dir4/bb

...

/mnt/lustre/wbc/batch/f1

/mnt/lustre/wbc/batch/f3

These files have all been removed from the client WBC cache, but were already flushed to the MDT.

At this point, the client only needs to send unlink requests for the direct children (files or directories) of the flushed directory:

/mnt/lustre/wbc/batch/dir1

/mnt/lustre/wbc/batch/dir2

/mnt/lustre/wbc/batch/f1

/mnt/lustre/wbc/batch/f3

For the best optimization, it does not need to start at the bottom.
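
As a rough illustration of this reduction, the sketch below collapses the removed paths listed above into the direct children of the flushed directory. The function name direct_children_to_unlink is hypothetical, and the sketch assumes every removed path lies strictly below the flushed directory.

```python
from pathlib import PurePosixPath


def direct_children_to_unlink(flush_dir, removed_paths):
    """Return the direct children of flush_dir that cover all removed paths,
    in first-seen order (hypothetical helper, not a Lustre function)."""
    base = PurePosixPath(flush_dir)
    children = []
    for path in removed_paths:
        rel = PurePosixPath(path).relative_to(base)  # ValueError if outside base
        child = str(base / rel.parts[0])
        if child not in children:
            children.append(child)
    return children


removed = [
    "/mnt/lustre/wbc/batch/dir1/a",
    "/mnt/lustre/wbc/batch/dir1/b",
    "/mnt/lustre/wbc/batch/dir1/c",
    "/mnt/lustre/wbc/batch/dir2/a",
    "/mnt/lustre/wbc/batch/dir1/dir3/dir4/bb",
    "/mnt/lustre/wbc/batch/f1",
    "/mnt/lustre/wbc/batch/f3",
]
targets = direct_children_to_unlink("/mnt/lustre/wbc/batch", removed)
```

Seven removed paths collapse to four unlink requests here, which is the saving this approach is after.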

Comment by Andreas Dilger [ 04/Dec/19 ]

The reason I suggest starting at the bottom is that when removing a directory, it is best if the directory is already empty. Otherwise, there is a real danger if RMFID is allowed to remove a whole directory of entries. In general, the ability to remove a whole directory recursively is dangerous and should be restricted as much as possible.
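
A bottom-up order of this kind is a post-order traversal: every directory is visited only after all of its contents, so it is already empty when its turn comes. The sketch below shows the ordering on a toy tree; bottom_up and the dict-based tree are hypothetical and only illustrate the ordering, not how FIDs would be collected in practice.

```python
def bottom_up(tree, prefix=""):
    """Yield paths so that each directory comes after everything inside it.
    Directories are dicts, files are None (toy representation)."""
    for name, child in tree.items():
        path = f"{prefix}/{name}"
        if isinstance(child, dict):
            yield from bottom_up(child, path)
        yield path


tree = {"dir1": {"a": None, "dir3": {"dir4": {"bb": None}}}, "f1": None}
order = list(bottom_up(tree))
```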

Comment by Andreas Dilger [ 04/Dec/19 ]

I think that having a dedicated "rm -r" optimization is a much lower priority than the batched RPC support in LU-13045. Once we have batched RPCs, we will already be able to unlink files very quickly over the network. It also isn't clear that holding the EX lock on an existing directory necessarily means that this client is the only one accessing the subtree.

Comment by Gerrit Updater [ 23/Nov/21 ]

"Yingjin Qian <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/45640
Subject: LU-13044 wbc: delay file removal for keep flush mode
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8e3f2c7731f10903359acb1140d282c5f37b395f

Comment by Gerrit Updater [ 23/Nov/21 ]

"Yingjin Qian <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/45642
Subject: LU-13044 wbc: subtree removal for keep flush mode
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 67060cc94134971e37dce2ca82c404cb76aad209

Comment by Gerrit Updater [ 23/Nov/21 ]

"Yingjin Qian <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/45643
Subject: LU-13044 wbc: async subtree removal for keep flush mode
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 17ea03301e87f3192e51786a0ae5464c0bab00ab
