Description
As of upstream kernel commit v3.0-rc7-226-g0e1fdafd9398, a filesystem can implement icache/dcache shrinkers on a per-superblock basis:
commit 0e1fdafd93980eac62e778798549ce0f6073905c
Author:     Dave Chinner <dchinner@redhat.com>
AuthorDate: Fri Jul 8 14:14:44 2011 +1000
Commit:     Al Viro <viro@zeniv.linux.org.uk>
CommitDate: Wed Jul 20 20:47:41 2011 -0400

    superblock: add filesystem shrinker operations

    Now we have a per-superblock shrinker implementation, we can add a
    filesystem specific callout to it to allow filesystem internal
    caches to be shrunk by the superblock shrinker.

    Rather than perpetuate the multipurpose shrinker callback API (i.e.
    nr_to_scan == 0 meaning "tell me how many objects freeable in the
    cache), two operations will be added. The first will return the
    number of objects that are freeable, the second is the actual
    shrinker call.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
It would be useful to hook the Lustre client dcache/icache/LDLM caches into these shrinker callbacks so that cache sizes on clients can be managed more effectively.
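The two-operation split described in the commit message (one call to count freeable objects, one call to free them) can be illustrated with a small userspace model. This is only a sketch: `struct model_sb_cache` and both function names are hypothetical stand-ins. In the kernel the corresponding `struct super_operations` callbacks are named `nr_cached_objects` and `free_cached_objects`, and their arguments differ from what is shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of one superblock's private cache. */
struct model_sb_cache {
	long nr_freeable;	/* unused objects that may be reclaimed */
	long nr_in_use;		/* objects pinned by active users */
};

/* "Count" operation: report how many objects could be freed right now. */
long model_nr_cached_objects(const struct model_sb_cache *c)
{
	assert(c != NULL);
	return c->nr_freeable;
}

/*
 * "Scan" operation: free at most nr_to_scan freeable objects and return
 * how many were actually freed. In-use objects are never touched.
 */
long model_free_cached_objects(struct model_sb_cache *c, long nr_to_scan)
{
	long nr_freed;

	assert(c != NULL);
	if (nr_to_scan <= 0)
		return 0;

	nr_freed = nr_to_scan < c->nr_freeable ? nr_to_scan : c->nr_freeable;
	c->nr_freeable -= nr_freed;
	return nr_freed;
}
```

Keeping the count separate from the scan means the caller never has to pass a magic `nr_to_scan == 0` to ask for the cache size. A client-side hook would presumably sum the freeable entries of each cache in the count call.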
Issue Links
- is related to
  - LU-14408 very large lustre_inode_cache (Open)
  - LU-13983 rmdir should release inode on Lustre client (Resolved)
  - LU-18175 Enhance Lustre to exploit new shrinker handling (In Progress)
  - LU-13970 add an option to disable inode cache on Lustre client (Resolved)
- is related to
  - LU-12511 Prepare lustre for adoption into the linux kernel (Open)
To fix this problem properly, it may be enough to check here whether there is any DLM lock on the inode, and to drop the inode if there is none?
Something like the following in ll_drop_inode():
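A rough userspace sketch of that kind of check, written in C with hypothetical names throughout (`model_inode`, `model_lock`, `model_has_dlm_lock`, `model_drop_inode`). It is not Lustre code and does not call any real LDLM or VFS API. The only assumption taken from the kernel is that a `drop_inode` callback returns nonzero when the inode should be evicted rather than kept in cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DLM lock held by the client. */
struct model_lock {
	unsigned long resource_id;	/* identifies the inode it protects */
	bool granted;			/* only granted locks count as a match */
};

/* Hypothetical stand-in for the few inode fields the check needs. */
struct model_inode {
	unsigned long resource_id;
	unsigned int nlink;
	bool unhashed;
};

/* Stand-in for a lock match: true if any granted lock covers the inode. */
bool model_has_dlm_lock(const struct model_inode *inode,
			const struct model_lock *locks, size_t nr_locks)
{
	size_t i;

	for (i = 0; i < nr_locks; i++) {
		if (locks[i].granted &&
		    locks[i].resource_id == inode->resource_id)
			return true;
	}
	return false;
}

/* Returns nonzero if the inode should be dropped from the cache. */
int model_drop_inode(const struct model_inode *inode,
		     const struct model_lock *locks, size_t nr_locks)
{
	assert(inode != NULL);

	/* Unlinked or unhashed inodes are dropped regardless of locks. */
	if (inode->nlink == 0 || inode->unhashed)
		return 1;

	/* Otherwise keep the inode only while a DLM lock still covers it. */
	return !model_has_dlm_lock(inode, locks, nr_locks);
}
```

The linear scan here stands in for whatever lock-match call would really be used. Which modes, bits, and flags that match should consider is left open in this sketch.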
The main question is which lock mode/bits/flags should be used for the match. The inode/layout lock should not be dropped if it is being used for IO or has dirty data (that is already impossible during an active syscall, but might happen between syscalls), but it should be dropped once there are no more DLM locks (MDC or OSC) using the inode, since any future access will need an MDS RPC to revalidate anyway.
A possibly more efficient option would be to add a refcount to ll_inode_info that tracks the number of DLM locks attached to it, so that this can be checked cheaply instead of doing a lock match. The refcount could be added to ll_inode_info after lli_inode_magic, since there is a 4-byte hole there.
The DLM locks shouldn't (I think?) __iget() the VFS inode itself for each lock. That would create a circular dependency where inodes cannot be dropped from cache while they have any DLM locks, which may pin a lot of inodes. The counterargument would be that the DLM locks would eventually be dropped from cache themselves (LRU, slab shrinker), so the number of pinned inodes would be finite, but this adds some interdependency.