[LU-13833] hook llite to inode cache shrinker - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: Upstream
Affects Version/s: None
Labels:
- upstream

Severity:
3

Description

Since upstream kernel commit v3.0-rc7-226-g0e1fdafd9398, a filesystem can implement icache/dcache shrinkers on a per-superblock basis:

commit 0e1fdafd93980eac62e778798549ce0f6073905c
Author:     Dave Chinner <dchinner@redhat.com>
AuthorDate: Fri Jul 8 14:14:44 2011 +1000
Commit:     Al Viro <viro@zeniv.linux.org.uk>
CommitDate: Wed Jul 20 20:47:41 2011 -0400

    superblock: add filesystem shrinker operations
    
    Now we have a per-superblock shrinker implementation, we can add a
    filesystem specific callout to it to allow filesystem internal
    caches to be shrunk by the superblock shrinker.
    
    Rather than perpetuate the multipurpose shrinker callback API (i.e.
    nr_to_scan == 0 meaning "tell me how many objects freeable in the
    cache), two operations will be added. The first will return the
    number of objects that are freeable, the second is the actual
    shrinker call.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

It would be useful to hook the client dcache/icache/LDLM cache into these shrinker callbacks so that cache sizes on clients can be managed more effectively.
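As a rough illustration of the two-operation split described in the commit message (one call to count freeable objects, one call to actually free them), here is a minimal userspace sketch. It is not Lustre or kernel code: `struct toy_sb` and the `toy_*` functions are hypothetical stand-ins, and the real `super_operations` callbacks take kernel types whose exact signatures differ between kernel versions.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for a per-superblock client cache. */
struct toy_sb {
	long cached;	/* objects currently held in the private cache */
	long pinned;	/* objects still in use, which must not be freed */
};

/* Count operation: report how many objects could be freed right now. */
static long toy_nr_cached_objects(const struct toy_sb *sb)
{
	return sb->cached - sb->pinned;
}

/* Scan operation: free at most nr_to_scan objects, return how many went. */
static long toy_free_cached_objects(struct toy_sb *sb, long nr_to_scan)
{
	long freeable = toy_nr_cached_objects(sb);
	long nr;

	assert(nr_to_scan >= 0);
	nr = nr_to_scan < freeable ? nr_to_scan : freeable;
	sb->cached -= nr;
	return nr;
}
```

The point of the split is that the count call is cheap and side-effect free, so the caller can decide how much to reclaim before asking the filesystem to do any work. How llite would compute "freeable" for its own caches is left open here.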

Issue Links

is related to

- LU-14408 very large lustre_inode_cache (Open)
- LU-13983 rmdir should release inode on Lustre client (Resolved)
- LU-18175 Enhance Lustre to exploit new shrinker handling (In Progress)
- LU-13970 add an option to disable inode cache on Lustre client (Resolved)

is related to

- LU-12511 Prepare lustre for adoption into the linux kernel (Open)

Activity

Andreas Dilger added a comment - 10/Nov/21 12:58 AM

To fix this problem properly, it may be enough to check here if there is any DLM lock on the inode, and drop the inode if not?

Something like the following in ll_drop_inode():

        if (!md_lock_match(ll_i2mdexp(inode), ..., ll_i2fid(inode), ...))
                return 1;

The main question is which lock mode/bits/flags should be used for the match. The inode/layout lock should not be dropped if it is being used for IO or has dirty data (that is already impossible in an active syscall, but maybe between syscalls), but the inode should be dropped when there are no more DLM locks (MDC or OSC) using it, since any future access will need an MDS RPC to revalidate anyway.

A possibly more efficient option would be to add a refcount to ll_inode_info counting the DLM locks attached to it, so that a simple counter check can replace the lock match. The refcount could be added to ll_inode_info after lli_inode_magic, since there is a 4-byte hole.
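The refcount idea can be sketched in userspace as follows. This is only a model under stated assumptions: `struct toy_inode_info`, the field name `lli_lock_count`, and all `toy_*` helpers are hypothetical, C11 atomics stand in for whatever kernel primitive would actually be used, and the struct layout is not meant to reflect the real ll_inode_info.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the per-inode info, not the real ll_inode_info. */
struct toy_inode_info {
	uint32_t   lli_inode_magic;
	atomic_int lli_lock_count;	/* hypothetical: DLM locks attached */
};

static void toy_inode_info_init(struct toy_inode_info *lli)
{
	lli->lli_inode_magic = 0;	/* placeholder value for the sketch */
	atomic_init(&lli->lli_lock_count, 0);
}

/* Called when a DLM lock is attached to the inode. */
static void toy_lock_attach(struct toy_inode_info *lli)
{
	atomic_fetch_add(&lli->lli_lock_count, 1);
}

/* Called when a DLM lock is cancelled or detached from the inode. */
static void toy_lock_detach(struct toy_inode_info *lli)
{
	int old = atomic_fetch_sub(&lli->lli_lock_count, 1);

	assert(old > 0);	/* detach without a matching attach is a bug */
}

/* Model of the drop decision: 1 means "drop from cache", 0 means "keep". */
static int toy_drop_inode(struct toy_inode_info *lli)
{
	return atomic_load(&lli->lli_lock_count) == 0;
}
```

In this model the drop check is a single atomic load rather than a lock match, which is the efficiency gain the comment is after. It deliberately does not take a reference on the inode per lock, so it says nothing about the race between a lock being attached and the inode being dropped; a real implementation would need to address that.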

The DLM locks shouldn't (I think?) __iget() the VFS inode itself for each lock, to avoid a circular dependency where inodes cannot be dropped from cache while they have any DLM locks, since that may pin a lot of inodes. The counterargument would be that the DLM locks would eventually be dropped from cache themselves (LRU, slab shrinker), so the inode refcount would be finite, but that adds some interdependency.
