[LU-903] Race condition while get_attr after cancel_lru_locks and sysctl drop_caches - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.4.0
Affects Version/s: Lustre 2.4.0, Lustre 1.8.6
Labels:
- dentry
- patch
Environment:
SLES11

Severity:
3
Bugzilla ID:
24555
Epic:
- metadata
Rank (Obsolete):
5140

Description

The reproduction script is described in https://bugzilla.lustre.org/show_bug.cgi?id=24555
After some analysis, the following picture of the bug emerged:

1.
The first thread performs a lookup, gets a CR lock, and terminates.
After that, another thread a) clears the lock LRU cache and b) uses sysctl to flush the slab and other kernel
caches (dcache, icache, etc.).

As a result, the sequence "shrink_dcache_memory -> foreach_dentry_lru -> prune_one_dentry ->
d_kill -> d_iput()" is executed. After that, ll_clear_inode is executed, which sets lock->l_ast_data to NULL.

2.
Some time after step 1, another thread performs get_attr on the same inode. It gets another IBITS lock, this time with the LOOKUP +
UPDATE bits.
Another client then needs this lock cancelled, and a race between the two blocking ASTs (BL ASTs) arises. The second lock cannot cancel the first
lock because of the optimization shown below, and the first lock cannot be cancelled because its
inode == NULL.

Optimisation:

int ll_mdc_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
                        void *data, int flag)
...
                if ((bits & MDS_INODELOCK_LOOKUP) &&
                    ll_have_md_lock(inode, MDS_INODELOCK_LOOKUP, LCK_MINMODE))
                        bits &= ~MDS_INODELOCK_LOOKUP;
                if ((bits & MDS_INODELOCK_UPDATE) &&
                    ll_have_md_lock(inode, MDS_INODELOCK_UPDATE, LCK_MINMODE))
                        bits &= ~MDS_INODELOCK_UPDATE;
                if ((bits & MDS_INODELOCK_OPEN) &&
                    ll_have_md_lock(inode, MDS_INODELOCK_OPEN, mode))
                        bits &= ~MDS_INODELOCK_OPEN;
...
                if (inode->i_sb->s_root &&
                    inode != inode->i_sb->s_root->d_inode &&
                    (bits & MDS_INODELOCK_LOOKUP))
                        ll_unhash_aliases(inode);
                iput(inode);
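
The interaction described above can be illustrated with a small user-space toy model. This is only a sketch of one reading of the report, not Lustre code: every `toy_*` name, the bit values, and the `aliases_hashed` flag are hypothetical stand-ins. It assumes that a blocking AST whose lock has no inode attached skips all inode work, and that the check modelled on ll_have_md_lock() matches any other still-granted lock regardless of its l_ast_data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inode lock bits; values are illustrative. */
enum { TOY_BIT_LOOKUP = 0x1, TOY_BIT_UPDATE = 0x2 };

struct toy_inode {
        bool aliases_hashed;            /* dentry aliases still considered valid */
};

struct toy_lock {
        unsigned bits;
        bool granted;
        struct toy_inode *ast_data;     /* models lock->l_ast_data */
};

/* Models ll_have_md_lock(): does any OTHER granted lock cover this bit?
 * Assumption: the match ignores whether that lock still has an inode. */
static bool toy_have_md_lock(const struct toy_lock *locks, size_t n,
                             const struct toy_lock *self, unsigned bit)
{
        for (size_t i = 0; i < n; i++)
                if (&locks[i] != self && locks[i].granted &&
                    (locks[i].bits & bit))
                        return true;
        return false;
}

/* Models the cancel path of the blocking AST for one lock. */
static void toy_blocking_ast(struct toy_lock *locks, size_t n,
                             struct toy_lock *lock)
{
        unsigned bits = lock->bits;
        struct toy_inode *inode = lock->ast_data;

        assert(lock->granted);          /* each lock is cancelled only once */

        if (inode != NULL) {
                /* The optimisation: skip a bit another lock still covers. */
                if ((bits & TOY_BIT_LOOKUP) &&
                    toy_have_md_lock(locks, n, lock, TOY_BIT_LOOKUP))
                        bits &= ~TOY_BIT_LOOKUP;
                if (bits & TOY_BIT_LOOKUP)
                        inode->aliases_hashed = false;  /* "unhash aliases" */
        }
        /* With inode == NULL nothing is invalidated at all. */
        lock->granted = false;
}

/* Returns true if the aliases are still hashed after both locks are gone. */
static bool toy_run(bool second_lock_first)
{
        struct toy_inode inode = { .aliases_hashed = true };
        struct toy_lock locks[2] = {
                /* First lock: from lookup, l_ast_data already set to NULL. */
                { TOY_BIT_LOOKUP, true, NULL },
                /* Second lock: from get_attr, LOOKUP + UPDATE. */
                { TOY_BIT_LOOKUP | TOY_BIT_UPDATE, true, &inode },
        };
        size_t first = second_lock_first ? 1 : 0;

        toy_blocking_ast(locks, 2, &locks[first]);
        toy_blocking_ast(locks, 2, &locks[1 - first]);
        return inode.aliases_hashed;
}
```

In this model, `toy_run(true)` leaves the aliases hashed although no lock covers them any more, while `toy_run(false)` invalidates them, so the outcome depends on the order in which the two ASTs run.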


People

Assignee: Keith Mannthey (Inactive)

Reporter: Artem Blagodarenko (Inactive)

Votes: 0

Watchers: 9

Dates

Created: 08/Dec/11 4:25 AM

Updated: 17/Mar/20 8:23 AM

Resolved: 07/Mar/13 12:10 AM