Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.4.0, Lustre 1.8.6
- SLES11
- 3
- 24,555
- 5140
Description
The reproduction script is described in https://bugzilla.lustre.org/show_bug.cgi?id=24555
After some analysis, the following picture of the bug emerged:
1. The first thread does a lookup, gets a CR lock, and terminates.
After that, another thread a) runs clear_lru on the cache and b) issues a sysctl that flushes the slab and other kernel
caches (dcache, icache, etc.).
As a result, the sequence shrink_dcache_memory -> "foreach_dentry_lru" -> prune_one_dentry ->
d_kill -> d_iput() is executed. After that, ll_clear_inode is executed, which sets lock->l_ast_data to NULL.
2. Some time after step 1, another thread does a get_attr on the same inode. It gets another IBITS lock,
this time with LOOKUP + UPDATE.
Another client needs this lock cancelled, but a race between two BL ASTs arises. The second lock cannot
cancel the first lock because of the optimisation shown below, and the first lock cannot be cancelled
because its inode == NULL.
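The effect of step 1 on a later blocking AST can be pictured with a small user-space model. This is only a sketch, not Lustre code: `toy_inode`, `toy_lock`, `toy_clear_inode`, `toy_blocking_ast` and `toy_replay_step1` are hypothetical stand-ins, and the early return on a NULL inode is an assumption drawn from the description above.

```c
#include <assert.h>     /* assert() is used by the check in the usage note */
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real Lustre types. */
struct toy_inode {
        int aliases_hashed;             /* 1 while dentries for the inode are hashed */
};

struct toy_lock {
        struct toy_inode *l_ast_data;   /* back pointer from the lock to its inode */
};

/* Models the effect attributed to ll_clear_inode: the lock loses its inode. */
static void toy_clear_inode(struct toy_lock *lock)
{
        lock->l_ast_data = NULL;
}

/*
 * Models a blocking AST that can only clean up through the inode.
 * Returns 1 if the dentry aliases were unhashed, 0 if nothing was done.
 */
static int toy_blocking_ast(struct toy_lock *lock)
{
        struct toy_inode *inode = lock->l_ast_data;

        if (inode == NULL)
                return 0;               /* assumed behaviour: no inode, no cleanup */
        inode->aliases_hashed = 0;
        return 1;
}

/* Replays step 1 and then delivers a blocking AST to the first lock. */
static int toy_replay_step1(void)
{
        struct toy_inode inode = { .aliases_hashed = 1 };
        struct toy_lock first = { .l_ast_data = &inode };

        toy_clear_inode(&first);                /* cache flush from step 1 */
        (void)toy_blocking_ast(&first);         /* later BL AST finds no inode */
        return inode.aliases_hashed;            /* state left behind by the AST */
}
```

In this model `assert(toy_replay_step1() == 1);` holds: once the back pointer is gone, the blocking AST on the first lock has nothing to act on, which matches the "inode == NULL" half of the race.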
Optimisation:
int ll_mdc_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
                        void *data, int flag)
...
        if ((bits & MDS_INODELOCK_LOOKUP) &&
            ll_have_md_lock(inode, MDS_INODELOCK_LOOKUP, LCK_MINMODE))
                bits &= ~MDS_INODELOCK_LOOKUP;
        if ((bits & MDS_INODELOCK_UPDATE) &&
            ll_have_md_lock(inode, MDS_INODELOCK_UPDATE, LCK_MINMODE))
                bits &= ~MDS_INODELOCK_UPDATE;
        if ((bits & MDS_INODELOCK_OPEN) &&
            ll_have_md_lock(inode, MDS_INODELOCK_OPEN, mode))
                bits &= ~MDS_INODELOCK_OPEN;
...
        if (inode->i_sb->s_root &&
            inode != inode->i_sb->s_root->d_inode &&
            (bits & MDS_INODELOCK_LOOKUP))
                ll_unhash_aliases(inode);
        iput(inode);
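The masking above can be reduced to a few lines to show why the second lock does no cleanup for the first. This is a sketch under stated assumptions: the `TOY_*` bit values are illustrative, `other_bits` is a hypothetical summary of what `ll_have_md_lock` would report for other locks on the same inode, and lock modes and the `s_root` check are ignored.

```c
#include <assert.h>     /* assert() is used by the check in the usage note */

/* Illustrative bit values; the real MDS_INODELOCK_* constants are defined by Lustre. */
#define TOY_LOOKUP 0x1u
#define TOY_UPDATE 0x2u
#define TOY_OPEN   0x4u

/*
 * cancel_bits: bits of the lock being cancelled.
 * other_bits:  bits still covered by other locks on the same inode
 *              (hypothetical stand-in for the ll_have_md_lock results).
 * Returns the bits the blocking AST still has to clean up for.
 */
static unsigned toy_bits_to_clean(unsigned cancel_bits, unsigned other_bits)
{
        return cancel_bits & ~other_bits;
}

/* In this model, aliases are unhashed only if LOOKUP survives the masking. */
static int toy_will_unhash(unsigned cancel_bits, unsigned other_bits)
{
        return (toy_bits_to_clean(cancel_bits, other_bits) & TOY_LOOKUP) != 0;
}
```

With the first lock still reported as covering LOOKUP, `assert(toy_will_unhash(TOY_LOOKUP | TOY_UPDATE, TOY_LOOKUP) == 0);` holds, so the second lock's blocking AST leaves the aliases hashed and relies on the first lock to do that work.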
Attachments
Issue Links
- Trackbacks
- Lustre 1.8.x known issues tracker: While testing against Lustre b18 branch, we would hit known bugs which were already reported in Lustre Bugzilla https://bugzilla.lustre.org/. In order to move away from relying on Bugzilla, we would create a JIRA