[LU-12977] fix i_mutex for ldiskfs_truncate() in osd_execute_truncate() Created: 15/Nov/19 Updated: 24/Sep/21 Resolved: 28/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | James A Simmons |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||||||||||||||
| Description |
|
The inode->i_mutex should be held when calling osd_execute_truncate->ldiskfs_truncate(). However, due to locking order in osd-ldiskfs we cannot just grab i_mutex around ldiskfs_truncate() as that can lead to deadlocks as described in Currently, there is an ldiskfs patch to remove the WARN_ON() in ext4_truncate() when it is called without i_mutex held. This is safe because the osd-ldiskfs code holds its own lock (oo_ext_idx_sem) on the object over the ldiskfs_truncate() call, but potentially there is a way to improve this without patching the ldiskfs code. The truncate-warning patch was first added with: commit 3daba43a191a0a2c9358d39b3f48b1b759a41819
Author: Yang Sheng <yang.sheng@intel.com>
LU-6446 ldiskfs: remove WARN_ON from ldiskfs_orphan_add{del}
RHEL7.1 kernel carefully check i_mutex whether locked in
many places. It will be triggered while ldiskfs_truncate()
was invoked in osd layer. A dead lock would occured if we
just locked i_mutex around it since Lustre using a reverse
order for start journal==>lock i_mutex. Consider Lustre has
own ldlm lock. So just remove such messages in ldiskfs patches.
Reviewed-on: http://review.whamcloud.com/14690
so I think moving the lock is not as easily done as what is done in the current patch, see comments in
if (!(inode->i_state & (I_NEW|I_FREEING)))
mutex_lock(&inode->i_mutex);
ldiskfs_truncate(inode);
if (!(inode->i_state & (I_NEW|I_FREEING)))
mutex_unlock(&inode->i_mutex);
but this was removed as part of the patch: commit b98792ae96ad2bb3c73ca3ebebdd8edfdfbc35df
Author: yangsheng <yang.sheng@intel.com>
LU-4252 osd: remove locking i_mutex in osd_punch
This piece of code intent to calm down the kernel
WARN_ON messsage show up since 3.9 upstream. So we
can just remove it for now. We'll add a ldiskfs
patch to resolve this issue while 3.9 server support
come up.
Reviewed-on: http://review.whamcloud.com/9644
In more recent developments, commit cf29a5e7bfa254ccfcea023028fe7da80503c512
Author: Alex Zhuravlev <alexey.zhuravlev@intel.com>
LU-10048 osd: async truncate
osd-ldiskfs should execute truncate outside of main transaction
handle. This avoids restarting truncate transaction handles in
main transaction, and allows "transaction first, locking second"
model on OST.
Reviewed-on: https://review.whamcloud.com/27488
It may now be OK to grab i_mutex around ldiskfs_truncate() in osd_execute_truncate() and remove ext4-remove-truncate-warning.patch, but this should be reviewed and tested properly (e.g. with the racer workload that triggered |
| Comments |
| Comment by Gerrit Updater [ 30/Dec/19 ] |
|
James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/37116 |
| Comment by Gerrit Updater [ 28/Jan/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37116/ |