[LU-1777] open-by-fid: deadlock in lock_rename() Created: 21/Aug/12 Updated: 20/Jul/17 |
|
| Status: | Reopened |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | open-by-fid | ||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 10464 | ||||||||
| Description |
[root]# /usr/src/lustre-release/lustre/tests/llmount.sh
[root]# cd /mnt/lustre/
[root]# mkdir sanity
[root]# chown sanity: sanity
[root]# su sanity
[sanity]$ pwd
/mnt/lustre
[sanity]$ sys_path2fid .
[0x61ab:0xef3d87c8:0x0]
[sanity]$ sys_rename sanity .lustre/fid/[0x61ab:0xef3d87c8:0x0]/sanity

rename() wedges in lock_rename().

INFO: task sys_rename:2960 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sys_rename D 0000000000000000 0 2960 2933 0x00000080
ffff88005cc37cf8 0000000000000082 ffff88005cc37d08 ffffffff81189a05
0000001000000000 ffff88007b01cb70 ffffffff8100bc0e ffff88005cc37cf8
ffff880062a67098 ffff88005cc37fd8 000000000000fb88 ffff880062a67098
Call Trace:
[<ffffffff81189a05>] ? __link_path_walk+0x155/0x1030
[<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8104f18b>] ? mutex_spin_on_owner+0x9b/0xc0
[<ffffffff814ff2fe>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff814ff19b>] mutex_lock+0x2b/0x50
[<ffffffff811878e3>] lock_rename+0x73/0xe0
[<ffffffff8118af83>] sys_renameat+0x113/0x260
[<ffffffff8119a470>] ? mntput_no_expire+0x30/0x110
[<ffffffff8117cb11>] ? __fput+0x1a1/0x210
[<ffffffff81142c7e>] ? remove_vma+0x6e/0x90
[<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff815036de>] ? do_page_fault+0x3e/0xa0
[<ffffffff8118b0eb>] sys_rename+0x1b/0x20
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

[root]# pidof sys_rename
2960
[root]# cat /proc/2960/stack
[<ffffffff811878e3>] lock_rename+0x73/0xe0
[<ffffffff8118af83>] sys_renameat+0x113/0x260
[<ffffffff8118b0eb>] sys_rename+0x1b/0x20
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff |
| Comments |
| Comment by Peter Jones [ 27/Aug/12 ] |
|
Niu, could you please look at this one? It is similar to the work you just did for LU1518. Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 27/Aug/12 ] |
|
this should be fixed along with |
| Comment by Peter Jones [ 27/Aug/12 ] |
|
OK, then let's close this ticket as a duplicate and just ensure that the LU1518 fix covers this case as well. |
| Comment by Peter Jones [ 01/Sep/12 ] |
|
As per John this was not fixed by the |
| Comment by Niu Yawei (Inactive) [ 03/Sep/12 ] |
|
I don't think there is a quick fix for this deadlock. We need some way on the server side to detect the recursive rename, which should check the 'fid' directory as well. Given that renaming files in the 'fid' directory isn't a legal usage, I suggest we lower the priority of this ticket and fix it in a later version. |
| Comment by Andreas Dilger [ 03/Sep/12 ] |
|
While it would be good to get this fixed for 2.3, since this only affects the client and not the MDS, I'm removing this as a blocker for 2.3 and moving it to 2.4. This isn't a problem that can be hit accidentally. |
| Comment by Niu Yawei (Inactive) [ 10/Oct/12 ] |
|
The client deadlocks in lock_rename():

struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
{
        struct dentry *p;

        if (p1 == p2) {
                mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
                return NULL;
        }

        mutex_lock(&p1->d_inode->i_sb->s_vfs_rename_mutex);

        p = d_ancestor(p2, p1);
        if (p) {
                mutex_lock_nested(&p2->d_inode->i_mutex, I_MUTEX_PARENT);
                mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_CHILD);
                return p;
        }

        p = d_ancestor(p1, p2);
        if (p) {
                mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
                mutex_lock_nested(&p2->d_inode->i_mutex, I_MUTEX_CHILD);
                return p;
        }

        mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
        mutex_lock_nested(&p2->d_inode->i_mutex, I_MUTEX_CHILD);
        return NULL;
}

The root cause is that, with the 'fid' directory, the client can have two directory dentries pointing to the same inode. Because the two dentries differ, the p1 == p2 check above does not catch this case, and lock_rename() ends up locking the same inode's i_mutex twice through two different dentries. Without patching the kernel, I'm not sure there is a good way to solve it. Anyway, I don't think it should be a blocker for 2.4. Andreas, any comments? Thanks. |
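
The aliasing problem described in this comment can be illustrated with a small user-space model. This is only a sketch, not kernel or Lustre code: the toy_inode and toy_dentry types and every function name below are hypothetical, a pthread mutex stands in for i_mutex, and the second lock uses pthread_mutex_trylock() so that the model reports the conflict instead of hanging the way the real lock_rename() does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the real ones). */
struct toy_inode {
        pthread_mutex_t i_mutex;
};

struct toy_dentry {
        struct toy_inode *d_inode;
};

static void toy_inode_init(struct toy_inode *inode)
{
        int rc = pthread_mutex_init(&inode->i_mutex, NULL);

        assert(rc == 0);
}

/* Same test as the lock_rename() fast path: compares dentry pointers only. */
static bool same_dentry(const struct toy_dentry *p1,
                        const struct toy_dentry *p2)
{
        return p1 == p2;
}

/* The comparison that would notice two dentries aliasing one inode. */
static bool same_inode(const struct toy_dentry *p1,
                       const struct toy_dentry *p2)
{
        return p1->d_inode == p2->d_inode;
}

/*
 * Lock both parents in the style of lock_rename(). Returns 0 with the
 * lock(s) held, or -1 (nothing held) if the second mutex is already
 * taken, which is the point where the real code would block forever.
 */
static int toy_lock_parents(struct toy_dentry *p1, struct toy_dentry *p2)
{
        if (same_dentry(p1, p2)) {
                pthread_mutex_lock(&p1->d_inode->i_mutex);
                return 0;
        }

        pthread_mutex_lock(&p1->d_inode->i_mutex);
        if (pthread_mutex_trylock(&p2->d_inode->i_mutex) != 0) {
                pthread_mutex_unlock(&p1->d_inode->i_mutex);
                return -1;
        }
        return 0;
}

/* Release what a successful toy_lock_parents() call acquired. */
static void toy_unlock_parents(struct toy_dentry *p1, struct toy_dentry *p2)
{
        pthread_mutex_unlock(&p1->d_inode->i_mutex);
        if (!same_inode(p1, p2))
                pthread_mutex_unlock(&p2->d_inode->i_mutex);
}
```

In this model, two distinct toy_dentry objects that share one toy_inode pass the pointer comparison in same_dentry() but fail at the second lock, while comparing through same_inode() would have detected the alias before any lock was taken. Whether an equivalent inode-level check is acceptable in the VFS is a separate question that this sketch does not address.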
| Comment by Andreas Dilger [ 19/Oct/12 ] |
|
Is it possible to block renames that involve the .lustre directory? |
| Comment by Niu Yawei (Inactive) [ 19/Oct/12 ] |
|
No, I don't think so. It blocks renames that involve the 'fid' directory. |
| Comment by parinay v kondekar (Inactive) [ 03/Aug/15 ] |
|
@Niu Yawei,

Jul 31 16:10:01 localhost kernel: mrename D 0000000000000000 0 19394 19337 0x00000080
Jul 31 16:10:01 localhost kernel: ffff8800054f1cf8 0000000000000082 0000000000000000 ffffffff811850f5
Jul 31 16:10:01 localhost kernel: 0000000000000000 ffffea00001267b8 ffffffff8100bc0e ffff8800054f1cf8
Jul 31 16:10:01 localhost kernel: ffff88000dffc678 ffff8800054f1fd8 000000000000f4e8 ffff88000dffc678
Jul 31 16:10:01 localhost kernel: Call Trace:
Jul 31 16:10:01 localhost kernel: [<ffffffff811850f5>] ? __link_path_walk+0x155/0x1030
Jul 31 16:10:01 localhost kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Jul 31 16:10:01 localhost kernel: [<ffffffff8104d92d>] ? mutex_spin_on_owner+0x8d/0xc0
Jul 31 16:10:01 localhost kernel: [<ffffffff814eebbe>] __mutex_lock_slowpath+0x13e/0x180
Jul 31 16:10:01 localhost kernel: [<ffffffff81183b01>] ? path_put+0x31/0x40
Jul 31 16:10:01 localhost kernel: [<ffffffff814eea5b>] mutex_lock+0x2b/0x50
Jul 31 16:10:01 localhost kernel: [<ffffffff81182f83>] lock_rename+0x73/0xe0
Jul 31 16:10:01 localhost kernel: [<ffffffff81186673>] sys_renameat+0x113/0x260
Jul 31 16:10:01 localhost kernel: [<ffffffff81195b70>] ? mntput_no_expire+0x30/0x110
Jul 31 16:10:01 localhost kernel: [<ffffffff81178271>] ? __fput+0x1a1/0x210
Jul 31 16:10:01 localhost kernel: [<ffffffff8113f43e>] ? remove_vma+0x6e/0x90
Jul 31 16:10:01 localhost kernel: [<ffffffff810d4932>] ? audit_syscall_entry+0x272/0x2a0
Jul 31 16:10:01 localhost kernel: [<ffffffff814f2fce>] ? do_page_fault+0x3e/0xa0
Jul 31 16:10:01 localhost kernel: [<ffffffff811867db>] sys_rename+0x1b/0x20
Jul 31 16:10:01 localhost kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

This is reproducible every time I run sanity/154a. This is lustre 2.1.5 ( esp with backports of
Thanks |
| Comment by Vinayak (Inactive) [ 18/Sep/15 ] |
|
Hello Andreas, Niu Yawei,

I have also hit this deadlock while renaming .lustre to .lustre using its FID, i.e. echo "rename .lustre to itself"

Call trace:

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mrename D 0000000000000000 0 25974 25917 0x00000080
ffff880014401cf8 0000000000000086 ffff880014401d08 ffffffff811850f5
0000000000000000 ffffea00004fe840 ffffffff8100bc0e ffff880014401cf8
ffff8800054afa78 ffff880014401fd8 000000000000f4e8 ffff8800054afa78
Call Trace:
[<ffffffff811850f5>] ? __link_path_walk+0x155/0x1030
[<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8104d92d>] ? mutex_spin_on_owner+0x8d/0xc0
[<ffffffff814eebbe>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff81183b01>] ? path_put+0x31/0x40
[<ffffffff814eea5b>] mutex_lock+0x2b/0x50
[<ffffffff81182f83>] lock_rename+0x73/0xe0
[<ffffffff81186673>] sys_renameat+0x113/0x260
[<ffffffff81195b70>] ? mntput_no_expire+0x30/0x110
[<ffffffff81178271>] ? __fput+0x1a1/0x210
[<ffffffff8113f43e>] ? remove_vma+0x6e/0x90
[<ffffffff810d4932>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff814f2fce>] ? do_page_fault+0x3e/0xa0
[<ffffffff811867db>] sys_rename+0x1b/0x20
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

I have tried to catch this issue in the llite layer and return -EPERM from there, but without success. Is this case currently not supported by Lustre, or am I doing something wrong here? |