[LU-10704] open(2) with O_CREAT takes 60s (timeout) with virus scanning application Created: 23/Feb/18  Updated: 17/Aug/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Upstream
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Li Dongyang (Inactive) Assignee: Dongyang Li
Resolution: Unresolved Votes: 0
Labels: patch

Attachments: debug-log
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On the client, when Lustre creates a file, we create the inode and instantiate the dentry in ll_create_it(), but the dentry is still hidden by lld->lld_invalid. The next lookup can discover it; however, this causes a problem for some applications, such as virus scanners, for example https://www.f-secure.com/en/web/business_global/downloads/linux-security

F-Secure does real-time scanning using fanotify. When a client creates a file using open(2) with O_CREAT, the process waits in fanotify until the F-Secure scanning thread does its job:

[<ffffffff8124b785>] fanotify_handle_event+0x1e5/0x330
[<ffffffff81247e15>] fsnotify+0x285/0x510
[<ffffffff812b4936>] security_file_open+0x66/0x70
[<ffffffff81200539>] do_dentry_open+0xb9/0x2e0
[<ffffffff8120078f>] finish_open+0x2f/0x40
[<ffffffffc0a31506>] ll_atomic_open+0x1d6/0x11f0 [lustre]
[<ffffffff812121bd>] do_last+0xa4d/0x12c0
[<ffffffff81212af2>] path_openat+0xc2/0x490
[<ffffffff8121508b>] do_filp_open+0x4b/0xb0
[<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
[<ffffffff81201cde>] SyS_open+0x1e/0x20
[<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
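
For context, the wait in fanotify_handle_event above is what a fanotify permission event (FAN_OPEN_PERM) looks like from the opener's side: the kernel parks the opening process until a listener writes a verdict. The following is a minimal sketch of such a listener using the standard Linux fanotify API. It is not F-Secure's code; the helper names make_verdict, watch_mount and answer_events are hypothetical, and the sketch allows every open without scanning anything.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: build the verdict that unblocks the waiting opener. */
struct fanotify_response make_verdict(int event_fd, int allow)
{
    struct fanotify_response r;

    r.fd = event_fd;
    r.response = allow ? FAN_ALLOW : FAN_DENY;
    return r;
}

/*
 * Hypothetical helper: ask for permission events on every open in the
 * mount containing mount_path. Needs CAP_SYS_ADMIN.
 * Returns the fanotify fd, or -1 on failure.
 */
int watch_mount(const char *mount_path)
{
    int fan;

    assert(mount_path != NULL);
    fan = fanotify_init(FAN_CLASS_CONTENT | FAN_CLOEXEC,
                        O_RDONLY | O_LARGEFILE);
    if (fan < 0)
        return -1;
    if (fanotify_mark(fan, FAN_MARK_ADD | FAN_MARK_MOUNT, FAN_OPEN_PERM,
                      AT_FDCWD, mount_path) < 0) {
        close(fan);
        return -1;
    }
    return fan;
}

/*
 * Hypothetical helper: read one batch of events and allow each open.
 * Until the write() below happens, the opener stays blocked in the kernel.
 * Returns the number of events answered, or -1 on error.
 */
int answer_events(int fan)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct fanotify_event_metadata))));
    const struct fanotify_event_metadata *ev;
    ssize_t len = read(fan, buf, sizeof(buf));
    int answered = 0;

    if (len < 0)
        return -1;
    ev = (const struct fanotify_event_metadata *)buf;
    while (FAN_EVENT_OK(ev, len)) {
        if (ev->vers != FANOTIFY_METADATA_VERSION)
            return -1;
        if (ev->fd >= 0) {
            if (ev->mask & FAN_OPEN_PERM) {
                /* A real scanner would inspect the file here. */
                struct fanotify_response r = make_verdict(ev->fd, 1);

                if (write(fan, &r, sizeof(r)) != (ssize_t)sizeof(r)) {
                    close(ev->fd);
                    return -1;
                }
                answered++;
            }
            close(ev->fd);
        }
        ev = FAN_EVENT_NEXT(ev, len);
    }
    return answered;
}
```

A listener would call watch_mount() once on the Lustre mount point and then answer_events() in a loop. If the scanner itself blocks before writing the verdict, the opener stays parked until the scanner is unblocked or killed.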

The scanning thread then tries to look up the file. The dcache lookup fails because lld->lld_invalid is set, so it falls back to lookup_real, which requires i_mutex. That mutex is held by the thread creating the file, which is itself waiting for the scanning thread.

Eventually the scanning thread is killed by F-Secure after the timeout, and the file creation can finish. Any thread that tries to take i_mutex on the parent blocks for the duration of the timeout.
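
One way to observe the delay from user space is to time a single open(2) with O_CREAT. The sketch below is only an illustration, not the reproducer used for this ticket; the function name time_create is hypothetical, and the path passed to it would need to be on a Lustre client mount with the scanner active to show the stall.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) path with O_CREAT and return
 * how long the open(2) call itself took, in seconds.
 * Returns -1.0 if the clock or the open fails.
 */
double time_create(const char *path)
{
    struct timespec start, end;
    int fd;

    assert(path != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -1.0;

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) {
        close(fd);
        return -1.0;
    }
    close(fd);

    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

On an unaffected filesystem the returned value should be a small fraction of a second; with the behaviour described in this ticket it would be expected to approach the scanner's timeout.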



 Comments   
Comment by Gerrit Updater [ 23/Feb/18 ]

Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/31390
Subject: LU-10704 llite: unhide the dentry for O_CREAT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b3e75d0570349547dd1d9f618ef99a196977d37c

Comment by Lai Siyao [ 23/Feb/18 ]

IMHO the root cause is that the creating thread holds i_mutex while waiting; it should release i_mutex and then wait.

Comment by Li Dongyang (Inactive) [ 05/Mar/18 ]

Attached a debug log with dentry, dlmtrace and rpctrace.

Comment by Lai Siyao [ 05/Mar/18 ]

I just updated the patch; please give it a try.
