[LU-16321] Rewrite regression discovered using obdfilter-survey on OST with Lustre 2.15 but also impacts fiemap logic performance Created: 17/Nov/22  Updated: 07/Jan/23  Resolved: 07/Jan/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Major
Reporter: Petros Koutoupis Assignee: Petros Koutoupis
Resolution: Fixed Votes: 0
Labels: None

Severity: 2
Rank (Obsolete): 9223372036854775807

 Description   

On 2.12, we observe a 30 GiB/s throughput, while on 2.15, we see 25 GiB/s.

Both perf captures and ftrace data (with the help of git blame) have led me down the path to a root cause:

commit d0337cab8e845efcdbfb9e26e573feb18f28e303
Author: Mr NeilBrown <neilb@suse.de>
Date:   Wed Dec 9 13:00:16 2020 +1100
 
    LU-14195 osd: don't use set_fs() for ->fiemap() calls.
 
    ->fiemap() only accesses kernel-space data, so does not need, and
    never has needed, set_fs() calls.
    In Linux 5.10, these calls are deprecated.
    So remove the unnecessary code.

If I revert the above commit for LU-14195, the obdfilter-survey rewrite throughput is immediately restored.

Reverting this commit, performance is actually a bit improved to 31 GiB/s.

Note - Example obdfilter-survey command:

 tests_str="write rewrite" nobjlo=1 nobjhi=1 thrlo=1024 thrhi=1024 rszlo=4096 rszhi=4096 size=524288 obdfilter-survey

After speaking with Alexander Boyko about this exact issue and the impact it will have once we move to a post 5.10 Linux kernel:

Petros Koutoupis  8:06 AM
Hello. I do have a question though. Does it make sense to you how d0337cab8 would impact the obdflilter-survey rewrites (by that much too)?

Alexander Boyko  8:18 AM
yeap I had a discussion with team. obdfilter executes from ioctl and doesnot have KERNEL_DS

Petros Koutoupis  8:18 AM
I ask only because in 5.10 set_fs is deprecated
ok

Alexander Boyko  8:20 AM
so without a revert it fall to copy_from_user, it returns EFAULT and fiemap handle this as worst case
this should not affect general Lustre IO 

More comments from Alexander:

Unfortunately the LU-14195 patch has a more serious impact on whole Lustre, not only obdfilter-survey. Yesterday I discussed it with Andrew Perepechko, the current idea is

 ldiskfs_fiemap..()->..->copy_to_user() 

returns EFAULT. EFAULT leads to unmapped block logic, so Lustre would call block allocator, calculate grants, call quota logic for overwrite. All Lustre ioctl requests are affected. There is no simple way for fixing it at new kernels, since all parts with set_fs() were changed to ITER_KVEC. But using set_fs() for fiemap is only Lustre trick, and internal fiemap logic does not support iterator. I think LU-14195 should be reverted until 5.x kernel.



 Comments   
Comment by Gerrit Updater [ 18/Nov/22 ]

"Shaun Tancheff <shaun.tancheff@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49190
Subject: LU-16321 osd: Allow fiemap on user and kernel buffers
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 888f48ba31996d7183f5a0cce296198ad09fabcc

Comment by Gerrit Updater [ 07/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49190/
Subject: LU-16321 osd: Allow fiemap on kernel buffers
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5cd5a49c7213315d83bca54a2eee97ba57d256cc

Comment by Peter Jones [ 07/Jan/23 ]

Landed for 2.16

Generated at Sat Feb 10 03:25:58 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.