[LU-11199] mdsrate open() performance degradation Created: 02/Aug/18  Updated: 13/Oct/18  Resolved: 13/Oct/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Major
Reporter: Patrick Farrell (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10269 Fixes for selective trybits Resolved
is related to LU-10957 Return more lock bits from MDS for op... Open
is related to LU-10948 client cache open lock after N opens Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

A change in https://review.whamcloud.com/#/c/30246/

(LU-10269 ldlm: fix the issues introduced by try bits)

introduced a major regression in open() performance, observed so far in at least the mdsrate open() benchmark:

On a real system, this mdsrate open benchmark drops from 35K opens/second to around 11K.
The run is 64 processes doing 10K opens each, with every open targeting a random file from among 300000 existing files (created earlier by mdsrate):
aprun -n 64 /usr/lib64/lustre/tests/mdsrate -d /mnt/lustre/mdsrate --open --iters 10000 --nfile=300000
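
For anyone wanting to approximate this access pattern without MPI, here is a minimal single-process sketch. It is not mdsrate and will not reproduce its numbers; the function name and the `f{}` file naming scheme are hypothetical, so adjust `name_fmt` to whatever files actually exist in the test directory.

```python
import os
import random
import time


def random_open_rate(directory, nfile, iters, name_fmt="f{}", seed=None):
    """Open `iters` randomly chosen existing files; return opens/second.

    `name_fmt` is a hypothetical naming scheme (an assumption, not
    necessarily what mdsrate uses); files 0..nfile-1 must already exist.
    """
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(iters):
        # Pick one of the nfile existing files uniformly at random.
        name = name_fmt.format(rng.randrange(nfile))
        fd = os.open(os.path.join(directory, name), os.O_RDONLY)
        os.close(fd)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very short runs.
    return iters / max(elapsed, 1e-9)
```

Running one copy per client process, before and after the change, should be enough to see whether the open rate moves in the same direction as reported here.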

On a much smaller VM, I see a drop from 8K opens/second to 4K with this benchmark.
The run is 4 processes doing 8K opens each, with files randomly selected from among 30000 existing files:
mpirun -n 4 /usr/lib64/lustre/tests/mdsrate -d /mnt/lustre/mdsrate --open --iters 8000 --nfile=30000
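
To put the two results side by side, the arithmetic can be sketched as below. The rates are the approximate figures quoted above; treating them as aggregate rates across all processes (and deriving run times from that) is my assumption, and the `regression` helper is just for illustration.

```python
def regression(before, after):
    """Return (fractional drop, slowdown factor) for a throughput change."""
    return (before - after) / before, before / after


# (opens/sec before, opens/sec after, total opens = processes * iters)
runs = {
    "real system, 64 processes": (35_000, 11_000, 64 * 10_000),
    "small VM, 4 processes": (8_000, 4_000, 4 * 8_000),
}

for name, (before, after, total_opens) in runs.items():
    drop, factor = regression(before, after)
    # Run times assume the quoted rate is the aggregate over all processes.
    print(
        f"{name}: {drop:.0%} drop, {factor:.1f}x slower, "
        f"{total_opens / before:.1f}s -> {total_opens / after:.1f}s "
        f"for {total_opens} opens"
    )
```

That works out to roughly a 69% drop (about 3.2x slower) on the real system and a 50% drop (2x slower) on the VM.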

The specific change responsible is that the MDT no longer attempts to grant the LOOKUP lock on opens.

Oleg has a patch in flight that reverses this change but also includes a number of other attempted optimizations:
https://review.whamcloud.com/#/c/32156/

Andreas suggested I push the simpler revert patch on its own, so that it can land quickly and Oleg's further optimizations can be considered separately.



 Comments   
Comment by Gerrit Updater [ 02/Aug/18 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/32929
Subject: LU-11199 mdt: Attempt lookup lock on open
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3061c2836db7f917a0dc394fba9cc3fe32b0d4c9

Comment by Gerrit Updater [ 12/Oct/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32929/
Subject: LU-11199 mdt: Attempt lookup lock on open
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 8b9105d8281f10b500d47a00458631a586c7f1d4

Comment by Peter Jones [ 13/Oct/18 ]

Landed for 2.12
