[LU-8276] Make lru clear always discard read lock pages Created: 14/Jun/16  Updated: 24/Oct/17  Resolved: 28/Aug/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.11.0, Lustre 2.10.2

Type: Improvement Priority: Minor
Reporter: Patrick Farrell (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Related
is related to LU-7802 set_param lru_size fails with 'error:... Resolved

 Description   

A significant amount of time is sometimes spent during
lru clearing (i.e., echo 'clear' > lru_size) checking
pages to see if they are covered by another read lock.
Since all unused read locks will be destroyed by this
operation, the pages will be freed momentarily anyway,
so this check is largely a waste of time.

(Time is spent specifically in ldlm_lock_match, trying to
match these locks.)

So, in the case of echo 'clear' > lru_size, we should not
check for other covering read locks before attempting to
discard pages.

We do this by using the LDLM_FL_DISCARD_DATA flag, which is
currently used for special cases where you want to destroy
the dirty pages under a write lock rather than write them
out.
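
As a rough illustration of how that flag is consumed, here is a
minimal C sketch, assuming a hypothetical example_flush_or_discard()
helper standing in for the real flush code; struct ldlm_lock, l_flags
and LDLM_FL_DISCARD_DATA are real Lustre names, the rest is
illustrative only:

#include <lustre_dlm.h>  /* struct ldlm_lock, LDLM_FL_DISCARD_DATA */

/* Hypothetical stand-in for the real page flush/discard code. */
static int example_flush_or_discard(struct ldlm_lock *lock, bool discard);

/*
 * Hedged sketch (not verbatim Lustre source): when a lock is being
 * cancelled, LDLM_FL_DISCARD_DATA selects between writing dirty
 * pages back and simply throwing them away.
 */
static int example_cancel_flush(struct ldlm_lock *lock)
{
        /* Flag set: discard the covered pages; flag clear: write
         * dirty pages out before the lock goes away. */
        bool discard = !!(lock->l_flags & LDLM_FL_DISCARD_DATA);

        return example_flush_or_discard(lock, discard);
}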

We set this flag on all the PR locks which are slated for
cancellation by ldlm_prepare_lru_list (when it is called
from ldlm_ns_drop_cache).
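
A minimal sketch of that marking step follows; LCK_PR, l_granted_mode,
lock_res_and_lock() and LDLM_FL_DISCARD_DATA are real Lustre names,
while the helper itself and the "dropping the whole LRU" condition are
illustrative rather than the patch itself:

#include <lustre_dlm.h>  /* struct ldlm_lock, LCK_PR, LDLM_FL_DISCARD_DATA */

/*
 * Hedged sketch (not the actual patch): mark an unused PR lock that
 * the LRU walk is about to cancel so that its pages are discarded
 * instead of being matched against other locks.
 */
static void example_mark_pr_lock(struct ldlm_lock *lock,
                                 bool dropping_whole_lru)
{
        if (!dropping_whole_lru || lock->l_granted_mode != LCK_PR)
                return;

        lock_res_and_lock(lock);
        lock->l_flags |= LDLM_FL_DISCARD_DATA;
        unlock_res_and_lock(lock);
}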

The case where another lock does cover those pages (and is
in use and so does not get cancelled) is safe for a few
reasons:

1. When discarding pages, we wait (discard_cb->cl_page_own)
until they are in the cached state before invalidating.
So if they are actively in use, we'll wait until that use
is done. (A sketch of this ordering appears after this
list.)

2. Removal of pages under a read lock is something that can
happen due to memory pressure, since these are VFS cache
pages. If a client reads something which is then removed
from the cache and goes to read it again, this will simply
generate a new read request.

This has a performance cost for that reader, but anyone
clearing the ldlm lru while actively doing I/O in that
namespace cannot expect good performance anyway.
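
To make the ordering in point 1 concrete, here is a hedged sketch of a
per-page discard step; cl_page_own(), cl_page_discard() and
cl_page_disown() are real cl_page interfaces, but this wrapper is an
illustration rather than the actual osc discard callback:

#include <cl_object.h>  /* struct lu_env, struct cl_io, struct cl_page */

/*
 * Hedged sketch: ownership is only obtained once the page is back in
 * the cached state, so a page that is actively in use is waited for
 * rather than yanked out from under its user.
 */
static void example_discard_one_page(const struct lu_env *env,
                                     struct cl_io *io,
                                     struct cl_page *page)
{
        if (cl_page_own(env, io, page) == 0) {
                cl_page_discard(env, io, page);  /* invalidate it */
                cl_page_disown(env, io, page);
        }
}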

In the case of many read locks on a single resource, this
improves cleanup time dramatically. In internal testing at
Cray, using unusual read/write I/O patterns to create
~80,000 read locks on a single file, cleanup time improved
from ~60 seconds to ~0.5 seconds. This also slightly
improves cleanup speed in the more common case of one or
a few read locks on a file.



 Comments   
Comment by Gerrit Updater [ 14/Jun/16 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: http://review.whamcloud.com/20785
Subject: LU-8276 ldlm: Make lru clear always discard read lock pages
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2c274dacff16370b5fc0b4d587ead14afcc6c17a

Comment by Gerrit Updater [ 28/Aug/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/20785/
Subject: LU-8276 ldlm: Make lru clear always discard read lock pages
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6a369b59f3729513dd8e81c4964dc6183287b601

Comment by Peter Jones [ 28/Aug/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 29/Sep/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29264
Subject: LU-8276 ldlm: Make lru clear always discard read lock pages
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: c2f5f56a9e90146cf82b09729c47c1e3c1f3eb7b

Comment by Patrick Farrell (Inactive) [ 29/Sep/17 ]

Minh,

I'm curious why this one was targeted for stable? It's a performance improvement rather than, say, a bug fix.

Just wondering, thanks.

Comment by Minh Diep [ 03/Oct/17 ]

paf, I believe it's a dependency for a patch on top (I don't recall the number). We'll take another look soon.

Comment by Gerrit Updater [ 24/Oct/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29264/
Subject: LU-8276 ldlm: Make lru clear always discard read lock pages
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: d5c583291498d26f5f5634b8f3463bbfe7109f1e
