[LU-14639] confusion of lru_size=0 if lru-resize disabled Created: 24/Apr/21  Updated: 30/May/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Shuichi Ihara Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

master (commit: gdfe87b0)


Issue Links:
Related
is related to LU-11077 Client-specific tunable parameter con... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

There is a configure option to enable/disable lru-resize (--enable-lru-resize /
--disable-lru-resize), but it causes confusion when lru_size is set to 0.
In both cases lru_size can be set to 0, but the resulting behavior is different.

Also, if lru-resize was disabled when Lustre was built (--disable-lru-resize), it is not possible to re-enable lru-resize at runtime, even by setting lru_size to 0.

./configure --enable-lru-resize

[root@sky06 ~]# mount -t lustre 10.0.11.110@o2ib10:/ai7990 /ai7990
mount -t lustre 10.0.11.110@o2ib10:/ai7990 /ai7990
mount.lustre: according to /etc/mtab 10.0.11.110@o2ib10:/ai7990 is already mounted on /ai7990
[root@sky06 ~]# lctl get_param ldlm.*.*.lru_size
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=8800
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8f5408893800.lru_size=1
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-OST0000-osc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-OST0001-osc-ffff8f5408893800.lru_size=0
[root@sky06 ~]# lctl set_param ldlm.*.*.lru_size=0
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=0
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-OST0000-osc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-OST0001-osc-ffff8f5408893800.lru_size=0
[root@sky06 ~]# time find /ai7990/testdir > /dev/null 2>&1

real	0m15.183s
user	0m0.367s
sys	0m6.587s
[root@sky06 ~]# lctl get_param ldlm.*.*.lru_size
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=0
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8f5408893800.lru_size=30502
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8f5408893800.lru_size=0
ldlm.namespaces.ai7990-OST0000-osc-ffff8f5408893800.lru_size=2499
ldlm.namespaces.ai7990-OST0001-osc-ffff8f5408893800.lru_size=2501

./configure --disable-lru-resize

[root@sky06 ~]# mount -t lustre 10.0.11.110@o2ib10:/ai7990 /ai7990
[root@sky06 ~]# lctl get_param ldlm.*.*.lru_size
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=8800
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8ef7ae5ef000.lru_size=8800
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8ef7ae5ef000.lru_size=8800
ldlm.namespaces.ai7990-OST0000-osc-ffff8ef7ae5ef000.lru_size=8800
ldlm.namespaces.ai7990-OST0001-osc-ffff8ef7ae5ef000.lru_size=8800
[root@sky06 ~]# lctl set_param ldlm.*.*.lru_size=0
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=0
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-OST0000-osc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-OST0001-osc-ffff8ef7ae5ef000.lru_size=0
[root@sky06 ~]# time find /ai7990/testdir > /dev/null 2>&1
 
real	0m26.491s
user	0m0.358s
sys	0m11.809s
[root@sky06 ~]# lctl get_param ldlm.*.*.lru_size
ldlm.namespaces.MGC10.0.11.110@o2ib10.lru_size=0
ldlm.namespaces.ai7990-MDT0000-mdc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-MDT0001-mdc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-OST0000-osc-ffff8ef7ae5ef000.lru_size=0
ldlm.namespaces.ai7990-OST0001-osc-ffff8ef7ae5ef000.lru_size=0
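
For reference, a rough way to tell the two modes apart on a running client, based only on the sessions above (a sketch using the example paths from this report, not an official procedure):

# set lru_size=0 on all namespaces, as above
lctl set_param ldlm.*.*.lru_size=0
# drive some metadata load, e.g. the same find as above
find /ai7990/testdir > /dev/null 2>&1
# with --enable-lru-resize the counts grow back (dynamic LRU);
# with --disable-lru-resize they stay at 0 (no locks are cached)
lctl get_param ldlm.*.*.lru_size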


 Comments   
Comment by Andreas Dilger [ 29/Mar/23 ]

What is expected to be the right behavior here? --disable-lru-resize removed the LRU resize code completely, so only fixed-size LRU is possible. Then setting lru_size=0 results in no locks being cached on the clients.

Why even build with --disable-lru-resize these days, instead of just setting "lctl set_param -P ldlm.namespaces.<fsname>*.lru_size=500" or similar?

The only option I see is to change --disable-lru-resize to not actually disable the LRU resize code, and instead have it just set a fixed LRU size by default to prevent users from shooting themselves in the foot because they are using old instructions when building clients. I don't think this is documented anywhere, but if it is then it should be removed.
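
For reference, a minimal sketch of the runtime alternative suggested above (run on the MGS; the filesystem name ai7990 and the value 500 are taken from this ticket only as examples):

# on the MGS: persistently cap the lock LRU for all clients of this fs
lctl set_param -P ldlm.namespaces.ai7990*.lru_size=500
# on a client: verify the value that was applied
lctl get_param ldlm.namespaces.ai7990-*.lru_size

On a build with LRU resize enabled, setting a non-zero lru_size pins the namespace to a fixed LRU size, while setting it back to 0 returns it to dynamic sizing, as the sessions above show.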

Comment by Shuichi Ihara [ 29/Mar/23 ]

What is expected to be the right behavior here? --disable-lru-resize removed the LRU resize code completely, so only fixed-size LRU is possible. Then setting lru_size=0 results in no locks being cached on the clients.

The problem is that lru_size=0 can be set regardless of whether the client was built with --disable-lru-resize or --enable-lru-resize, but after setting lru_size=0 there is no way to tell which mode the client is in. Even with a non-zero value it is hard to confirm; it can only be confirmed by unmounting and remounting Lustre and checking the default lru_size (zero or non-zero).
There are two totally different behaviors, but they are controlled through the same parameter and value.

If the client was built with --disable-lru-resize, perhaps setting lru_size=0 should not be accepted, and a different value should be defined to mean "cache no locks"? e.g. lru_size=false

Why even build with --disable-lru-resize these days, instead of just setting "lctl set_param -P ldlm.namespaces.<fsname>*.lru_size=500" or similar?

This is fine, but it still requires a per-client setting, and the client UUID changes every time it mounts, doesn't it? For example, we may want to limit lru_size only on login or data mover nodes, etc.
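
One possible workaround today (a hedged sketch, not an implemented feature) is a client-local post-mount step on just those nodes, using a wildcard so the changing UUID suffix in the namespace name does not matter:

#!/bin/sh
# hypothetical post-mount snippet, deployed only on login/data-mover nodes;
# the trailing wildcard covers the per-mount UUID part of the namespace name
mount -t lustre 10.0.11.110@o2ib10:/ai7990 /ai7990
lctl set_param ldlm.namespaces.ai7990-*.lru_size=500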

Comment by Andreas Dilger [ 29/Mar/23 ]

There was a discussion about having per-client tunables linked to nodemap in LU-11077, or possibly a client-local /etc/lustre/<fsname>-client.params file to set parameters at mount time. However, that has not been implemented.

Shuichi, is the main goal of using --disable-lru-resize to have a different/static LRU size on a small number of clients (e.g. login nodes or data movers), or is it used for all clients in a cluster? Or is there some other issue with LRU resize that means it should be disabled entirely in the code (e.g. jitter on compute nodes), or other reasons to disable the code completely? Is this option widely used for all client builds, or only in specific cases?

I'm wondering if the meaning of the --disable-lru-resize option should be changed from removing the LRU resize code to just changing it to have a constant lru_size value? Is there really a time when "lru_size=0" should mean "cache zero locks" (which would be terrible for performance, as you see here)? If a client should minimize lock cache size, I can't imagine that lru_size=5 or similar would cause many issues, and would at least still allow a few files to re-use locks on the client...

Comment by Shuichi Ihara [ 06/Apr/23 ]

In many cases, the reason for '--disable-lru-resize' is to put a limit on the number of locks cached per client across the entire cluster.
People also clean up the caches after a job finishes (e.g. by running 'lctl set_param ldlm.namespaces.*.lru_size=clear' as a post-job script integrated into the job scheduler).

I'm wondering if the meaning of the --disable-lru-resize option should be changed from removing the LRU resize code to just changing it to have a constant lru_size value?

Indeed, keeping the LRU code enabled but with a fixed value by default makes sense.
It would still be possible to change it to 0, or to a lower/higher value, if needed. We can also control how quickly locks age out with the lru_max_age parameter, as in the sketch below.
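
For illustration, a minimal sketch of such a post-job step, using only the commands mentioned above (the lru_max_age line is left as a placeholder since its value and units depend on the Lustre version):

#!/bin/sh
# hypothetical job-scheduler epilog: drop this client's cached LDLM locks
# once the job has finished, as described above
lctl set_param ldlm.namespaces.*.lru_size=clear
# optionally tune how quickly unused locks age out:
# lctl set_param ldlm.namespaces.*.lru_max_age=<value>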

Comment by Patrick Farrell [ 30/May/23 ]

It would be nice to drop the configure option entirely - I don't think changing defaults as a build-time option is very good practice; it should be done at runtime with settings unless there's a reason that doesn't work. It's weird to be able to change something with both build configuration and runtime options, and I think it would be nice to get away from changing behavior with build-time flags when that behavior can also be adjusted at runtime.

I'm guessing though that since we have customers who are using this build option, it would be easier to just change it to set a default value, right?

Comment by Patrick Farrell [ 30/May/23 ]

Actually, I was just thinking about this, and:

I don't think changing the build flag to set a default is a very good idea.  A good default value seems very hard to pick.  Like, what is a good default value for lru_size?  What represents a good compromise between memory usage and performance?  Etc.  The 'correct' value depends on whether it is an MDC or an OSC connection, possibly changing if DOM is in use, and on client and server memory size, etc.  We have lru-resize specifically because the correct value is hard to choose - it is very system specific - so instead we choose the value dynamically.

So I think instead we should encourage people who want to set a specific lru_size to 'do the right thing' by removing the build option.  They will then set the lru_size value they want in their configuration.  This is what the customers using the build option are doing anyway - they built with disable-lru-resize out of caution, but they are all manually setting specific lru_size values.  So the build option is never used by itself anyway.

Comment by Gerrit Updater [ 30/May/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51165
Subject: LU-14639 build: Remove disable-lru-resize config
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3f457d586f8b01fd1f69e4d761f216535bacb4ca

Comment by Gerrit Updater [ 30/May/23 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51166
Subject: LU-14639 tests: remove disable-lru-resize check
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9e8e99f4f06bdb8bf5610ee7060bd6925c5e8c0a
