[LU-14520] shrink ldlm_lock to fit in 512 bytes Created: 13/Mar/21  Updated: 16/Jul/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Patch https://review.whamcloud.com/39811 removes l_lock from struct ldlm_lock, which also eliminates the 4-byte hole following l_lock, giving an 8-byte size reduction down to 544 bytes. It would be nice if we could find another 32 bytes of savings so that each lock could fit into a 512-byte allocation.

While there isn't yet a clear path to the full reduction, there are a number of fields that are larger than they need to be and could be shrunk:

  • l_bl_ast_run looks like it could be a single bit (4 bytes if packed somewhere else)
  • l_lvb_len looks like it could fit into a __u16, since it is limited by the maximum layout size, itself capped by XATTR_SIZE_MAX less a small amount for the xattr header (2 bytes)
  • l_lvb_type only needs 4 bits (4 bytes if packed somewhere else)
  • l_req_mode and l_granted_mode basically never change on a lock, and could fit into 8 bits by declaring the enum with a field width :8 (6 bytes)
  • l_readers and l_writers are mostly accessed as booleans, and while they still need to be counters I don't think we'll ever have 4B threads in the kernel accessing the same lock at one time. These could easily be shrunk to 24 bits (16M concurrent threads) or even a bit smaller to hold the few other bitfields (I'm not sure I'd be comfortable with only 64K threads since this might happen in a big NUMA machine), with l_req_mode and l_granted_mode put into the high bits. (2 bytes)

That is 18 bytes, so if we could also find a couple of the many {{list_head}}s that could be shared, it would be enough. That would save many MB of RAM on the servers. A rough sketch of the repacked fields follows.
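Purely to illustrate the packing idea, a minimal sketch along the lines of the list above (the struct name and exact field placement are assumptions, and it assumes enum ldlm_mode values pack cleanly into an 8-bit field; this is not a worked patch):

{code:c}
/*
 * Hypothetical repacking sketch, not an actual patch.  Widths follow the
 * per-field estimates above; whether the compiler packs the enum and
 * integer bitfields into the same 32-bit words is ABI-dependent.
 */
struct ldlm_lock_packed_tail {
	__u32		l_readers:24;	/* 16M concurrent threads is plenty */
	enum ldlm_mode	l_req_mode:8;	/* modes fit in 8 bits */
	__u32		l_writers:24;
	enum ldlm_mode	l_granted_mode:8;
	__u16		l_lvb_len;	/* layout size is well below 64KiB */
	__u8		l_lvb_type:4;	/* only a handful of LVB types */
	__u8		l_bl_ast_run:1;	/* was a full int used as a boolean */
};
{code}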



 Comments   
Comment by Neil Brown [ 14/Mar/21 ]

h_rcu could be moved out of portals_handle, and unioned with something elsewhere that is completely irrelevant when the structure is being freed.
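A minimal sketch of what that overlay could look like (the field chosen to share storage is purely a placeholder; deciding which member is genuinely dead by free time needs a real audit):

{code:c}
/*
 * Sketch only: keep the rcu_head in ldlm_lock rather than portals_handle,
 * overlaid with a field that is no longer used once the lock is being
 * torn down.  l_exp_hash is just a placeholder here.
 */
struct ldlm_lock {
	/* ... */
	union {
		struct hlist_node	l_exp_hash;	/* only used while live */
		struct rcu_head		l_rcu;		/* only used at free time */
	};
	/* ... */
};
{code}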

l_waitq could be discarded and 'wait_var_event()' used instead.  If there are often lots (thousands?) of ldlm_locks all with active wait queues at the same time, this might not be a good idea.
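A minimal sketch of that conversion, assuming the waiters' condition is the existing is_granted_or_cancelled() check (that, and using l_flags as the wait address, are assumptions):

{code:c}
/*
 * Sketch: drop wait_queue_head_t l_waitq and wait on the lock's flags
 * word instead.  wait_var_event()/wake_up_var() hash the address into a
 * shared table, so no per-lock storage is needed.
 */
#include <linux/wait_bit.h>

static void ldlm_lock_wait_granted(struct ldlm_lock *lock)
{
	/* was: wait_event(lock->l_waitq, is_granted_or_cancelled(lock)); */
	wait_var_event(&lock->l_flags, is_granted_or_cancelled(lock));
}

static void ldlm_lock_notify_waiters(struct ldlm_lock *lock)
{
	/* was: wake_up(&lock->l_waitq); */
	wake_up_var(&lock->l_flags);
}
{code}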

There seem to be some fields only used for bit locks (l_sl_mode, l_client_cookie) and some only used for extent locks (l_req_extent).  These could be unioned together.
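Something along these lines, perhaps (the member grouping and types are inferred from the field names above and would need auditing; this is only a sketch):

{code:c}
/*
 * Sketch: fields that are only meaningful for one resource type could
 * share storage.  Exact membership would need checking.
 */
union {
	struct {
		struct list_head	l_sl_mode;	/* bit/plain locks */
		__u64			l_client_cookie;
	} l_bits_only;
	struct {
		struct ldlm_extent	l_req_extent;	/* extent locks */
	} l_extent_only;
};
{code}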

 

And as you say, there are lots of list_heads...

 

Comment by Andreas Dilger [ 16/Jul/21 ]

On my current test system (2.14.51_86_gd656423, EL7) the ldlm_lock slab is using 576 bytes per object (even though struct ldlm_lock is 544 bytes), with 28 locks in a 4-page slab. I'm assuming the extra 32 bytes per object is internal slab overhead/debug... We don't strictly need to shrink all the way down to 512 bytes/lock, it would be enough to remove just a few bytes per lock to allow this to have 29 or 30 locks per slab.
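For reference, a quick locks-per-slab calculation, assuming 4KiB pages and that the 576 bytes already include any per-object overhead:

{code:c}
/* Rough locks-per-slab arithmetic for a 4-page (16384-byte) slab. */
#include <stdio.h>

int main(void)
{
	const unsigned int slab_bytes = 4 * 4096;
	unsigned int obj_size;

	/* 576 -> 28, 560 -> 29, 544 -> 30, 528 -> 31, 512 -> 32 */
	for (obj_size = 512; obj_size <= 576; obj_size += 16)
		printf("%u bytes/lock -> %u locks/slab\n",
		       obj_size, slab_bytes / obj_size);
	return 0;
}
{code}

So trimming the effective object size from 576 down to 560 or 544 bytes would already give 29 or 30 locks per slab.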
