Data-on-MDT phase II (LU-10176)

[LU-11428] Writeback on close for DoM Created: 25/Sep/18  Updated: 20/Oct/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Technical task Priority: Minor
Reporter: Jinshan Xiong Assignee: Mikhail Pershin
Resolution: Unresolved Votes: 0
Labels: DoM2

Issue Links:
Related
is related to LU-12325 Downgrade lock mode for DOM files whe... Open

 Description   

Writeback on close piggybacks the dirty data of a DoM file on the close RPC when that data fits into the inline buffer. This should significantly improve write performance for small files.
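As a rough illustration of the check described above, here is a minimal sketch in C. It is not Lustre code: struct dom_close_req, dom_close_pack() and the DOM_INLINE_MAX value are hypothetical names and numbers chosen for the example, and the real inline buffer size is not stated in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical inline-buffer limit; the ticket does not name a value. */
#define DOM_INLINE_MAX 4096

/* Hypothetical close request; not a real Lustre structure. */
struct dom_close_req {
    size_t inline_len;                 /* bytes of piggybacked data, 0 if none */
    char   inline_buf[DOM_INLINE_MAX]; /* dirty data carried by the close RPC */
};

/*
 * Try to pack the dirty data into the close request.
 * Returns 1 if the close RPC alone is sufficient (data inlined, or
 * nothing was dirty), 0 if the caller must fall back to a regular
 * write RPC before sending the close.
 */
int dom_close_pack(struct dom_close_req *req,
                   const char *dirty, size_t dirty_len)
{
    assert(req != NULL);
    req->inline_len = 0;

    if (dirty_len == 0)
        return 1;                      /* nothing dirty: plain close */
    if (dirty == NULL || dirty_len > DOM_INLINE_MAX)
        return 0;                      /* does not fit: separate write needed */

    memcpy(req->inline_buf, dirty, dirty_len);
    req->inline_len = dirty_len;
    return 1;
}
```

The fallback path matters here: data larger than the inline buffer still goes through the normal write path, so the optimisation only changes behaviour for small files.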

I have had this idea before, and I ran into this problem again in my recent testing. Writeback to small files is really slow, and no matter how high I set max_rpcs_in_flight for the mdc, it simply maxes out. Small RPCs are expensive. An alternative would be a compound RPC that merges those small RPCs, but that would introduce more issues. Writeback on close is the easier solution.



 Comments   
Comment by Mikhail Pershin [ 27/Sep/18 ]

Yes, this optimisation is on my list too, though I haven't thought about the details so far. An inline data buffer on close looks easier than read-on-open because we can allocate a buffer of the needed size, up to a reasonable maximum. The problem will probably be preparing the buffer on the client. Do you have an idea how to combine that with CLIO?

Comment by Joseph Gmitter (Inactive) [ 03/Oct/18 ]

Jinshan,

Any thoughts on the above?

Thanks.

Joe

Comment by Jinshan Xiong [ 04/Oct/18 ]

That's pretty much what OSC is doing right now. In the close handling at the MDC layer, it will call routines like cl_page_make_ready() to clear the page's Dirty bit and put the page into writeback state. After the close RPC completes, it will clear the page's writeback bit.
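The page state transitions described above could be modelled roughly as follows. This is a toy model only: struct model_page, the PGF_* flags and the model_* functions are hypothetical stand-ins, not the real CLIO interfaces, and it covers only the success path mentioned in the comment.

```c
#include <stddef.h>

/* Hypothetical flag bits standing in for the page Dirty/Writeback bits. */
#define PGF_DIRTY     0x1u
#define PGF_WRITEBACK 0x2u

/* Hypothetical page; not a real cl_page. */
struct model_page {
    unsigned flags;
};

/*
 * Stand-in for what a routine like cl_page_make_ready() is described
 * as doing: clear Dirty, set Writeback. Returns 1 if the page was
 * dirty and is now under writeback, 0 if it was left untouched.
 */
int model_make_ready(struct model_page *pg)
{
    if (!(pg->flags & PGF_DIRTY))
        return 0;
    pg->flags &= ~PGF_DIRTY;
    pg->flags |= PGF_WRITEBACK;
    return 1;
}

/* Before sending the close RPC: move every dirty page to writeback.
 * Returns the number of pages that will travel with the close. */
size_t model_close_prepare(struct model_page *pages, size_t n)
{
    size_t ready = 0;
    for (size_t i = 0; i < n; i++)
        ready += (size_t)model_make_ready(&pages[i]);
    return ready;
}

/* After the close RPC completes: clear the Writeback bit. */
void model_close_done(struct model_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pages[i].flags &= ~PGF_WRITEBACK;
}
```

What should happen to the pages if the close RPC fails is not covered in this thread, so the model leaves that case out.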

I'm not aware of anything special about this, but I'm pretty sure some issues will come up when this is implemented. We can discuss further at that time.

Comment by Patrick Farrell (Inactive) [ 20/Feb/19 ]

So this would cut the RPC count in half and should reduce the RPC processing time for the write, since it's inline rather than RDMA, but how big will this effect be relative to the writing itself? It sounds like in your testing, Jinshan, you were unable to keep the MDS busy because of rpc_in_flight limits.

It seems like we should do this, but also raise the RPC in flight limit, unless the MDS CPU/disk was fully busy (which it sounds like it wasn't). And it sounds like raising the RPC in flight limit for the MDS might be a cheap win here.
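To make the "half" estimate concrete, here is a small back-of-the-envelope sketch. It assumes that only the write and close RPCs are counted per file (open and any other RPCs are ignored), which is an assumption of this example rather than something measured in the ticket; the function names are hypothetical.

```c
#include <stddef.h>

/* Without writeback on close: one write RPC plus one close RPC per file. */
size_t rpcs_without_wb_on_close(size_t nfiles)
{
    return 2 * nfiles;
}

/*
 * With writeback on close: files whose dirty data fits the inline
 * buffer need only the close RPC; the rest still need write + close.
 * n_inline is clamped so it can never exceed nfiles.
 */
size_t rpcs_with_wb_on_close(size_t nfiles, size_t n_inline)
{
    if (n_inline > nfiles)
        n_inline = nfiles;
    return n_inline + 2 * (nfiles - n_inline);
}
```

Under these assumptions, 1000 small files that all fit inline go from 2000 RPCs to 1000, while a workload where only a quarter of the files fit would drop from 2000 to 1750.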
