[LU-11134] exclusive layout lock Created: 10/Jul/18  Updated: 03/Aug/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Alex Zhuravlev Assignee: Alex Zhuravlev
Resolution: Unresolved Votes: 0
Labels: SFIO

Rank (Obsolete): 9223372036854775807

 Description   

To write to OST objects, the client must first lock them with extent locks. Acquiring those locks takes synchronous RPCs, which slow down workloads like untar, where a relatively small amount of data needs several sync RPCs (metadata, locks, etc.).

The client can instead be given exclusive access to the layout information, making OST locks optional. This can improve small-file performance.

 



 Comments   
Comment by Oleg Drokin [ 10/Jul/18 ]

It's important to note that while the layout could be locked on the metadata server we still need to talk to OSTs to cancel out all the currently outstanding OST locks before we are sure the exclusive layout lock is truly exclusive.

Comment by Alex Zhuravlev [ 10/Jul/18 ]

The primary purpose is new files, where a new layout (or some of its components) has just been created.

 

Comment by Oleg Drokin [ 10/Jul/18 ]

Unless you obtain such a lock at file creation time, you can never be sure there's no race with some other process taking an extent lock, so you would always need to check.

Comment by Andreas Dilger [ 16/Jul/18 ]

How does this differ from the DoM files taking the full-file lock via IBITS lock? It would be good to have a consistent mechanism for doing this between DoM files and regular files.

Comment by Alex Zhuravlev [ 02/Aug/18 ]

adilger, I discussed this with Mike a bit. It sounds like we could actually reuse that IBITS lock (taken as PW) for a few purposes:
1) get exclusive access to [0; EOF] for newly created files and avoid OST locking (a synchronous RPC)

2) cache some trivial changes like UID/GID/time and write them back on close (or upon lock cancellation) to save the chmod/chown that tar applies to every file (sometimes twice)
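
The second point can be illustrated with a toy write-back cache. This is a hypothetical sketch, not Lustre client code: `AttrCache`, `send_setattr` and the RPC list are invented for illustration, and it assumes the client holds a lock covering the attributes for as long as the cache is dirty.

```python
# Toy model of caching attribute changes under an exclusive lock.
# All names here are hypothetical; this is not Lustre client code.


class AttrCache:
    def __init__(self, send_setattr):
        self._send = send_setattr  # callable that performs one setattr RPC
        self._dirty = {}           # pending attribute changes, coalesced by name

    def setattr(self, **attrs):
        """Record changes locally; no RPC is sent while the lock is held."""
        self._dirty.update(attrs)

    def flush(self):
        """Write back all pending changes in a single RPC, if there are any."""
        if self._dirty:
            self._send(dict(self._dirty))
            self._dirty.clear()

    # Both events end the period in which cached changes are safe to hold.
    close = flush
    on_lock_cancel = flush


if __name__ == "__main__":
    rpcs = []
    cache = AttrCache(rpcs.append)
    cache.setattr(uid=1000, gid=1000)            # chown applied by tar
    cache.setattr(mode=0o644)                    # chmod
    cache.setattr(mode=0o644, mtime=1531180800)  # repeated chmod plus a time update
    cache.close()
    print(len(rpcs), "setattr RPC(s) sent")
```

In this model the three `setattr` calls collapse into one write-back at close, which is the saving the proposal is after.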

 

Comment by Oleg Drokin [ 03/Aug/18 ]

Don't use PW, use EX. The reason is that a CR lock is compatible with PW (I stepped on this while working on WBC).

Also, I wonder whether the layout lock would be better for DoM content locking too? (Was this already discussed somewhere and I missed it?)

To cache uid/gid/..., take the appropriate bits (also as EX).

All in all, it sounds like you want to get an EX lock on all/most of the bits at create time and cache a bunch of stuff under it without talking back to the server. As conflicts arise, you might want to reduce the bits and flush the data in progress (or downgrade the lock mode, but you'd need to do that on all bits at the same time).
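
The "reduce the bits and flush" idea could look roughly like the sketch below. It is a hypothetical model only: `ExLock`, the `flushers` mapping and the bit values are assumptions made for illustration, not existing Lustre interfaces.

```python
# Hypothetical sketch of shrinking an EX inodebits lock when a conflict arrives.
LOOKUP, UPDATE, LAYOUT, PERM = 0x1, 0x2, 0x8, 0x10  # illustrative bit values


class ExLock:
    def __init__(self, bits, flushers):
        self.bits = bits          # bits currently held in EX mode
        self.flushers = flushers  # bit -> callable flushing state cached under it

    def on_conflict(self, requested_bits):
        """Flush state cached under the contested bits, then give those bits up."""
        contested = self.bits & requested_bits
        for bit, flush in self.flushers.items():
            if bit & contested:
                flush()
        self.bits &= ~contested
        return contested


if __name__ == "__main__":
    flushed = []
    lock = ExLock(
        LOOKUP | UPDATE | LAYOUT | PERM,
        {UPDATE: lambda: flushed.append("attrs"),
         LAYOUT: lambda: flushed.append("data")},
    )
    lock.on_conflict(UPDATE)  # another client wants the UPDATE bit
    print(flushed, hex(lock.bits))
```

Only the state tied to the contested bits is written back; everything cached under the remaining bits stays on the client.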

Comment by Mikhail Pershin [ 03/Aug/18 ]

Oleg, why should we be incompatible with a CR lock? I tend to think it would even be bad to have that lock dropped by a CR request, so I'd say PW is the right thing. As I understand it, that lock protects just the data range [0; EOF) exclusively and will not be taken in CR mode, just PR or PW. Correct me if I am wrong. Also, I wonder what should cause the cancellation of such a lock? I suppose IO from another client? Then how is that handled?

As for LAYOUT lock vs DOM lock: the LAYOUT lock cannot be used to protect data, for several reasons. It does not stay long in PR/PW mode and data is just not cached on the client. It also conflicts with the LVB handling on the server: both layout and IO (DOM) use it, but for different reasons and in different ways, so we have to separate them. Finally, it becomes too complicated to properly distinguish when we are using LAYOUT for layout work and when for IO-related things, especially with all the FLR work. So the DOM bit works just like an extent lock and is separate from the LAYOUT bit. In fact there is no good reason to combine them into a single bit; that would complicate things rather than simplify them.

Comment by Mikhail Pershin [ 03/Aug/18 ]

Andreas, Alex, about reusing the DOM bit for that purpose with OST files: note that the DOM bit doesn't protect the whole file, just the DOM stripe. We could treat it as a 'full range lock' for non-DOM files, but what about DOM files? How are we going to distinguish whether it is protecting the DOM stripe or the whole file (e.g. if it has both DOM and OST stripes)?

Comment by Oleg Drokin [ 03/Aug/18 ]

Mike, we have CR locks issued to clients for lookup and related bits. Generally you want your lookup bit together with other bits to ensure the file can still be found without talking to the server.

About the layout vs DoM bit: yes, the staying power of the layout lock is certainly low. I guess if you envision massive contended r/w workloads on the same file from multiple clients, allowing the writing client to hold on to the DoM bit for longer would make a difference. Outside of this scenario I am not sure it would matter much.

Comment by Oleg Drokin [ 03/Aug/18 ]

Hm, actually yes. I guess when we want to have other stripes, it makes sense to keep the DoM bit separate from layout.

Comment by Mikhail Pershin [ 03/Aug/18 ]

Oleg, considering the combined-bits case, I still don't get why a CR lock should conflict with the exclusive lock Alex is describing. If it is combined with LOOKUP, then it had better be PW too, so that other CR lookups would not drop the combined lock for nothing, right?

Comment by Oleg Drokin [ 03/Aug/18 ]

The risk is this:

client 1: get PW LAYOUT lock
client 2: get CR LAYOUT|LOOKUP|UPDATE lock (as part of regular lookup or open)

No conflict, but client 2 now thinks it has valid layout information while client 1 thinks it has exclusive access.
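
The scenario above can be checked against a lock-mode compatibility table. The sketch below is a simplified model, not Lustre code. It assumes the classic DLM compatibility matrix for the NL/CR/CW/PR/PW/EX modes (which, as far as I know, LDLM follows), treats two inodebits locks as conflicting only when their bits overlap and their modes are incompatible, and uses illustrative bit values.

```python
# Simplified model of inodebits lock conflicts (illustrative, not Lustre code).
# Assumption: the classic DLM compatibility matrix for NL/CR/CW/PR/PW/EX.
COMPAT = {
    "EX": {"NL"},
    "PW": {"NL", "CR"},
    "PR": {"NL", "CR", "PR"},
    "CW": {"NL", "CR", "CW"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
}

# Illustrative bit values; only their distinctness matters here.
LOOKUP, UPDATE, LAYOUT = 0x1, 0x2, 0x8


def conflicts(held_mode, held_bits, req_mode, req_bits):
    """Return True if a requested lock conflicts with a held one."""
    if held_bits & req_bits == 0:
        return False  # locks on disjoint bits never conflict
    return req_mode not in COMPAT[held_mode]


if __name__ == "__main__":
    client2 = ("CR", LAYOUT | LOOKUP | UPDATE)
    print("client 1 holds PW LAYOUT:", conflicts("PW", LAYOUT, *client2))
    print("client 1 holds EX LAYOUT:", conflicts("EX", LAYOUT, *client2))
```

Under these assumptions the CR request passes a PW holder unnoticed and conflicts only with an EX holder, which is the argument for taking the lock in EX mode.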

Comment by Mikhail Pershin [ 03/Aug/18 ]

Indeed, that is the case for the layout lock and valid layout information, but I was thinking that Alex meant the DOM lock.
We discussed using the DOM bit vs the LAYOUT bit today, and it seems that the LAYOUT lock will work better, also because of glimpse handling. Before a glimpse is done, the layout lock will be taken and the exclusive lock will be revoked automatically. With the DOM lock it is not so simple, and glimpse itself will not work out of the box.

Comment by Oleg Drokin [ 03/Aug/18 ]

Yeah, the whole glimpse thing: I had some questions about it too. But it again boils down to how contended we expect workloads to be in the DoM case. If we mostly envision "write once and then only read" workloads, glimpses don't matter.

This is not to say I disagree. I am fine with taking the layout lock in EX mode if needed, as it does simplify some other things too. E.g. on create, when the data starts flowing in and there's too much of it, we can seamlessly decide to do normal striping too.
