[LU-15155] Add lock request to readahead Created: 23/Oct/21 Updated: 11/Aug/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Patrick Farrell | Assignee: | Patrick Farrell |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | readahead | ||
| Issue Links: |
|
| Description |
|
Currently, readahead will not request an LDLM lock if it does not already have one covering the pages it wants to read ahead. Not requesting locks for readahead is an artifact of the original readahead implementation. Allowing readahead to request locks will help cut down misses when starting to read a new file or a new stripe. The benefit can be seen in the test changes - miss counts are significantly reduced. |
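To make the proposal concrete, here is a minimal sketch (not the actual patch) of the change to the readahead path: instead of stopping at the boundary of the locks it already holds, readahead would enqueue a read lock for the uncovered region and continue. All helper names below are hypothetical, not real Lustre llite/LDLM symbols; a non-blocking enqueue, as discussed in the comments below, would be one way to avoid creating contention.

```c
/*
 * Illustrative sketch only - ras_region_covered(), ras_request_read_lock()
 * and ras_submit_pages() are hypothetical helpers, not the actual Lustre API.
 */
static int ras_readahead_region(struct inode *inode, __u64 start, __u64 end)
{
	/* Current behavior: if no cached LDLM lock covers [start, end],
	 * readahead stops at the lock boundary and the next real read
	 * takes the miss. */
	if (ras_region_covered(inode, start, end))
		return ras_submit_pages(inode, start, end);

	/* Proposed behavior: enqueue a read lock for the uncovered
	 * region (non-blocking, so speculative readahead never waits on
	 * or cancels a conflicting writer) and keep going if granted. */
	if (ras_request_read_lock(inode, start, end, /* nonblock */ true) == 0)
		return ras_submit_pages(inode, start, end);

	/* Lock not granted without blocking: leave this region for the
	 * actual read to handle. */
	return 0;
}
```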
| Comments |
| Comment by Patrick Farrell [ 23/Oct/21 ] |
| Comment by Andreas Dilger [ 24/Oct/21 ] |
|
One of the reasons that readahead does not take DLM locks is to avoid contention in the case of shared file access. In the uncontended read case, the client should already have prefetched the full-file read locks due to a glimpse, or at worst one serial read from each stripe of the file (though that could be improved). If the client doesn't have a local lock, there is probably some reason for that?

In particular, when IOR is doing interleaved reads, this can result in client nodes prefetching far too much data, when each client is only intended to read a part of the file. That itself may be an artifact of our readahead algorithm being too aggressive in expanding the window ahead of the actual reader?

In order for this to work properly, I think we need it to be a bit smarter than just "always get a read lock when doing readahead". As a starting point, does it make sense to prefetch a limited lockahead lock if the client has detected readahead? In that case, if the client gets the lock it is because there is no contention, yet it avoids canceling write locks on other clients if they are still writing to the file. |
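For reference, a "limited lockahead lock" of the kind mentioned above is what userspace can already request through llapi_ladvise() with the lockahead advice (from the LU-6179 lockahead feature). The sketch below is only meant to show the shape of such a request; the struct field and constant names are recalled from that feature and should be checked against the current lustreapi.h.

```c
/* Sketch of a userspace lockahead request on one extent; field and
 * constant names (LU_LADVISE_LOCKAHEAD, MODE_READ_USER, lla_*) are
 * recalled from the lockahead feature and may not exactly match
 * current headers. */
#include <lustre/lustreapi.h>

static int request_read_lockahead(int fd, __u64 start, __u64 end)
{
	struct llapi_lu_ladvise advice = {
		.lla_advice         = LU_LADVISE_LOCKAHEAD,
		.lla_lockahead_mode = MODE_READ_USER,	/* PR lock */
		.lla_start          = start,
		.lla_end            = end,
	};

	/* Non-blocking on the server side: if a conflicting lock exists,
	 * the request fails instead of cancelling the other client. */
	return llapi_ladvise(fd, 0, 1, &advice);
}
```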
| Comment by Patrick Farrell [ 24/Oct/21 ] |
|
Well, so the IOR case is going to be fine - we only request locks for regions we desire to read (with readahead), and shared IOR will be strided. Readahead picks up the strided pattern and asks for locks on those regions. The idea of contention presumes a mixed read-write workload, and one where readahead isn't predicting correctly. If it's predicting correctly, then we're asking for locks slightly early but no more than that. So I think this would only apply to a random mixed read-write workload where readahead is still triggering, but wrongly. That would be made worse by this. But otherwise, if readahead is working correctly, we're only asking for locks on regions we are going to read anyway.

To be fair about this - I came up with this partly because it makes it much easier to write the tests. Otherwise you end up taking misses on the first access to each stripe, with a very complex relationship between misses, stripe count, and access pattern. (When read size is >> stripe size, it can be multiple misses from the first access. It is very complicated.) Taking a lockahead-type lock wouldn't avoid that, and I disliked the idea of doing something special for the tests so they'd have locks in place.

I first really noticed this limitation when I realized it hobbled whole-file read testing - we'd read twice from the first stripe and then be unable to pull in the rest with readahead because we lacked the locks. |
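As an illustration of the miss arithmetic described above (the numbers here are made up for the example): with a 1 MiB stripe size on an 8-stripe file and no cached locks, a single 8 MiB application read touches all 8 stripes, so without lock requests from readahead that first read can show up as up to 8 readahead misses, one per stripe, and the exact count shifts with stripe count, read size, and how the readahead window happens to line up with stripe boundaries.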
| Comment by Patrick Farrell [ 24/Oct/21 ] |
|
Sorry, I just realized I'm wrong about one thing in there, and it matters. Lockahead locks are usually an async non-blocking (non-cancelling, really) lock request. It's the async part that I don't think is so good here - it means we still have to deal with all of that complex miss behavior. But we can also just do synchronous non-blocking. If we did non-blocking, I'd also need to add something so we only requested the lock once per readahead window, so we didn't request it (and fail) for every page in a region.

But yeah - for this to be a problem, the workload needs a naturally high degree of conflict, with readahead being activated but failing to catch a pattern.
|
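A minimal sketch of the "request it once per readahead window" idea, assuming some per-window readahead state; the structure and helper names below are illustrative, not the actual Lustre ll_readahead_state fields or symbols:

```c
/* Illustrative only: names here are not the actual Lustre symbols. */
struct ra_window_state {
	__u64	rws_start;	/* start of the current readahead window */
	__u64	rws_end;	/* end of the current readahead window */
	bool	rws_lock_tried;	/* non-blocking enqueue already issued? */
};

static void ra_maybe_request_lock(struct inode *inode,
				  struct ra_window_state *rws)
{
	/* Issue at most one synchronous non-blocking enqueue per window,
	 * so a denied request is not retried for every page in the
	 * region. */
	if (rws->rws_lock_tried)
		return;
	rws->rws_lock_tried = true;

	/* Non-blocking: if another client holds a conflicting lock, the
	 * request simply fails and readahead proceeds without covering
	 * this window, rather than cancelling the other client's lock. */
	ra_enqueue_read_lock_nonblock(inode, rws->rws_start, rws->rws_end);
}
```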
| Comment by Andreas Dilger [ 24/Oct/21 ] |
|
I'm not against doing opportunistic read locking in the case where the client is properly doing readahead. If the client has a full-object lock on the current stripe, or at least the lock was expanded by one OST to cover an extent of more than one stripe size, then it seems likely that it can get full-object locks for the other stripes as well (i.e. there is no contention on the file, or at least not on that region).

I do find it a bit odd that there wouldn't be an initial glimpse at open time to get locks for all stripes? I've found that annoying in the write case, where the client glimpses all of the stripe locks in PR mode, only to have to cancel and refresh them in PW mode immediately. I have a prototype patch to glimpse in PW mode when files are opened for write that I should push for that. |