[LU-17290] Don't deregister idle changelog consumers Created: 15/Nov/23 Updated: 06/Feb/24 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Nathan Rutman | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
In the experience of (some of) our customers, we get complaints that restarting their consumers is too "high touch": they have to interact with each MDT manually to register a new consumer ID, then change some config in their Kubernetes/Kafka/whatever setup, redeploy, etc. They might be OK with missing the records (which happens whether they re-register or not); they just don't like the additional ID reconfiguration hassle.

Deregistration of an idle changelog consumer is a heavy penalty, requiring a user to re-register and restart their consumer process with a new ID. It would make more sense to mark this consumer internally as "stale" and simply ignore it during the lowest-unconsumed-record check. Then, if the consumer does come back to life, we remove the "stale" flag and the consumer still has access to the (remaining) changelog records.

If a stale consumer is still alive and connected, it can continue consuming records. (An idle consumer on an idle system would feel no impact.) If disconnected and restarted, a stale consumer would restart with the old ID in llapi_changelog_start(), which would return -ESTALE in this case. Consumers that are aware of this feature can take whatever action they need and then start a second time, which would succeed. Old, unaware consumers that don't understand ESTALE would presumably fail with the error and require manual intervention, just like the current deregistration/re-registration (which would also still work).

The important part is that this way, modern consumers can recover automatically without having to do anything special on the MDS itself. See also |
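The proposed behaviour can be sketched with a small toy model. This is only an illustration of the semantics described above, not Lustre code: `ChangelogModel`, `mark_stale`, `purge`, and `start_with_recovery` are hypothetical names, and the model assumes that the first start after going stale reports ESTALE once and clears the flag, so that the second start succeeds.

```python
import errno


class ChangelogModel:
    """Toy model of one MDT's changelog users (hypothetical, not Lustre code)."""

    def __init__(self):
        self.first_rec = 1  # oldest record still kept
        self.last_rec = 0   # newest record written
        self.users = {}     # user id -> {"next": int, "stale": bool}

    def register(self, uid):
        self.users[uid] = {"next": self.last_rec + 1, "stale": False}

    def mark_stale(self, uid):
        # Proposed alternative to deregistering an idle consumer.
        self.users[uid]["stale"] = True

    def lowest_unconsumed(self):
        # Stale users are ignored, so they no longer pin old records.
        live = [u["next"] for u in self.users.values() if not u["stale"]]
        return min(live, default=self.last_rec + 1)

    def purge(self):
        # Drop every record that no live (non-stale) user still needs.
        self.first_rec = max(self.first_rec, self.lowest_unconsumed())

    def start(self, uid):
        user = self.users[uid]
        if user["stale"]:
            # Assumption: report the gap once, clear the flag, and move the
            # user forward to the oldest record that still exists.
            user["stale"] = False
            user["next"] = max(user["next"], self.first_rec)
            raise OSError(errno.ESTALE, "changelog user was stale")
        return user["next"]


def start_with_recovery(mdt, uid):
    """What an ESTALE-aware consumer might do: note the gap, then retry."""
    try:
        return mdt.start(uid), False
    except OSError as e:
        if e.errno != errno.ESTALE:
            raise
        # Records may have been missed; a real consumer could resync here.
        return mdt.start(uid), True
```

In this model the consumer keeps its old ID across the restart; the only extra work is handling one ESTALE error and deciding what to do about the records it may have missed.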
| Comments |
| Comment by Andreas Dilger [ 21/Nov/23 ] |
|
Nathan, it would be useful to know some details about the circumstances under which the Changelog users are being deregistered. The fact that you are filing a ticket on this would indicate that this has happened more than once, and that the Changelog consumer was actually wanted rather than being some dead registration for a test or service that was turned off. How long were the users idle? How much space on the MDT? How many unprocessed records? I'm trying to determine if the Changelog GC is too aggressive or is doing the wrong thing. The user shouldn't be deregistered until the changelog consumes more than half of the remaining space on the MDT (from |
| Comment by Mikhail Pershin [ 23/Nov/23 ] |
|
The concept of idle users is the same as 'users are deregistered only explicitly'. In these terms, the basis of changelogs is changing. It was: 'the stream of records is consistent and all users are able to read all records; if there are too many records, remove idle users, so records first, users second'. The proposed concept is different: 'users first, records second; keep all users no matter how many records we have, and if there are too many records, just kill the oldest of them'.

While the means are the same (we are killing the oldest records on a per-user basis), the result for consumers is different: they can't expect a consistent stream of records anymore, and there can be gaps in the stream if a user was idle too long, or not too long but records were added aggressively. Strictly speaking, the current behaviour doesn't guarantee a continuous stream either: the user is just dropped, breaking the stream, and a new registration will start with a gap too. The difference is that now the consumer knows the moment of the gap, when the user is dropped, whereas with the new approach it would look like there is no gap and the records just continue. Nathan's proposal to return -ESTALE looks sufficient to mark that event; I'd just always return it for any new request from the client, not only llapi_changelog_start(), to let the consumer know about the gap.

The other changes look doable: GC will do mostly the same but keep idle users as described; idle users are just ignored. The only remaining question is when and by whom they will be deregistered after a while, just so that we don't have thousands of them in 'changelog_users'. So far it looks like we need manual intervention, or GC should still deregister users that are too old.

It is worth mentioning that GC currently uses 3 thresholds: how long the user has been idle, how many idle records we have, and how big their product is: idle time * idle records.
The last one is there to balance the situation where aggressive record adding can cause GC for quite recent users. On the other hand, exactly that check may cause a user to be deregistered earlier than the idle threshold. That third condition is very heuristic right now and can be quite aggressive sometimes; it seems we can get rid of it with this idle-users proposal.
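The three checks can be written out as a rough sketch. This is an illustration only: `GcLimits`, `is_gc_candidate`, and all limit values are hypothetical rather than Lustre defaults, and the sketch assumes that exceeding any one of the three limits is enough to select a user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GcLimits:
    # Hypothetical values for illustration only, not Lustre defaults.
    max_idle_secs: int = 30 * 24 * 3600
    max_idle_recs: int = 2_000_000
    max_idle_product: int = 10 ** 12  # idle seconds * idle records


def is_gc_candidate(idle_secs, idle_recs, limits=GcLimits(), use_product=True):
    """Return True if a user would be selected by this model of the GC."""
    if idle_secs > limits.max_idle_secs:
        return True
    if idle_recs > limits.max_idle_recs:
        return True
    # Heuristic third check: it can fire before either single limit is hit.
    return use_product and idle_secs * idle_recs > limits.max_idle_product
```

With these made-up limits, a user idle for 10 days with 1,500,000 idle records is below both single thresholds but is still selected by the product check; passing `use_product=False` models dropping that third condition.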
| Comment by Andreas Dilger [ 07/Dec/23 ] |
|
Nathan, do you have any plans for implementing this? I think the consensus is that the proposed change makes sense. |
| Comment by Nathan Rutman [ 06/Feb/24 ] |
|
Yes to your question, Andreas: we have this as a task in our Jira (LUS-11978), but I don't get to assign tasks... I'll kick it again. |