Details
Improvement | Resolution: Unresolved | Minor | Lustre 2.15.3
Description
For an application that runs on a Lustre client and consumes changelogs, I believe the two features below would be useful. Both pieces of information are available on the MDS via lctl get_param, but there is currently no way (that I'm aware of) to get them on a client node.
On Lustre servers, we avoid running software other than critical management and monitoring tools, for both security and reliability reasons. I believe that is common practice. That means that something like a policy engine should be able to get whatever changelog information it needs on a Lustre client (not server) node. Our local concern at the moment is Starfish, but I think the features below would be equally useful to others.
1. The ability to obtain the list of registered changelog users for a given MDT.
Currently, the application needs the user to specify which changelog user (e.g. "cl2") to use. Where only one application consumes changelogs, and that application uses only one changelog user, this shouldn't be necessary, since the application could look up the registered user itself (I suspect this is most cases, but I may be wrong).
In addition, the application currently can't confirm a changelog user is valid. One example we've seen is that it would be useful to be able to detect that the changelog user was deregistered while the application was using it.
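To illustrate what a client-side application could do with such a list, here is a minimal Python sketch that parses text in the style of the MDS-side changelog_users parameter and checks whether a given user is still registered. The sample text is an assumption about the layout (it differs between Lustre releases), and parse_changelog_users() and user_is_registered() are hypothetical helper names, not part of any Lustre API.

```python
import re
from typing import Dict, Optional, Tuple

# ASSUMPTION: sample of what `lctl get_param mdd.testfs-MDT0000.changelog_users`
# might print on the MDS. The exact layout varies between Lustre releases.
SAMPLE = """\
mdd.testfs-MDT0000.changelog_users=
current_index: 8
ID                            index (idle)
cl1                           3 (120)
cl2-test                      8 (45)
"""

# "cl<N>" with an optional "-name" suffix, followed by the user's record index.
USER_RE = re.compile(r"^(cl\d+)(?:-(\S+))?\s+(\d+)\b")
# Accept both "current index:" and "current_index:" spellings.
INDEX_RE = re.compile(r"^current[ _]index:\s*(\d+)")


def parse_changelog_users(
    text: str,
) -> Tuple[Optional[int], Dict[str, Tuple[Optional[str], int]]]:
    """Return (current_index, {"cl<N>": (name_or_None, user_index)})."""
    current: Optional[int] = None
    users: Dict[str, Tuple[Optional[str], int]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = INDEX_RE.match(line)
        if m:
            current = int(m.group(1))
            continue
        m = USER_RE.match(line)
        if m:
            users[m.group(1)] = (m.group(2), int(m.group(3)))
    return current, users


def user_is_registered(text: str, user: str) -> bool:
    """True if `user` ("cl2" or "cl2-test") appears in the listing."""
    _, users = parse_changelog_users(text)
    return user.split("-", 1)[0] in users
```

With a listing like this available on the client, the application could pick the only registered user automatically, or notice that its user has disappeared from the list and report the deregistration instead of failing later.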
2. The ability to learn the next available changelog record ID and the last changelog record ID for a given MDT.
The application could adapt to the number of outstanding changelog records by creating more consumer threads when there are many changelog records available, and reducing the thread count when there are few. The application could also better handle errors, such as when changelogs are cleared by a sysadmin to free space while the application is running.
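As a rough sketch of the scaling idea, assuming the client could somehow learn both IDs: the Python below sizes a consumer pool from the backlog. The function name, the records_per_thread figure and the thread limits are all hypothetical choices for illustration, and I'm assuming next_id is the next record the application has yet to consume and last_id is the newest record on the MDT.

```python
def consumer_threads(
    next_id: int,
    last_id: int,
    records_per_thread: int = 10_000,  # hypothetical tuning value
    min_threads: int = 1,
    max_threads: int = 16,
) -> int:
    """Pick a consumer thread count from the number of outstanding records."""
    # No backlog if the newest record is older than the next one we want.
    backlog = max(0, last_id - next_id + 1)
    # Ceiling division: one thread per `records_per_thread` outstanding records.
    wanted = -(-backlog // records_per_thread)
    # Clamp to the configured bounds.
    return max(min_threads, min(max_threads, wanted))
```

For example, consumer_threads(1, 25_000) asks for 3 threads, while an empty backlog falls back to min_threads.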
It's possible to obtain the next changelog record if there is one which hasn't yet been cleared, with "lfs changelog | head -n1" or similar. But if there is no changelog record yet to be cleared, that technique doesn't work.
Andreas wrote:
> Mike, what I did see is that "lctl --device changelog_deregister -u test" works to deregister this user, but only the username doesn't work on the client:
>
> # lfs changelog_clear testfs-MDT0000 test 1
> lfs changelog_clear: cannot purge records for 'test': Invalid argument (22)
> # lfs changelog_clear testfs-MDT0000 cl2-test 1
> #
> This should probably be usable with just the username?
The reason that failed is the MDC only knows about numeric changelog user ids. llapi_changelog_clear() takes an argument of the form "cl#-name" or "cl#" and in the former case throws away "-name", passing only "cl#" to the kernel. mdc_changelog.c:chlg_write() then parses the input with this code:
Note that it's expecting the changelog user to be specified as "cl#".
So it's not a bug, per se, more that changelog user names aren't fully implemented. I'll look into what it would take to allow changelog user names to be used anywhere cl# forms are already accepted.
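To make the behaviour described above concrete, here is a small Python model of the name handling. It is only an illustration of the "cl#-name" to "cl#" reduction as I described it, not the actual llapi_changelog_clear() or chlg_write() code, and kernel_user_id() is a hypothetical name.

```python
import re

# "cl<N>" optionally followed by "-<name>"; a bare name does not match.
_USER_RE = re.compile(r"^(cl\d+)(?:-.+)?$")


def kernel_user_id(user: str) -> str:
    """Reduce "cl#-name" or "cl#" to the "cl#" form passed to the kernel.

    Raises ValueError for anything else, which models the
    "Invalid argument (22)" seen when only the username is given.
    """
    m = _USER_RE.match(user)
    if m is None:
        raise ValueError(f"invalid changelog user: {user!r}")
    return m.group(1)
```

Under this model, "cl2-test" and "cl2" both reduce to "cl2", while "test" alone is rejected, matching the two lfs changelog_clear invocations quoted above.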