Details
-
Improvement
-
Resolution: Fixed
-
Minor
-
None
-
Upstream
-
None
-
Description
We recently had the case of an MDT whose changelog records were not being processed and cleared as they should have been. We quickly reached a point where the whole catalog was full, and we had little choice but to deregister the changelog reader to resume production.
We used lctl --device lustre-MDT0000 changelog_deregister cl1 for that, and it took 3 days to complete. Considering we only had a single changelog reader registered, and our goal was to simply garbage collect every changelog record, it feels wasteful that we should wait 3 days for something that essentially deletes a few files on the MDT.
Would it be possible to speed up this process?
Ideally, lctl changelog_deregister would special-case the situation where only one reader is registered, but a new command (e.g. lctl changelog_delete_everything, lctl changelog_reset, ...) would also be satisfactory.
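To illustrate the single-reader condition that such a special case would depend on, here is a minimal user-space sketch. It assumes the text layout of the mdd.<MDT>.changelog_users parameter described in the Lustre manual (a "current index:" line followed by one line per registered reader). The function names and the sample values are hypothetical and only serve the illustration; this is not how the MDT itself would implement the check.

```python
import re

# Illustrative sample only: the layout is assumed from the documented
# changelog_users format, the numbers are made up.
SAMPLE = """current index: 8
ID    index (idle seconds)
cl1   5 (22)
"""


def parse_changelog_users(text):
    """Return (current_index, {reader_id: last_cleared_index})."""
    current_index = None
    readers = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"current index:\s*(\d+)", line)
        if m:
            current_index = int(m.group(1))
            continue
        # Reader lines start with an ID such as "cl1"; the header line
        # ("ID index ...") does not match and is skipped.
        m = re.match(r"(cl\d+\S*)\s+(\d+)", line)
        if m:
            readers[m.group(1)] = int(m.group(2))
    return current_index, readers


def is_single_reader(text):
    """True when exactly one changelog reader is registered."""
    _, readers = parse_changelog_users(text)
    return len(readers) == 1
```

The input text could be obtained with something like lctl get_param -n mdd.lustre-MDT0000.changelog_users on the MDS. When is_single_reader() returns True, every record up to the current index belongs to that one reader, which is the situation where dropping the whole catalog at once would be safe.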