First, our current recovery mechanism automatically blocks any new RPC, regardless of whether it is for LFSCK (via an OUT RPC) or for other normal operations.
The second point is when the local LFSCK should be automatically resumed after the server restarts/remounts: before recovery has finished, or only afterwards. There are two cases:
1) OSS restart/remount. The local LFSCK on the OSS is mainly for rebuilding LAST_ID files.
1.1) If, before the OSS restart/remount, we already knew that some LAST_ID files had crashed and needed to be rebuilt, should we allow the recovery to re-create objects based on the wrong LAST_ID? If we do, it may cause more damage, right?
1.2) If no crashed LAST_ID files were found during the LFSCK scan before the OSS restart/remount, then after the restart/remount it is fine for the LFSCK and the recovery to run in parallel, as long as the LFSCK is not misled by the objects re-created during the recovery.
2) MDS restart/remount. As explained earlier, the only case in which the LFSCK creates an MDT-object is when it handles an orphan OST-object. But such an MDT-object will not be in the recovery queue.
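To make the cases above concrete, here is a minimal sketch of the decision as I read it. It is not Lustre code: the enum names and lfsck_resume_policy() are hypothetical, invented only for illustration, and the answer for case 1.1 reflects the concern raised above rather than settled behavior.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in Lustre. */
enum server_type {
	SERVER_OSS,
	SERVER_MDS,
};

enum resume_policy {
	/* Rebuild LAST_ID first, so recovery does not re-create
	 * objects based on a wrong LAST_ID (case 1.1). */
	RESUME_BEFORE_RECOVERY,
	/* LFSCK and recovery may run in parallel (cases 1.2 and 2). */
	RESUME_WITH_RECOVERY,
};

/*
 * Decide when the local LFSCK is auto-resumed after a restart/remount.
 * last_id_crashed: a crashed LAST_ID file was already found by the
 * LFSCK scan before the restart/remount.
 */
static enum resume_policy
lfsck_resume_policy(enum server_type type, bool last_id_crashed)
{
	/* Case 1.1: OSS with a known crashed LAST_ID. */
	if (type == SERVER_OSS && last_id_crashed)
		return RESUME_BEFORE_RECOVERY;

	/* Case 1.2: OSS with no known crashed LAST_ID.
	 * Case 2: MDS, where the MDT-objects created by LFSCK
	 * are not in the recovery queue. */
	return RESUME_WITH_RECOVERY;
}
```

For example, lfsck_resume_policy(SERVER_OSS, true) selects RESUME_BEFORE_RECOVERY, while every other combination allows the LFSCK to run alongside the recovery.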
I agree with you that the above are only the situations we currently know of, and there may be more in the future. But we can try to resolve new cases when they appear. In the worst case, if the new cases turn out to be very difficult to handle, we can consider pausing the LFSCK at that time. Until such cases arise, I do not think there is a strong reason to do that now.
The patch has landed on master.