It isn't clear that having a per-target version number would help. The target being added (OST) can provide a version number, but the client shouldn't have to pass its "connect/config version" for every OST it knows about to the MDS for every file it is creating.
Sending a single "config record version" (really just the last MGS config record number that the client processed) from the client to the MDS with each create would be more useful, since this would (indirectly) tell the MDS which OSTs the client is connected to, and the MDS could skip any that were added after that version. For example, LOD could store a "minimum config record number" on each target, which is the config llog record in which the OST was added, and the client would include its "current config record number" in each request. A check like:
    if (req->current_config_rec < lod->target[ost_idx]->tgt_min_config_rec)
            continue;
during create would be enough to skip the OST for that create.
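To make the check above concrete, here is a minimal standalone sketch of the filtering loop. It is not Lustre code: struct sketch_tgt, sketch_usable_osts() and the field layout are hypothetical stand-ins for the LOD target table, and only tgt_min_config_rec follows the naming used in the check above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a LOD target entry, not the real Lustre struct. */
struct sketch_tgt {
	uint32_t tgt_index;          /* OST index */
	uint64_t tgt_min_config_rec; /* config llog record in which the OST was added */
};

/*
 * Collect the indices of the OSTs that the client should already know about.
 * client_config_rec is the last config record number the client reports
 * having processed. Returns the number of indices written to out[].
 */
static size_t sketch_usable_osts(const struct sketch_tgt *tgts, size_t count,
				 uint64_t client_config_rec,
				 uint32_t *out, size_t out_max)
{
	size_t n = 0;

	for (size_t i = 0; i < count && n < out_max; i++) {
		/* OST was added after the client's last processed record: skip it */
		if (client_config_rec < tgts[i].tgt_min_config_rec)
			continue;
		out[n++] = tgts[i].tgt_index;
	}
	return n;
}
```

In this sketch a client that has processed up to record 12 would be offered OSTs added in records 10 and 12, but not one added in record 40.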
HOWEVER this has some significant drawbacks:
- it needs a protocol change so that clients will always send this field with each create request and the MDS will check it, to fix a problem that happens very rarely for most users
- the config llog record numbers may not be easily accessible in the right parts of the code (haven't looked at that yet)
- it would only fix the problem of one client creating and using a file itself, but would not fix the problem of a different client trying to access that file before it had processed the config llog updates (which may be delayed tens of seconds if there are many clients)
So, my suggestion for a simpler solution is just to use a timeout (maybe variable), based on how quickly config llog records are processed by clients, before an OST (or MDT, for DNE) can be used for new file allocations. There is already a small delay before the OST can be used, because the MDS needs to precreate objects there, but that is only a fraction of a second in most cases. Instead of storing the "config version" in the LOD target, store the "config time" for the target, and skip it for new allocations for e.g. 10s after it connects. This should also handle the case where the MDS itself was just mounted and all OSTs are pre-existing, so that those OSTs are not delayed needlessly (e.g. only delay usage if the lov_objids entry was just added).
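A rough sketch of the time-based variant, under stated assumptions: struct sketch_tgt_time, its fields, sketch_tgt_usable() and SKETCH_NEW_TGT_DELAY_SEC are hypothetical names invented for illustration, and the 10s value is just the example figure from above, not a tested default.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example delay only (the "e.g. 10s" above); assumed to be tunable. */
#define SKETCH_NEW_TGT_DELAY_SEC 10

/* Hypothetical per-target state, not the real LOD target descriptor. */
struct sketch_tgt_time {
	int64_t tgt_config_time; /* time in seconds when the target connected */
	bool    tgt_newly_added; /* true only if its lov_objids entry was just added */
};

/*
 * Return true if the target may be used for new allocations at time 'now'.
 * Pre-existing targets (e.g. after an MDS remount) are never delayed.
 */
static bool sketch_tgt_usable(const struct sketch_tgt_time *tgt,
			      int64_t now, int64_t delay)
{
	if (!tgt->tgt_newly_added)
		return true;
	return now - tgt->tgt_config_time >= delay;
}
```

The allocator would call something like this per target and skip any for which it returns false, in the same place as the config record check in the first proposal.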
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36872/
Subject: LU-12025 osp: allow OS_STATE_* flags from OSTs
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: b0194200146a54ee45df208da88dcc6b916fb51f