Details
Improvement
Resolution: Unresolved
Major
Description
The layout manipulations required to bring an FLR file into sync (READONLY in FLR parlance) also provide SOM for as long as the file stays in sync. This is true SOM, with no caveats, usable for any purpose (as distinct from lazy SOM, which can only be used by tools that are aware of it).
In essence, there is no reason the SOM portion has to be associated with a replica. Exactly the same functionality can be used just for SOM.
Because each FLR layout state transition requires a synchronous write to the MDS, and because a write to the file destroys the SOM state, this is too expensive to use all the time. Instead, the proposal is to set it on files a certain amount of time after they were last modified (e.g. 24h). If there have been no writes to a file for a time, and the client is returning identical size+blocks in the LSOM state at close time, we take the file through the layout transitions to mark it LCM_FL_RDONLY (this does not make the file unwritable, it only indicates that the attributes are not being modified), and then it has SOM.
This would be a fairly low-effort way to give all files except those being actively modified true SOM, and to improve performance for normal stat() and similar calls.
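The eligibility check described above (idle period plus identical size+blocks at close) can be sketched roughly as follows. This is only an illustration of the decision logic, not Lustre code: `FileState`, `CloseReport`, `should_mark_rdonly`, `on_write`, and the `min_reports` parameter are all hypothetical names, and the 24h threshold is just the example value from the description.

```python
from dataclasses import dataclass, field

IDLE_THRESHOLD_S = 24 * 3600  # example idle period from the description (24h)


@dataclass
class CloseReport:
    """size+blocks a client returned in LSOM at close time (hypothetical record)."""
    size: int
    blocks: int


@dataclass
class FileState:
    """Per-file bookkeeping assumed for this sketch only."""
    last_write_time: float                              # seconds, last known modification
    last_reports: list = field(default_factory=list)    # CloseReport values, oldest first
    rdonly: bool = False                                # stands in for LCM_FL_RDONLY


def should_mark_rdonly(f: FileState, now: float, min_reports: int = 2) -> bool:
    """Return True if the file looks stable enough to take through the
    layout transitions. min_reports is an assumption: the ticket only says
    the client must be returning identical size+blocks."""
    if f.rdonly:
        return False                      # already has SOM, nothing to do
    if now - f.last_write_time < IDLE_THRESHOLD_S:
        return False                      # modified too recently
    recent = f.last_reports[-min_reports:]
    if len(recent) < min_reports:
        return False                      # not enough close-time evidence yet
    return all(r == recent[0] for r in recent)


def on_write(f: FileState, now: float) -> None:
    """A write destroys the SOM state; in the real system clearing the flag
    is the expensive synchronous MDS update."""
    f.rdonly = False
    f.last_write_time = now
    f.last_reports.clear()
```

The point of keeping the check this conservative is that the transition cost is paid only for files that are very unlikely to be written again soon.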
I think you are referring to the transition from SOM READONLY state when the file is being written.
My comment was about the (IMHO more important) transition of an existing file to SOM READONLY state. This would potentially happen in batches when pre-existing files are accessed after this feature is deployed. We can't have read-only file access suddenly triggering a mass of sync MDT writes.
The percentage of files written after being idle for over 24h is vanishingly small, so I think it is a statistics game: a rare chance of more overhead (a sync MDT write to clear the READONLY flag) vs. the common case of stat() avoiding 1 or 20 extra OST RPCs to fetch size, blocks, and timestamps.
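To make the "statistics game" concrete, here is a back-of-the-envelope model. It is a sketch only: the function names and every parameter are hypothetical, costs are expressed in arbitrary "RPC-equivalent" units, and no measured numbers are implied. The only inputs taken from this ticket are that a stat() without SOM costs 1 or 20 extra OST RPCs and that clearing READONLY costs a sync MDT write.

```python
def expected_rpc_saving(stats_per_file: float,
                        ost_rpcs_per_stat: float,
                        p_write_after_idle: float,
                        sync_write_cost: float,
                        mark_cost: float) -> float:
    """Expected net saving per file from marking it READONLY.

    stats_per_file     -- stat()-like calls served from SOM while the flag is set
    ost_rpcs_per_stat  -- OST RPCs each such call would otherwise need (1 or 20 above)
    p_write_after_idle -- probability the file is written again after going idle
    sync_write_cost    -- cost of the sync MDT write that clears READONLY
    mark_cost          -- one-off cost of the transition into READONLY
    All costs are in the same hypothetical RPC-equivalent unit.
    """
    saved = stats_per_file * ost_rpcs_per_stat
    paid = mark_cost + p_write_after_idle * sync_write_cost
    return saved - paid


def break_even_write_probability(stats_per_file: float,
                                 ost_rpcs_per_stat: float,
                                 sync_write_cost: float,
                                 mark_cost: float) -> float:
    """Write probability at which marking the file stops paying off."""
    return (stats_per_file * ost_rpcs_per_stat - mark_cost) / sync_write_cost
```

With any plausible inputs the saving scales with how often the file is stat()ed and how widely it is striped, while the penalty scales with the (small) probability of a late write, which is why the one-off transition cost for pre-existing files is the part that needs care.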