As I mentioned previously, there is no need to disable the xattr cache for trusted.som completely. Virtually none of the files in the filesystem will have been recently modified, so disabling the cache just means more overhead for the common case where the trusted.som xattr is valid. Also, the current implementation of LSOM is mainly intended to be used from the MDS by scanning the MDT disk filesystem directly, not to be used on the client, so adding overhead on the client to manage what is supposed to be "lazy" doesn't make sense.
Even if the trusted.som xattr is fetched repeatedly, there is still a chance that it is out of date (e.g. file has not been closed, llsom_sync is not running, or MDS is an older version without LSOM), so we can't ever rely on it to be completely accurate, and shouldn't try too hard to make it so. The file size will normally be accurate shortly after close; it is only the blocks count that may be inaccurate. If the application depends on the blocks count being relatively accurate, then the userspace application should check something like:
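	struct lustre_som_attrs lsom;
	struct stat st;
	ssize_t rc;

	rc = lgetxattr(file, "trusted.som", &lsom, sizeof(lsom));
	if (rc == (ssize_t)sizeof(lsom)) {
		size = lsom.lsa_size;
		blocks = lsom.lsa_blocks;
	}
	/* fall back to fstat() if LSOM is missing, or looks stale
	 * (large file, zero blocks, and not marked strict) */
	if (rc < (ssize_t)sizeof(lsom) ||
	    (size > 1048576 && blocks == 0 && lsom.lsa_valid != SOM_FL_STRICT)) {
		fd = open(file, O_RDONLY);
		if (fd >= 0) {
			if (fstat(fd, &st) == 0) {
				size = st.st_size;
				blocks = st.st_blocks;
			}
			close(fd);
		}
	}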
Two lighter-weight solutions exist that will still allow accurate trusted.som xattrs to be fetched to the client:
- cancel the mdc DLM locks via "lctl set_param ldlm.namespaces.*mdc*.lru_size=clear" to discard all locks. This is enough for testing that LSOM works, which is (IMHO) mostly what the "lfs getsom" interface is for. I don't think that applications will be using "lfs getsom" or the trusted.som xattr directly, and it won't be needed for the lfs find or statx() interfaces in the future. Note that trusted.som is only accessible to the root user, which makes it not very useful for regular users, and is a much better reason to implement statx() for regular users instead of trying to guess.
- implement a heuristic on the client to re-fetch the trusted.som xattr if the mtime/ctime of the inode is within the past e.g. 15 minutes (longer than what llsom_sync needs to update trusted.som), but use it from cache if the inode is older than this. The MDS will already keep the inode up to date for the mtime/ctime (which is needed for POSIX), and if llsom_sync hasn't updated trusted.som in this time, then it is likely that it is not running and LSOM will not be more up to date than what the client has. This will allow 99.999999% of (not-just-created) files to have the trusted.som xattr fetched on lookup and cached on the client, rather than making each access generate another RPC (which could just as well get the accurate file size directly from the OSTs).
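To make the second option concrete, the age check could look roughly like the sketch below. This is only an illustration of the heuristic, not code from the patch: som_xattr_needs_refetch() and SOM_RECENT_WINDOW_SEC are made-up names, the 15 minute window is just the example value from above, and treating a timestamp in the future (clock skew) as "recent" is my own assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical default window: 15 minutes, the example value from the text. */
#define SOM_RECENT_WINDOW_SEC (15 * 60)

/*
 * Hypothetical helper (not an existing Lustre function): return true if the
 * cached trusted.som xattr should be re-fetched from the MDS because the
 * inode's mtime or ctime falls within the last window_sec seconds.
 * Older inodes can keep using the cached xattr.
 */
static inline bool som_xattr_needs_refetch(time_t mtime, time_t ctime,
					   time_t now, time_t window_sec)
{
	/* The most recent change to the inode decides its age. */
	time_t newest = mtime > ctime ? mtime : ctime;

	assert(window_sec >= 0);

	/* Assumption: a timestamp in the future (clock skew) counts as recent. */
	if (newest > now)
		return true;

	return now - newest < window_sec;
}
```

The caller would pass the mtime/ctime that the client already holds for the inode, so the check itself costs no extra RPC; only files inside the window pay for a fresh xattr fetch.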
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49952/
Subject: LU-11695 som: disabling xattr cache for LSOM on client
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: dcd842e870a099e18a9afcb817f382648c96bed0