Details
-
Improvement
-
Resolution: Unresolved
-
Major
-
None
-
Lustre 2.14.0, Lustre 2.16.0
-
3
Description
It would be useful to allow individual applications to run with "localflock" or "flock" independently of the client filesystem mount option. This avoids several issues:
- in clusters running different applications, there may not be a single flock mode that satisfies everyone. This can be worked around by having two mountpoints (one with flock and one with localflock), but that is inconvenient, doubles the number of "clients" that each server has to manage, and adds complexity for admins and users.
- it isn't currently possible to change the flock mode of a mountpoint without unmounting and remounting it (LU-8069). Changing the flock mode on the fly might be possible, but could be highly complex if there are already flocks in use by some application.
One option would be for the flock mode to be persistent on a file/directory as part of the file layout, and set with something like "lfs setstripe component_set --comp-flags" or similar. The flock mode should probably apply to the whole file, not just a single component, as that could otherwise get messy (or would that be a useful feature? I don't know). The persistent flock flags would need to be inherited by new files/subdirectories in a tree, since the actual files being locked may not exist in advance. Persistent flock flags have the benefit of being a "fire and forget" mechanism: the user or admin marks a directory tree as needing a specific locking mode once, and no changes are needed for ongoing application launches. However, it means that all applications accessing those files must use the same flock mode, even on other clients (for better or worse), and the flag needs to be set on every output directory used by that application in each filesystem.
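The inheritance behaviour described for the persistent option can be sketched as a small in-memory model. This is only an illustration and not Lustre code: the `Node` class, the `flock_mode` attribute, and the `effective_mode()` helper are hypothetical names, and the fallback to the mount option when no flag is set is an assumption.

```python
# Toy model of a persistent, inherited flock flag (hypothetical, not Lustre code).

MOUNT_DEFAULT = "flock"  # stand-in for the client mount option


class Node:
    """A file or directory carrying an optional persistent flock flag."""

    def __init__(self, name, is_dir=False, flock_mode=None):
        self.name = name
        self.is_dir = is_dir
        self.flock_mode = flock_mode  # None means "no flag set"
        self.children = {}

    def create(self, name, is_dir=False):
        # A new entry copies the parent's flag at creation time, so files
        # that do not exist yet still end up with the intended mode.
        child = Node(name, is_dir, flock_mode=self.flock_mode)
        self.children[name] = child
        return child


def effective_mode(node, mount_default=MOUNT_DEFAULT):
    # Assumption: an unset flag falls back to the mount option.
    return node.flock_mode or mount_default


root = Node("/", is_dir=True)
outdir = root.create("out", is_dir=True)
outdir.flock_mode = "localflock"      # set once by the user or admin
data = outdir.create("result.dat")    # created later by the application
other = root.create("unrelated.txt")  # outside the marked tree
print(effective_mode(data), effective_mode(other))
```

In this model every client would see the same mode for `result.dat`, which matches the "for better or worse" caveat above.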
Another option would be for the flock mode to be local to the client node and set via a call like "lfs ladvise". This would need to be run in the job preamble script on every node where a job is launched. Similar to (or as part of) the jobid_this_session mechanism that stores the jobid_var for the process session in a hash table, it would be possible to run a command like "lfs ladvise flock [file|dir ...]" or "lfs ladvise localflock [file|dir ...]" and have the flock mode apply to the file or directory (and to new files created or opened therein), or to the whole process session if no files/directories are specified. This mode would be local to the current login session and the files opened by it, and has the benefit that different applications opening the same files could use different flock modes. However, it would need to be run every time the application is run (or as part of the application itself via llapi_ladvise()), which is potentially error prone.
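The session-local lookup described above could behave roughly like the following sketch. It is a model under stated assumptions, not an implementation: `SessionFlockTable`, `advise()`, and `mode_for()` are hypothetical names, and the precedence order (exact path, then nearest advised ancestor directory, then session-wide mode, then mount option) is an assumption that the ticket does not spell out.

```python
import posixpath


class SessionFlockTable:
    """Toy per-session table of flock modes (hypothetical, not Lustre code)."""

    VALID = ("flock", "localflock")

    def __init__(self, mount_default):
        self.mount_default = mount_default  # mode from the mount option
        self.session_mode = None            # set when no paths are given
        self.path_modes = {}                # advised file/dir -> mode

    def advise(self, mode, *paths):
        if mode not in self.VALID:
            raise ValueError("unknown flock mode: %r" % (mode,))
        if not paths:
            # No files/directories given: applies to the whole session.
            self.session_mode = mode
        for path in paths:
            self.path_modes[posixpath.normpath(path)] = mode

    def mode_for(self, path):
        # Walk from the path up through its ancestors, so files created
        # later under an advised directory pick up that directory's mode.
        current = posixpath.normpath(path)
        while True:
            if current in self.path_modes:
                return self.path_modes[current]
            parent = posixpath.dirname(current)
            if parent == current:
                break
            current = parent
        return self.session_mode or self.mount_default


table = SessionFlockTable(mount_default="flock")
table.advise("localflock", "/scratch/job42")
print(table.mode_for("/scratch/job42/new/output.dat"))
print(table.mode_for("/home/user/notes.txt"))
```

Because the table lives in the session, a second session on the same node could hold a different mode for the same paths, which is the flexibility (and the risk) this option trades for.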