Details
- Bug
- Resolution: Unresolved
- Minor
Description
The sync_permission tunable relates to Version Based Recovery (VBR), a mechanism that allows some clients to recover from an MDS failure even in the (not uncommon) case that one or more other active clients fail at the same time as the MDS and do not reconnect to the MDS during its recovery window. Consider version-based recovery in the following situation:
# umask is 022, so dir1 is created with rwxr-xr-x permission
user1@client1$ mkdir /mnt/lustre/dir1
user2@client2$ mkdir /mnt/lustre/otherdir
user1@client1$ chmod go-rwx /mnt/lustre/dir1
user1@client3$ touch /mnt/lustre/dir1/secretfile3
user1@client4$ touch /mnt/lustre/dir1/secretfile4
:
:
Suppose client2 fails at the same time as the MDS after the touch commands have run, with the dir1 creation committed to the MDT disk but the chmod only in cache, and client2 does not participate in MDS recovery. Non-VBR recovery would then prevent client3 and client4 from recreating secretfile3 and secretfile4, because there would be a gap in the MDT transaction sequence, even though there is no dependency between these files and otherdir. Similarly, if client1 failed concurrently with the MDS before the chmod was committed, then secretfile3 and secretfile4 could not be recovered, even if the dir1 creation was committed on the MDT before it crashed.
With VBR, the replay of secretfile3 and secretfile4 depends on the version of dir1 (the transaction number in which dir1 was created or last modified), and not on each other. If the MDS and client2 fail concurrently, none of the operations that client2 performed on the MDT affect dir1, so the other files can be recreated during recovery from the remaining clients, and only files created by the failed client2 node would be lost.
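The difference between the two replay rules described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions, not Lustre's recovery code: the Op record, the replay_strict and replay_vbr functions, and the transaction numbers are all hypothetical, and creating a file is assumed not to change the version of its parent directory, as in the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    transno: int         # transaction number originally assigned by the MDT
    client: str          # client that issued the request
    obj: str             # object whose version this operation depends on
    pre_version: int     # version of obj seen when the operation first ran
    bumps_version: bool  # True if the operation modifies obj itself (e.g. chmod)


def replay_strict(ops, last_committed):
    """Non-VBR model: replay in transno order and stop at the first gap."""
    done, expected = [], last_committed + 1
    for op in sorted(ops, key=lambda o: o.transno):
        if op.transno != expected:
            break  # a transaction is missing, so nothing later is replayed
        done.append(op.transno)
        expected += 1
    return done


def replay_vbr(ops, committed_versions):
    """VBR model: replay an op if the object it depends on has the expected version."""
    versions = dict(committed_versions)
    done = []
    for op in sorted(ops, key=lambda o: o.transno):
        if versions.get(op.obj) != op.pre_version:
            continue  # dependency not satisfied; only this op is skipped
        done.append(op.transno)
        if op.bumps_version:
            versions[op.obj] = op.transno
    return done


# Assumed history: transno 1 = mkdir dir1 (committed), transno 2 = mkdir otherdir
# by client2 (lost, because client2 does not reconnect), transno 3 = chmod dir1,
# transno 4 and 5 = the two touch commands.
surviving = [
    Op(3, "client1", "dir1", pre_version=1, bumps_version=True),
    Op(4, "client3", "dir1", pre_version=3, bumps_version=False),
    Op(5, "client4", "dir1", pre_version=3, bumps_version=False),
]

print(replay_strict(surviving, last_committed=1))  # -> []
print(replay_vbr(surviving, {"dir1": 1}))          # -> [3, 4, 5]
```

In this model the missing transaction 2 blocks every later request under strict ordering, while the version check lets the chmod and both file creations replay because none of them depends on otherdir.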
The sync_permission flag is concerned with avoiding the case where client1 fails after creating dir1 and running chmod, but the MDS only committed the mkdir and not the chmod before it failed. After the MDS has recovered, that could allow file creation in dir1 to succeed in a directory that does not have the correct permissions. With a local filesystem, the failure of the (only) node is obvious to the user. With a distributed filesystem, it is possible that the node fails out of sight of the user.
If sync_permission is enabled (the default) then reduction of the permission or changes of ownership of a directory will be synchronous operations. If sync_permission is disabled, then MDS permission changes are asynchronous, which is equivalent to the behaviour of a local filesystem.
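The rule stated above can be written as a small predicate. This is a minimal sketch and not the MDS implementation: the function name must_sync and its parameters are hypothetical, and "reduction of the permission" is interpreted here as clearing at least one of the rwx bits that was previously set.

```python
def must_sync(is_dir, old_mode, new_mode, old_owner, new_owner,
              sync_permission=True):
    """Return True if a setattr should be committed synchronously.

    old_owner and new_owner are (uid, gid) tuples; modes are integer
    permission masks such as 0o755.
    """
    if not sync_permission or not is_dir:
        return False  # tunable disabled, or not a directory: stay asynchronous
    owner_changed = old_owner != new_owner
    # A permission bit that was set before and is cleared now is a reduction.
    perm_reduced = bool((old_mode & 0o777) & ~(new_mode & 0o777))
    return owner_changed or perm_reduced


# chmod go-rwx on a directory created with umask 022 (0o755 -> 0o700)
print(must_sync(True, 0o755, 0o700, (1000, 1000), (1000, 1000)))  # -> True
# Widening permissions (0o700 -> 0o755) does not need a synchronous commit
print(must_sync(True, 0o700, 0o755, (1000, 1000), (1000, 1000)))  # -> False
```

With sync_permission=False the predicate always returns False, which corresponds to the asynchronous, local-filesystem-like behaviour described above.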
Issue Links
- is duplicated by LU-3671 why are permission changes synchronous? (Resolved)