[LUDOC-180] documentation for the MDS sync_permission /proc tunable Created: 04/Sep/13  Updated: 07/Nov/18

Status: Open
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: Lustre Manual Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-3671 why are permission changes synchronous? Resolved
Related
Severity: 3
Rank (Obsolete): 10069

 Description   

The sync_permission tunable relates to Version Based Recovery (VBR), a mechanism that allows some clients to recover from an MDS failure even in the (not so uncommon) case that one or more other active clients fail at the same time as the MDS and do not reconnect to the MDS during its recovery window. Consider version-based recovery in the following situation:

# umask is 022, so dir1 is created with rwxr-xr-x permission
user1@client1$ mkdir /mnt/lustre/dir1
user2@client2$ mkdir /mnt/lustre/otherdir
user1@client1$ chmod go-rwx /mnt/lustre/dir1
user1@client3$ touch /mnt/lustre/dir1/secretfile3
user1@client4$ touch /mnt/lustre/dir1/secretfile4
:
:

If client2 fails at the same time as the MDS after the touch commands are run, with the creation of dir1 committed to the MDT disk but the chmod only in cache, and client2 does not participate in MDS recovery, non-VBR recovery would prevent client[34] from creating secretfile[34] because there would be a gap in the MDT transaction sequence, even though there is no dependency between these files and otherdir. Similarly, if client1 failed concurrently with the MDS before the chmod was committed, then secretfile3 and secretfile4 could not be recovered, even if the dir1 creation was committed on the MDT before it crashed.

With VBR, the replay for secretfile3 and secretfile4 would be dependent on the version of dir1 (the transaction number in which dir1 was created or last modified), and not on each other. If the MDS and client2 fail concurrently, none of the operations that client2 did on the MDT affect dir1, so the other files can be recreated during recovery from the remaining clients, and only files created by the failing client2 node would be lost.
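
The difference between the two replay strategies can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the MDT implementation: the Op record, the replay_by_sequence and replay_by_version functions, and the transaction numbers 1 to 5 are all hypothetical, an object's version is modelled as the transaction number of the last operation that created or modified it, and creating an entry is assumed not to change the parent directory's version.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    transno: int      # hypothetical MDT transaction number
    client: str       # client that issued the operation
    name: str         # object created or modified by the operation
    dep: str          # object whose version the operation depends on
    dep_version: int  # version of `dep` seen when the operation first ran

ROOT = "/"

# Assumption: mkdir dir1 reached disk before the crash; the rest was only in cache.
COMMITTED = [Op(1, "client1", "dir1", ROOT, 0)]
UNCOMMITTED = [
    Op(2, "client2", "otherdir", ROOT, 0),
    Op(3, "client1", "dir1", "dir1", 1),         # chmod go-rwx dir1
    Op(4, "client3", "secretfile3", "dir1", 3),
    Op(5, "client4", "secretfile4", "dir1", 3),
]

def surviving(failed_client):
    """Operations that can be replayed when one client never reconnects."""
    return [op for op in UNCOMMITTED if op.client != failed_client]

def replay_by_sequence(ops, last_committed):
    """Non-VBR model: replay in transno order and stop at the first gap."""
    replayed, expected = [], last_committed + 1
    for op in sorted(ops, key=lambda o: o.transno):
        if op.transno != expected:
            break
        replayed.append(op.transno)
        expected += 1
    return replayed

def replay_by_version(ops, committed):
    """VBR model: replay an op if the object it depends on has the expected version."""
    versions = {ROOT: 0}
    for op in committed:
        versions[op.name] = op.transno
    replayed = []
    for op in sorted(ops, key=lambda o: o.transno):
        if versions.get(op.dep) == op.dep_version:
            versions[op.name] = op.transno
            replayed.append(op.transno)
    return replayed

if __name__ == "__main__":
    for failed in ("client2", "client1"):
        ops = surviving(failed)
        print(failed, "lost:",
              "sequence", replay_by_sequence(ops, 1),
              "vbr", replay_by_version(ops, COMMITTED))
```

In this model, losing client2 blocks every replay under sequence-based recovery, while version-based replay still recovers the chmod and both secret files. Losing client1 instead leaves the two file creations unreplayable under either strategy, because the version of dir1 they depend on (the uncommitted chmod) can no longer be reproduced.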

The sync_permission flag is concerned with avoiding the case where client1 fails after creating dir1 and running chmod, but the MDS only committed the mkdir and not the chmod before it failed. That would potentially allow file creation in dir1 to succeed after the MDS has recovered, in a directory that does not have the correct permissions. With a local filesystem, the failure of the (only) node is obvious to the user. With a distributed filesystem it is possible that the node fails out of sight of the user.

If sync_permission is enabled (the default), then a reduction of the permissions or a change of ownership of a directory is a synchronous operation. If sync_permission is disabled, then MDS permission changes are asynchronous, which is equivalent to the behaviour of a local filesystem.
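
The rule described above can be written as a small predicate. This is a minimal sketch of the behaviour as stated in this ticket, not the MDT code: the function names and parameters are hypothetical, and "reduction of the permission" is assumed to mean that at least one mode bit set in the old mode is cleared in the new mode.

```python
import stat

def removes_permission(old_mode, new_mode):
    """True if any permission bit set in old_mode is cleared in new_mode."""
    old_bits = stat.S_IMODE(old_mode)
    new_bits = stat.S_IMODE(new_mode)
    return (old_bits & ~new_bits) != 0

def must_commit_synchronously(sync_permission, is_dir,
                              old_mode, new_mode, old_owner, new_owner):
    """Hypothetical decision: should this attribute change be committed before replying?"""
    if not sync_permission or not is_dir:
        # Tunable disabled, or not a directory: asynchronous commit as on a local filesystem.
        return False
    return removes_permission(old_mode, new_mode) or old_owner != new_owner

if __name__ == "__main__":
    # chmod go-rwx on a directory created with umask 022 (rwxr-xr-x to rwx------)
    print(must_commit_synchronously(True, True, 0o755, 0o700, ("user1", "grp"), ("user1", "grp")))
    # The same change with sync_permission disabled
    print(must_commit_synchronously(False, True, 0o755, 0o700, ("user1", "grp"), ("user1", "grp")))
```

Under these assumptions the chmod go-rwx from the example would be committed synchronously when sync_permission is enabled, while a change that only adds permission bits would not.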


Generated at Sat Feb 10 03:40:52 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.