Ned,
the specific problem being avoided here relates to Version Based Recovery (VBR), a mechanism that allows some clients to recover from an MDS failure even in the (not so uncommon) case that one or more other active clients do not reconnect to the MDS and do their own recovery. Basic version-based recovery helps in the following situation:
If client2 fails at the same time as the MDS (right after dir1 is created) and does not participate in MDS recovery, old Lustre recovery would prevent client[34] from creating file[34] because there would be a gap in the MDS transaction sequence, even though there is no dependency between these files and dir2. Similarly, if client1 failed, then file2 and file3 would not be able to recover, even if the dir1 creation was committed on the MDT before it crashed.
With VBR, the replay for file3 and file4 would be dependent on the version of dir1 (transaction number in which dir1 was created/last modified), and not on each other. That would allow the files to be recreated from any running client, and only files created by the failing node would be lost.
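To make the difference concrete, here is a toy model of the two replay strategies. It is only a sketch and not Lustre code: the ReplayOp record, the transaction numbers, and the rule that creating a child does not change the parent's version are all assumptions chosen to match the description above (dir1 committed by client1, an unrelated dir2 operation lost with client2, file3 and file4 replayed by client3 and client4).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplayOp:
    """Hypothetical replay record; field names are illustrative only."""
    transno: int                  # transaction number assigned by the MDS
    client: str                   # client that holds this request for replay
    target: str                   # object created by the operation
    parent: Optional[str]         # object the operation depends on, if any
    parent_version: int           # parent's version seen at original execution


def replay_sequential(ops, last_committed):
    """Old-style recovery: replay in strict transno order, stop at the first gap."""
    replayed = []
    expected = last_committed + 1
    for op in sorted(ops, key=lambda o: o.transno):
        if op.transno != expected:
            break                 # gap in the sequence: nothing later can replay
        replayed.append(op.target)
        expected += 1
    return replayed


def replay_version_based(ops, committed_versions):
    """VBR-style recovery: replay an op if its parent still has the recorded version."""
    versions = dict(committed_versions)
    replayed = []
    for op in sorted(ops, key=lambda o: o.transno):
        if op.parent is not None and versions.get(op.parent) != op.parent_version:
            continue              # dependency missing or changed: skip this op only
        # Assumption of this toy model: a create sets the version of the new
        # object but leaves the parent's version untouched.
        versions[op.target] = op.transno
        replayed.append(op.target)
    return replayed


# transno 10 (mkdir dir1 by client1) was committed before the crash.
# transno 11 (client2's operation on dir2) is lost because client2 never reconnects.
surviving_ops = [
    ReplayOp(12, "client3", "dir1/file3", "dir1", 10),
    ReplayOp(13, "client4", "dir1/file4", "dir1", 10),
]

old_result = replay_sequential(surviving_ops, last_committed=10)
vbr_result = replay_version_based(surviving_ops, {"dir1": 10})
print(old_result)   # []
print(vbr_result)   # ['dir1/file3', 'dir1/file4']
```

In this model the sequential replay recovers nothing because transno 11 is missing, while the version-based replay recovers both files since each one depends only on the committed version of dir1.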
The sync_permission flag is concerned with avoiding the case where client1 fails after creating dir1 and running chmod, but the MDS only committed the mkdir and not the chmod before it fails. That would potentially allow the file creations to be replayed in a directory that does not have the correct permissions.
Mike, thinking about this further, is the version of dir1 changed by the chmod, so that the later file creates depend on the new version of dir1 and not the old one? That would also prevent the later files from being created, without needing any sync at all, though in most cases where permission changes are not being done this would increase the number of unreplayable RPCs in case of MDS failure. Could you please further clarify what specific problem the sync_permission behaviour is avoiding?
I filed LUDOC-180 to track the documentation for this /proc tunable, and this one can be closed since the patch to avoid sync operations for regular files and non-permission setattrs has landed for 2.5.0.
It could potentially also be cherry-picked for 2.4.x and 2.1.x.