Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • None
    • Lustre 2.8.0
    • None
    • 3
    • 9223372036854775807

    Description

      I was asked to file a ticket and summarize defects in the landed implementation of SELinux support which had been reported prior to landing:

      1) [performance] create and setxattr are separate RPCs which makes things slow
      2) [recovery] create and setxattr are separate RPCs, if the client crashes in between of create and setxattr, the file will not get a security label; client kernels will use default security labels in this case, but default label concept is not to work around file system bugs and no default label will be consistent with SELinux security model
      3) [atomicity] create and setxattr are separate RPCs, if another client accesses the same file, it will see no security label, it will raise the same issues as in (2)
      4) [consistent file system view from different clients] although initial versions of the landed patch made attempts to synchronize relabel operations among clients, the final patch does not implement any synchronization, so inodes in memory will keep old security labels in inode->i_security (this is documented in LU-5560)

      Attachments

        Issue Links

          Activity

            [LU-6784] Defects in SELinux support
            green Oleg Drokin added a comment -

            Chris, I don't entirely agree with you about that my argument is weak.

            (opensource) Lustre is a kit, but Andoird AOSP project as released by Google is also a kit. You do some legork to actually make that work on your device.
            Or you can buy a premade device with the OS already tuned for it (from Google or whomever). And you can buy a Lustre appliance with the loose bits tied too.

            Otherwise I feel we'll end up claiming that there's no failover in Lustre because you need to read documentation about how to do it and if you do it improperly - your filesystem would be damaged (doublemount and stuff). And perhaps other examples of the same (HSM is probably magnitudes more nuanced).

            If you ask me, I feel that single client node SELinux support is in itself of a very limited use, it's like that lock that only keeps the honest people honest. Mount unauthorised or not-SELinux enabled or misconfigured client and the security is no more.
            To have a robust solution we need to at least:
            Enforce same policies on the servers (+protocol changes to actually transfer contexts around for every RPC - it's possible that existing bits there would somewhat work)
            Authenticate clients and assign different roles (in SELinux policy speak) to different classes of clients (also some sort of kernel code signing?)
            Disallow (or put in a separate role?) non-SELinux enabled clients.
            Either replicate file contexts on OSTs (inflexible) or do access validation via MDS (more expensive) to avoid bad clients coming straight to OSTs and trying to read objects directly.
            Do server validation (to ensure nobody registers phoney servers).

            There are probably other must-haves, that I cannot think off the top of my head.

            green Oleg Drokin added a comment - Chris, I don't entirely agree with you about that my argument is weak. (opensource) Lustre is a kit, but Andoird AOSP project as released by Google is also a kit. You do some legork to actually make that work on your device. Or you can buy a premade device with the OS already tuned for it (from Google or whomever). And you can buy a Lustre appliance with the loose bits tied too. Otherwise I feel we'll end up claiming that there's no failover in Lustre because you need to read documentation about how to do it and if you do it improperly - your filesystem would be damaged (doublemount and stuff). And perhaps other examples of the same (HSM is probably magnitudes more nuanced). If you ask me, I feel that single client node SELinux support is in itself of a very limited use, it's like that lock that only keeps the honest people honest. Mount unauthorised or not-SELinux enabled or misconfigured client and the security is no more. To have a robust solution we need to at least: Enforce same policies on the servers (+protocol changes to actually transfer contexts around for every RPC - it's possible that existing bits there would somewhat work) Authenticate clients and assign different roles (in SELinux policy speak) to different classes of clients (also some sort of kernel code signing?) Disallow (or put in a separate role?) non-SELinux enabled clients. Either replicate file contexts on OSTs (inflexible) or do access validation via MDS (more expensive) to avoid bad clients coming straight to OSTs and trying to read objects directly. Do server validation (to ensure nobody registers phoney servers). There are probably other must-haves, that I cannot think off the top of my head.

            Hi,

            Enabling SELinux on a single node is in itself a source of complication for administrators: things that were just working out of the box without SELinux simply do not do anymore, and require fine security tuning to be fully functional again. Then one can imagine that enabling SELinux on a whole cluster necessarily implies administrative headaches. So I think we have to make clear that sites enabling SELinux on Lustre client nodes will inevitably see an increase in their support work. SELinux is complex and may require dedicated security administrators to maintain clusters in operational security conditions.
            With appropriate documentation, the current implementation of SELinux support on Lustre client node could be considered as a tech preview with some restrictions.

            This ticket could be the occasion to discuss the following technical points:
            a) [atomicity]
            b) [consistent file system view from different clients]
            I consider [performance] and [recovery] will be solved when [atomicity] is addressed.

            a) [atomicity]
            In gerrit (http://review.whamcloud.com/11648), my proposition to address atomicity is to use a new security primitive (available from rhel7) named security_dentry_init_security(). It could be called from ll_lookup_it() as it only needs a dentry. Then security information could be sent with the lookup request (presumably creating a new request type and/or making a protocol change) so that the MDT can set the xattr in the same transaction as the file creation.
            This proposition is only valid for rhel7, so far I cannot see an equivalent for rhel6.

            b) [consistent file system view from different clients]
            Coherency could be addressed thanks to a lock: setting or updating a security context would require a RW lock, and accessing a file would require a RO lock. What is unclear to me is the following: can we use for that an already existing lock type, or should we create a new lock type? Moreover, is there a description available somewhere about how to take and release locks in Lustre? or maybe some intelligible parts of the code that could be taken as an example to implement what we need here?

            Of course, any technical remark or proposition would be much appreciated.
            Thanks,
            Sebastien.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment - Hi, Enabling SELinux on a single node is in itself a source of complication for administrators: things that were just working out of the box without SELinux simply do not do anymore, and require fine security tuning to be fully functional again. Then one can imagine that enabling SELinux on a whole cluster necessarily implies administrative headaches. So I think we have to make clear that sites enabling SELinux on Lustre client nodes will inevitably see an increase in their support work. SELinux is complex and may require dedicated security administrators to maintain clusters in operational security conditions. With appropriate documentation, the current implementation of SELinux support on Lustre client node could be considered as a tech preview with some restrictions. This ticket could be the occasion to discuss the following technical points: a) [atomicity] b) [consistent file system view from different clients] I consider [performance] and [recovery] will be solved when [atomicity] is addressed. a) [atomicity] In gerrit ( http://review.whamcloud.com/11648 ), my proposition to address atomicity is to use a new security primitive (available from rhel7) named security_dentry_init_security(). It could be called from ll_lookup_it() as it only needs a dentry. Then security information could be sent with the lookup request (presumably creating a new request type and/or making a protocol change) so that the MDT can set the xattr in the same transaction as the file creation. This proposition is only valid for rhel7, so far I cannot see an equivalent for rhel6. b) [consistent file system view from different clients] Coherency could be addressed thanks to a lock: setting or updating a security context would require a RW lock, and accessing a file would require a RO lock. What is unclear to me is the following: can we use for that an already existing lock type, or should we create a new lock type? Moreover, is there a description available somewhere about how to take and release locks in Lustre? or maybe some intelligible parts of the code that could be taken as an example to implement what we need here? Of course, any technical remark or proposition would be much appreciated. Thanks, Sebastien.

            I would like to try to now steer this conversation back to a discussion of technical merits.

            I readily admit that I a may not have a deep enough understanding of the details yet to fully form a position. Nevertheless, Andrew's original point in the description section of this ticket seems valid to me, and worth more discussion.

            Oleg's counter arguments that Android does the same thing and we can resolve the issue with documentation seem a bit weak to me.

            First, Android distros control the entire OS environment. Lustre does not. Lustre only controls Lustre.

            This design appears to be insecure by default, and it is up to the many system administrators to plug the hole in the design. The documentation of that fact will no doubt be burried in the giant operations manual, and I think it will be very common for system administrators to miss that requirement.

            Even if all system administrators somehow manage to correctly configure their systems to be secure, any time this problem occurs it sounds like we will be left with files/directories that can only be cleaned up though administrative action. After faults, users will be left with files/directories that they cannot remove. This will generate more support tickets. The system administrators will probably not remember that the unremovable files are due to SELinux, so the problem will escalate to higher levels of support: local software developers or support vendors.

            Our design should not add more work for support staff. Our design should not have defaults that are insecure and rely on human intervention to fix that.

            I am still open to being convinced otherwise, but so far I am tending to agree that it was not yet ready for landing. This should probably be worked on a topic branch and reach a further level of completeness before merging into master.

            morrone Christopher Morrone (Inactive) added a comment - I would like to try to now steer this conversation back to a discussion of technical merits. I readily admit that I a may not have a deep enough understanding of the details yet to fully form a position. Nevertheless, Andrew's original point in the description section of this ticket seems valid to me, and worth more discussion. Oleg's counter arguments that Android does the same thing and we can resolve the issue with documentation seem a bit weak to me. First, Android distros control the entire OS environment. Lustre does not. Lustre only controls Lustre. This design appears to be insecure by default, and it is up to the many system administrators to plug the hole in the design. The documentation of that fact will no doubt be burried in the giant operations manual, and I think it will be very common for system administrators to miss that requirement. Even if all system administrators somehow manage to correctly configure their systems to be secure, any time this problem occurs it sounds like we will be left with files/directories that can only be cleaned up though administrative action. After faults, users will be left with files/directories that they cannot remove. This will generate more support tickets. The system administrators will probably not remember that the unremovable files are due to SELinux, so the problem will escalate to higher levels of support: local software developers or support vendors. Our design should not add more work for support staff. Our design should not have defaults that are insecure and rely on human intervention to fix that. I am still open to being convinced otherwise, but so far I am tending to agree that it was not yet ready for landing. This should probably be worked on a topic branch and reach a further level of completeness before merging into master.

            Intel blocked new connection flag landing when there was discussion on porting SDA code upstream in later development phases.

            That is not accurate. On the contrary, Intel employees gave the patch positive reviews and were ready to land it. I was the person who blocked that patch. I am an employee of LLNL and a member of the leadership within OpenSFS.

            morrone Christopher Morrone (Inactive) added a comment - Intel blocked new connection flag landing when there was discussion on porting SDA code upstream in later development phases. That is not accurate. On the contrary, Intel employees gave the patch positive reviews and were ready to land it. I was the person who blocked that patch. I am an employee of LLNL and a member of the leadership within OpenSFS.

            We expect a fair play if you are employed by OpenSFS to work as GateKepeer and do development for OpenSFS, but i don't see it for long time.

            shadow Alexey Lyashkov added a comment - We expect a fair play if you are employed by OpenSFS to work as GateKepeer and do development for OpenSFS, but i don't see it for long time.

            Andreas,

            may i remember you many my patches which you forbid to land because it isn't fully finished?
            Why you land bad patched which needs for Intel and say "we are free to contribute fixes" and don't like to land other?
            if feature isn't ready and produce a regressions - it feature should be forbid to land.. isn't ?

            shadow Alexey Lyashkov added a comment - Andreas, may i remember you many my patches which you forbid to land because it isn't fully finished? Why you land bad patched which needs for Intel and say "we are free to contribute fixes" and don't like to land other? if feature isn't ready and produce a regressions - it feature should be forbid to land.. isn't ?
            panda Andrew Perepechko added a comment - - edited

            I don't see how SELinux support in SDA, if there is any, is related to this ticket. Intel blocked new connection flag landing when there was discussion on porting SDA code upstream in later development phases. The patch for the connection flag is still in gerrit, I believe. So, it's clearly Intel who broke compatibility, if there IS any SELinux support in SDA. (I'm not part of the SDA team and not aware of the current status of the development)

            The reference to time and resources is weird. All defects listed above are known since September 2014, for 10 months already (issue 4 was described by the author in detail, others were described in gerrit). In order to help improve the code, I shared our experience from design phase in gerrit comments, though I wasn't formally asked to review the patch and had no formal obligation to comment it. I believe, only my comment about setstripe creation path was taken into account.

            This ticket is not about preventing moving SELinux support forward. It just states what needs to be addressed to move forward and make the feature usable.

            Actually, quite insignificant defects many times were the reason of rejecting a patch at the Intel side. However, this time there is tolerance and optimistic belief that there will be some movement forward, that we are preventing by just speaking of defects, after 10 months of not moving forward. Double standard?

            panda Andrew Perepechko added a comment - - edited I don't see how SELinux support in SDA, if there is any, is related to this ticket. Intel blocked new connection flag landing when there was discussion on porting SDA code upstream in later development phases. The patch for the connection flag is still in gerrit, I believe. So, it's clearly Intel who broke compatibility, if there IS any SELinux support in SDA. (I'm not part of the SDA team and not aware of the current status of the development) The reference to time and resources is weird. All defects listed above are known since September 2014, for 10 months already (issue 4 was described by the author in detail, others were described in gerrit). In order to help improve the code, I shared our experience from design phase in gerrit comments, though I wasn't formally asked to review the patch and had no formal obligation to comment it. I believe, only my comment about setstripe creation path was taken into account. This ticket is not about preventing moving SELinux support forward. It just states what needs to be addressed to move forward and make the feature usable. Actually, quite insignificant defects many times were the reason of rejecting a patch at the Intel side. However, this time there is tolerance and optimistic belief that there will be some movement forward, that we are preventing by just speaking of defects, after 10 months of not moving forward. Double standard?

            My understanding is that the atomic setting of the SELinux xattr is something that Seagate/Xyratex has already implemented for SDA feature LU-5074, and we would welcome your contribution of those patches to the GPL LICENSED CODE THAT YOU ARE SHIPPING TO CUSTOMERS.

            In the absence of those patches, we are moving forward with the SELinux functionality as time and resources permit, and it will very likely be incompatible with whatever you have currently implemented. If you aren't willing to contribute code that you already have to fix this, then please don't just provide objections to prevent it moving forward independently.

            adilger Andreas Dilger added a comment - My understanding is that the atomic setting of the SELinux xattr is something that Seagate/Xyratex has already implemented for SDA feature LU-5074 , and we would welcome your contribution of those patches to the GPL LICENSED CODE THAT YOU ARE SHIPPING TO CUSTOMERS. In the absence of those patches, we are moving forward with the SELinux functionality as time and resources permit, and it will very likely be incompatible with whatever you have currently implemented. If you aren't willing to contribute code that you already have to fix this, then please don't just provide objections to prevent it moving forward independently.
            green Oleg Drokin added a comment -

            It should be noted that the items #2 and #3 are not new and a very similar issue is present in Android implementation.
            The way they address it there is to have default context for all unlabelled files to be super restrictive such that nothing can access them. Then we allow the creator to change that to something else and make it actually accessible.
            That way half-done objects are at least safe from other people accessing them.
            This would need to be explained in documentation and all that of course.

            green Oleg Drokin added a comment - It should be noted that the items #2 and #3 are not new and a very similar issue is present in Android implementation. The way they address it there is to have default context for all unlabelled files to be super restrictive such that nothing can access them. Then we allow the creator to change that to something else and make it actually accessible. That way half-done objects are at least safe from other people accessing them. This would need to be explained in documentation and all that of course.

            2 points:
            1) the review revealed the recovery defects which were not resolved before the land; we should try to avoid such a practice;
            2) the patch suggests to use a not reliable security (in the recovery case); however, this is not about some attributes, this is about security, who wants to use a not reliable security?
            => should we probably revert a patch or document somewhere “the selinux support is not reliable, will be fixed in 2.9” would be enough ?

            vitaly_fertman Vitaly Fertman added a comment - 2 points: 1) the review revealed the recovery defects which were not resolved before the land; we should try to avoid such a practice; 2) the patch suggests to use a not reliable security (in the recovery case); however, this is not about some attributes, this is about security , who wants to use a not reliable security? => should we probably revert a patch or document somewhere “the selinux support is not reliable, will be fixed in 2.9” would be enough ?

            Also, though not a terrible defect, it could be beneficial to avoid prepping dentry and inode in ll_dir_setstripe() if SELinux is disabled.

            panda Andrew Perepechko added a comment - Also, though not a terrible defect, it could be beneficial to avoid prepping dentry and inode in ll_dir_setstripe() if SELinux is disabled.

            People

              wc-triage WC Triage
              panda Andrew Perepechko
              Votes:
              0 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: