Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.8.0
    • Severity: 3

    Description

      I was asked to file a ticket summarizing the defects in the landed implementation of SELinux support that had been reported prior to landing:

      1) [performance] create and setxattr are separate RPCs, which makes these operations slow
      2) [recovery] create and setxattr are separate RPCs; if the client crashes between create and setxattr, the file will never get a security label. Client kernels will fall back to a default security label in this case, but the default label concept is not meant to work around file system bugs, and no default label will be consistent with the SELinux security model
      3) [atomicity] create and setxattr are separate RPCs; if another client accesses the file between the two, it will see no security label, which raises the same issues as in (2) (see the sketch after this list)
      4) [consistent file system view from different clients] although initial versions of the landed patch attempted to synchronize relabel operations among clients, the final patch does not implement any synchronization, so in-memory inodes will keep stale security labels in inode->i_security (this is documented in LU-5560)
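
      To make the window concrete, here is a minimal sketch of the current two-RPC sequence (hypothetical, simplified client-side logic; mdc_create_rpc() and mdc_setxattr_rpc() are illustrative placeholders, not the actual llite/mdc functions):

      /* Illustrative sketch only: the two separate RPCs the client sends today.
       * Both helpers below are hypothetical placeholders. */
      extern int mdc_create_rpc(const char *name, unsigned int mode);
      extern int mdc_setxattr_rpc(const char *name, const char *xattr,
                                  const void *value, unsigned int len);

      int client_create_with_label(const char *name, unsigned int mode,
                                   const char *label, unsigned int label_len)
      {
              int rc;

              rc = mdc_create_rpc(name, mode);   /* RPC 1: file now exists on the MDT */
              if (rc != 0)
                      return rc;

              /* Window: if the client crashes here, or another client looks the
               * file up now, the file is visible with no security label at all
               * (defects 2 and 3 above). */

              return mdc_setxattr_rpc(name, "security.selinux",
                                      label, label_len); /* RPC 2: label attached */
      }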

    Activity

            [LU-6784] Defects in SELinux support

            These issues have been resolved in other tickets.

            adilger Andreas Dilger added a comment

            Hi there,

            Actually, this is kind of a disadvantage of the approach. On the contrary, having policies applied on the MDS avoids any possible policy desync.

            Do you mean that in the solution with policies applied on the MDS, SELinux is not enabled on the Lustre clients? In that case, how can the security policy be applied properly without knowing the security context of the binary performing the action on Lustre?
            Moreover, I would tend to think that security administrators who want SELinux for Lustre also want it for the root file system on their cluster's nodes. In other words, the customer approach is rather the following: now that SELinux is enabled on the cluster's nodes, how does it work when these nodes access Lustre?
            In that case, having SELinux enforced on the MDS (requiring an up-to-date policy equivalent to the one on the clients) would represent an additional burden for security administrators.

            It is unclear to me, though, what benefit is actually granted in your example versus standard posix file permissions.

            The example of sshd and the authorized keys file explains what happens when a user connects via ssh to a node where SELinux is enabled. This is about making that work properly on Lustre when SELinux is enforced, hence getting the benefits of SELinux.
            By definition, SELinux brings mandatory access-control policies that confine user programs' and system servers' access to files and network resources. Confinement is stronger than standard posix file permissions.
            Of course, everyone willing to play with SELinux should be aware of what it is.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment

            Sebastien, thank you for your example from July 10th.

            It is unclear to me, though, what benefit is actually granted in your example versus standard posix file permissions. It sounds to me like this implements the SELinux API on the client, but fails to provide any actual security benefit. This sounds very dangerous. I suspect that most users and system administrators will simply hear that SELinux is now available in Lustre, and fail to recognize that it isn't really providing more security. That could put a lot of people at risk of failing security audits.

            I have to agree with Andrew too, that the current design is a disadvantage, not a benefit. You are shifting security burden from the software onto human beings to correctly and uniformly configure all possible client nodes. Humans are error-prone. It also makes security auditing more difficult. Configuring the security policies once on the central server, if that is at all reasonable to do in software, sounds like a better approach.

            morrone Christopher Morrone (Inactive) added a comment

            the cluster is simpler to administer from a security point of view: all it needs is to have the exact same SELinux policy on every single Lustre client node.

            Actually, this is kind of a disadvantage of the approach. On the contrary, having policies applied on the MDS avoids any possible policy desync.

            panda Andrew Perepechko added a comment

            > What the current patch does is to properly initialize the security context of a file created on an SELinux-enabled client, and store this security context on the server side (MDT) via the security.selinux extended attribute.

            This will not work without SELinux policy changes, because SELinux honors the xattr only on file systems that are specially marked in the policy.

            > Consider the home directories of the cluster's users. Every user has a .ssh directory in his/her home, containing a file with the list of authorized keys. When a remote connection is incoming, the sshd daemon checks the contents of this file to see if the incoming ssh key matches one referenced in the file.

            It looks like you forgot one small note: if you enable SELinux, it will affect all files, not only the ssh home directory where a static label exists.
            So whenever a new file is created, it needs to be assigned a new label.

            > Conversely, setting the security context on the server side would also require SELinux to be enforced on the MDS.

            This will not work, because the SELinux hooks are called from VFS functions, and the MDT/OST code paths contain no such calls.
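
            For reference, the hook placement in the Linux VFS looks roughly like this (condensed from fs/namei.c, with error paths and notifications trimmed); server-side object creation on the MDT/OST never goes through this path, so the hook is never invoked there:

            int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
                           bool want_excl)
            {
                    int error = may_create(dir, dentry);
                    if (error)
                            return error;

                    /* LSM hook: this is where SELinux computes and checks the
                     * new file's label on a local file system. */
                    error = security_inode_create(dir, dentry, mode);
                    if (error)
                            return error;

                    return dir->i_op->create(dir, dentry, mode, want_excl);
            }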

            > Protecting Lustre file systems from being mounted by SELinux-disabled clients is not SELinux's responsibility. Mechanisms like Kerberos or Shared Keys can address these problems very well.

            That can be done if you change the protocol correctly. The other solutions mentioned do not address this problem, because the knowledge of whether a client supports SELinux or not is hidden inside Lustre and is not exported to user land.

            You also forget about the problem of different clients having different SELinux policies...

            shadow Alexey Lyashkov added a comment

            What the current patch does is to properly initialize the security context of a file created on an SELinux-enabled client, and store this security context on the server side (MDT) via the security.selinux extended attribute.

            I can give one very simple example to help understand the utility of this.
            Consider the home directories of the cluster's users. Every user has a .ssh directory in his/her home, containing a file with the list of authorized keys. When a remote connection is incoming, the sshd daemon checks the contents of this file to see if the incoming ssh key matches one referenced in the file.
            When SELinux is enforced, the sshd daemon can only access the authorized keys file under ~/.ssh if this file has the ssh_home_t security context. If this file has no security context, or the default_t one, then one of the two following actions will be required in order to get ssh to work:

            • the security administrator will have to relabel the file so that it gets the ssh_home_t security context; this requires scanning the file system;
            • the security administrator will have to extend the SELinux policy so that the sshd daemon is allowed to read a file that has the default_t security context; this defeats the confinement concept that is the foundation of SELinux.
              Local file systems like ext4 store files' security contexts on disk so that this information is persistent.

            Now consider that these home directories are located on a Lustre file system. By storing the files' security contexts permanently on the MDT, the current patch, like a local file system, avoids resorting to the two admin actions above.
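
            To illustrate that the stored label is an ordinary extended attribute, here is a small userspace program (standard getxattr(2); the context shown in the comment is just an example value) that reads it back from any client:

            #include <stdio.h>
            #include <sys/types.h>
            #include <sys/xattr.h>

            int main(int argc, char **argv)
            {
                    char label[256];
                    ssize_t len;

                    if (argc != 2) {
                            fprintf(stderr, "usage: %s <path>\n", argv[0]);
                            return 1;
                    }

                    /* The MDT persists this xattr just like ext4 does on disk. */
                    len = getxattr(argv[1], "security.selinux",
                                   label, sizeof(label) - 1);
                    if (len < 0) {
                            perror("getxattr");
                            return 1;
                    }
                    label[len] = '\0';
                    printf("%s: %s\n", argv[1], label); /* e.g. system_u:object_r:ssh_home_t:s0 */
                    return 0;
            }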

            Setting the security context from the client side has two major advantages:

            • there is no need for SELinux on the server side: the MDT only stores an extended attribute that is provided by the Lustre client;
            • the cluster is simpler to administer from a security point of view: all it needs is to have the exact same SELinux policy on every single Lustre client node.
              Conversely, setting the security context on the server side would also require SELinux to be enforced on the MDS. And this would require an up-to-date policy equivalent to the one on the clients, but necessarily adapted, because the paths to the files on the MDT differ from the ones on the clients that mount Lustre.

            Protecting Lustre file systems from being mounted by SELinux-disabled clients is not SELinux's responsibility. Mechanisms like Kerberos or Shared Keys can address these problems very well.
            It shows that securing a cluster must be thought of as a whole.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment

            It would probably aid the conversation to enumerate a use case or two. That might help people understand what exactly it is that we are getting with the current patch, and whether it made sense to land it as-is, or whether it should have waited for further work.

            As it is, I suspect that quite a few people wishing to use the SELinux work may not understand the current patch's scope and limitations.

            morrone Christopher Morrone (Inactive) added a comment

            BTW forgot to add: despite all of the must-haves, even the limited support being added here has some uses, as evidenced by the people standing behind these patches and planning to use them.

            green Oleg Drokin added a comment

            Chris, I don't entirely agree with you that my argument is weak.

            (Open-source) Lustre is a kit, but the Android AOSP project as released by Google is also a kit. You do some legwork to actually make it work on your device.
            Or you can buy a premade device with the OS already tuned for it (from Google or whomever). And you can buy a Lustre appliance with the loose bits tied together, too.

            Otherwise I feel we'll end up claiming that there's no failover in Lustre because you need to read documentation about how to do it, and if you do it improperly your file system would be damaged (double-mounting and the like). And there are probably other examples of the same (HSM is likely orders of magnitude more nuanced).

            If you ask me, I feel that single-client-node SELinux support is in itself of very limited use; it's like the lock that only keeps the honest people honest. Mount an unauthorized, non-SELinux-enabled, or misconfigured client and the security is no more.
            To have a robust solution we need to at least:

            • Enforce the same policies on the servers (plus protocol changes to actually transfer contexts around for every RPC; it's possible that the existing bits there would somewhat work).
            • Authenticate clients and assign different roles (in SELinux policy speak) to different classes of clients (also some sort of kernel code signing?).
            • Disallow (or put in a separate role?) non-SELinux-enabled clients.
            • Either replicate file contexts on the OSTs (inflexible) or do access validation via the MDS (more expensive), to avoid bad clients going straight to the OSTs and trying to read objects directly.
            • Do server validation (to ensure nobody registers phoney servers).

            There are probably other must-haves that I cannot think of off the top of my head.

            green Oleg Drokin added a comment

            Hi,

            Enabling SELinux on a single node is in itself a source of complication for administrators: things that just worked out of the box without SELinux simply no longer do, and require fine security tuning to be fully functional again. One can then imagine that enabling SELinux on a whole cluster necessarily implies administrative headaches. So I think we have to make clear that sites enabling SELinux on Lustre client nodes will inevitably see an increase in their support work. SELinux is complex and may require dedicated security administrators to maintain clusters in operational security conditions.
            With appropriate documentation, the current implementation of SELinux support on Lustre client nodes could be considered a tech preview with some restrictions.

            This ticket could be the occasion to discuss the following technical points:
            a) [atomicity]
            b) [consistent file system view from different clients]
            I consider that [performance] and [recovery] will be solved once [atomicity] is addressed.

            a) [atomicity]
            In gerrit (http://review.whamcloud.com/11648), my proposal to address atomicity is to use a new security primitive (available from RHEL7) named security_dentry_init_security(). It could be called from ll_lookup_it(), as it only needs a dentry. The security information could then be sent with the lookup request (presumably creating a new request type and/or making a protocol change) so that the MDT can set the xattr in the same transaction as the file creation.
            This proposal is only valid for RHEL7; so far I cannot see an equivalent for RHEL6.
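
            A rough sketch of what the call site could look like, assuming the RHEL7-era hook signature; ll_prepare_secctx() and pack_secctx_into_request() are hypothetical placeholders for the llite change and the protocol change, respectively:

            /* Hypothetical sketch: compute the security context on the client
             * during lookup, so it can travel with the create request and the
             * MDT can set security.selinux in the same transaction. */
            static int ll_prepare_secctx(struct dentry *dentry, umode_t mode,
                                         struct ptlrpc_request *req)
            {
                    void *secctx = NULL;
                    u32 secctxlen = 0;
                    int rc;

                    /* Real primitive, available from RHEL7 (kernel >= 3.11);
                     * only a dentry is needed, so ll_lookup_it() can call it. */
                    rc = security_dentry_init_security(dentry, mode, &dentry->d_name,
                                                       &secctx, &secctxlen);
                    if (rc != 0)
                            return rc;

                    /* Placeholder for the protocol change: ship the context in
                     * the lookup/create request. */
                    return pack_secctx_into_request(req, secctx, secctxlen);
            }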

            b) [consistent file system view from different clients]
            Coherency could be addressed with a lock: setting or updating a security context would require an RW lock, and accessing a file would require an RO lock. What is unclear to me is the following: can we use an existing lock type for that, or should we create a new one? Moreover, is there a description available somewhere of how to take and release locks in Lustre? Or maybe some intelligible parts of the code that could be taken as an example to implement what we need here? (See the analogy sketched below.)
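
            As a userspace analogy of the proposed scheme (a pthread rwlock standing in for whatever Lustre lock type is eventually chosen; the secctx names are illustrative):

            #include <pthread.h>
            #include <string.h>

            static pthread_rwlock_t secctx_lock = PTHREAD_RWLOCK_INITIALIZER;
            static char cached_secctx[256];

            /* Relabel: exclusive (RW) lock, as proposed for setting or
             * updating a security context. */
            void secctx_update(const char *newctx)
            {
                    pthread_rwlock_wrlock(&secctx_lock);
                    strncpy(cached_secctx, newctx, sizeof(cached_secctx) - 1);
                    cached_secctx[sizeof(cached_secctx) - 1] = '\0';
                    pthread_rwlock_unlock(&secctx_lock);
            }

            /* Access: shared (RO) lock, as proposed for ordinary file access,
             * so readers never observe a half-updated label. */
            void secctx_read(char *out, size_t outlen)
            {
                    pthread_rwlock_rdlock(&secctx_lock);
                    strncpy(out, cached_secctx, outlen - 1);
                    out[outlen - 1] = '\0';
                    pthread_rwlock_unlock(&secctx_lock);
            }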

            Of course, any technical remark or proposition would be much appreciated.
            Thanks,
            Sebastien.

            sebastien.buisson Sebastien Buisson (Inactive) added a comment

            People

              Assignee: wc-triage WC Triage
              Reporter: panda Andrew Perepechko
              Votes: 0
              Watchers: 19
