[LU-16524] Limit capabilities of local admin Created: 02/Feb/23  Updated: 05/Jun/23  Resolved: 20/May/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0
Fix Version/s: Lustre 2.16.0

Type: Improvement Priority: Minor
Reporter: Sebastien Buisson Assignee: Sebastien Buisson
Resolution: Fixed Votes: 0
Labels: patch, sec

Attachments: Text File lctl-get_param-nodemap-49907.txt     Text File lctl-get_param-nodemap-50184.txt    
Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We might need to support the use case of a 'local' admin: a user who is root on the client and also root on Lustre, so that they can perform tasks such as changing a file's owner or group (so root squash cannot be used), but who is still restricted from some privileged actions (e.g. lfs commands).



 Comments   
Comment by Gerrit Updater [ 02/Feb/23 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49873
Subject: LU-16524 nodemap: add rbac property to nodemap
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2b87dae7d926b0c244c75cde5cb0a1fac7c55a84

Comment by Gerrit Updater [ 06/Feb/23 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49907
Subject: LU-16524 sec: enforce rbac roles
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: fdd15c6a7af7bc0569b1d342c78b0d4b622ed7a5

Comment by Gerrit Updater [ 02/Mar/23 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50184
Subject: LU-16524 sec: add fscrypt_admin rbac role
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5958dcbea1ded00b645da117f6f68cf86ba6168f

Comment by Shuichi Ihara [ 07/Mar/23 ]

Here is a mount failure seen after applying patch https://review.whamcloud.com/#/c/fs/lustre-release/+/50184/

Nodemap configuration:

lctl nodemap_activate 0

lctl nodemap_del trusted
lctl nodemap_del tenant100

lctl nodemap_modify --name default --property trusted --value 0
lctl nodemap_modify --name default --property admin --value 0
lctl nodemap_modify --name default --property deny_unknown --value 1

lctl nodemap_add trusted
lctl nodemap_add_range --name trusted --range 192.168.200.[1-254]@tcp
lctl nodemap_modify --name trusted --property trusted --value 1
lctl nodemap_modify --name trusted --property admin --value 1
lctl nodemap_modify --name trusted --property deny_unknown --value 0

lctl nodemap_add tenant100
lctl nodemap_add_range --name tenant100 --range 192.168.100.[1-254]@tcp
lctl nodemap_modify --name tenant100 --property admin --value 0

lctl nodemap_activate 1

Make sure the NID belongs to nodemap "tenant100":

[root@server ~]#  lctl nodemap_test_nid 192.168.100.2@tcp
tenant100
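
As a rough illustration of why this NID resolves to tenant100, the following Python sketch matches a NID against the two ranges configured above. It is a simplified model and not Lustre's actual range parser: the RANGES table and the nid_in_range and nodemap_for helpers are hypothetical names, and only a bracketed range in the last octet is handled.

```python
import re

# Ranges copied from the nodemap configuration above.
RANGES = {
    "trusted": "192.168.200.[1-254]@tcp",
    "tenant100": "192.168.100.[1-254]@tcp",
}


def nid_in_range(nid, nid_range):
    """Return True if nid falls inside a range of the form a.b.c.[lo-hi]@net."""
    r = re.fullmatch(r"(\d+\.\d+\.\d+)\.\[(\d+)-(\d+)\]@(\w+)", nid_range)
    n = re.fullmatch(r"(\d+\.\d+\.\d+)\.(\d+)@(\w+)", nid)
    if not r or not n:
        return False
    prefix, lo, hi, net = r.group(1), int(r.group(2)), int(r.group(3)), r.group(4)
    return n.group(1) == prefix and n.group(3) == net and lo <= int(n.group(2)) <= hi


def nodemap_for(nid):
    """Return the first nodemap whose range contains nid, else 'default'."""
    for name, nid_range in RANGES.items():
        if nid_in_range(nid, nid_range):
            return name
    return "default"


print(nodemap_for("192.168.100.2@tcp"))
```

In this model, a NID outside both ranges falls back to the default nodemap, which the configuration above sets to admin=0, trusted=0 and deny_unknown=1.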

On client (192.168.100.2@tcp)

root@client:~# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 192.168.100.2@tcp
          status: up
          interfaces:
              0: enp5s0

root@client:~# mount -t lustre 192.168.200.2@tcp:/lustre /lustre
mount.lustre: mount 192.168.200.2@tcp:/lustre at /lustre failed: Permission denied

The syslog shows the following:

Mar  7 05:28:59 client-tenant100 kernel: [3934814.817595] LustreError: 226554:0:(llite_lib.c:711:client_common_fill_super()) lustre-clilmv-ffff9df1ccbbe000: md_getattr failed for root: rc = -13
Mar  7 05:28:59 client-tenant100 kernel: [3934814.851720] Lustre: Unmounted lustre-client
Mar  7 05:28:59 client-tenant100 kernel: [3934814.852581] LustreError: 226554:0:(super25.c:187:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -13

Without the patches, the mount worked as expected, even with the same nodemap policy applied.

root@client:~# mount -t lustre 192.168.200.2@tcp:/lustre /lustre

Comment by Sebastien Buisson [ 07/Mar/23 ]

Thanks Shuichi for the heads-up.

Could you please dump the whole nodemap configuration with this command, as I really need to know how it is set up?

# lctl get_param -R 'nodemap.*'

In particular, could you please check that the squashed UID and GID exist on both the client and the server, by running this command on each side?

# id <squashed uid and gid>

Moreover, how is the identity upcall configured?

# lctl get_param mdt.*-MDT*.identity_upcall

Patch https://review.whamcloud.com/50184 is the topmost one in a series of 3 patches. When you say "without patches", does it mean you are rebuilding without just #50184, or are you also removing #49907 and #49873? By the way, are you using the tip of the master branch? Can you please provide the reference to your current HEAD?
It is also important for me to understand how the problem you see happens. Are you reformatting after rebuilding, and so starting with a brand new file system, or was it formatted earlier and then upgraded?

Thanks,
Sebastien.

Comment by Shuichi Ihara [ 07/Mar/23 ]

This behavior change (not being able to mount when root is squashed) appeared after patch 50184 was applied.

With patch 49907, the output of lctl get_param -R 'nodemap.*' is attached: lctl-get_param-nodemap-49907.txt

[root@server ~]# lctl get_param mdt.*-MDT*.identity_upcall
mdt.lustre-MDT0000.identity_upcall=NONE
mdt.lustre-MDT0001.identity_upcall=NONE

root@client:~# id 99 
id: ‘99’: no such user

root@client:~# mount -t lustre 192.168.200.2@tcp:/lustre /lustre
root@client:~# ls /lustre
ls: cannot access '/lustre': Permission denied

The client was able to mount, but it cannot access the filesystem.

With patch 50184, the output of lctl get_param -R 'nodemap.*' is attached: lctl-get_param-nodemap-50184.txt

[root@server ~]# lctl get_param mdt.*-MDT*.identity_upcall
mdt.lustre-MDT0000.identity_upcall=NONE
mdt.lustre-MDT0001.identity_upcall=NONE

[root@server ~]# id 99
id: ‘99’: no such user

root@client:~# id 99
id: ‘99’: no such user

root@client:~# mount -t lustre 192.168.200.2@tcp:/lustre /lustre
mount.lustre: mount 192.168.200.2@tcp:/lustre at /lustre failed: Permission denied

The client was not able to mount, even with identity_upcall=NONE.

It is also important for me to understand how the problem you see happens. Are you reformatting after rebuilding, and so starting with a brand new file system, or was it formatted before, and then it is upgraded?

Each time a new build was installed, I reformatted all OSTs/MDTs and re-applied the nodemap settings below.

lctl nodemap_activate 0

lctl nodemap_del trusted
lctl nodemap_del tenant100

lctl nodemap_modify --name default --property trusted --value 0
lctl nodemap_modify --name default --property admin --value 0
lctl nodemap_modify --name default --property deny_unknown --value 1

lctl nodemap_add trusted
lctl nodemap_add_range --name trusted --range 192.168.200.[1-254]@tcp
lctl nodemap_modify --name trusted --property trusted --value 1
lctl nodemap_modify --name trusted --property admin --value 1
lctl nodemap_modify --name trusted --property deny_unknown --value 0

lctl nodemap_add tenant100
lctl nodemap_add_range --name tenant100 --range 192.168.100.[1-254]@tcp
lctl nodemap_modify --name tenant100 --property admin --value 0

lctl nodemap_activate 1

Comment by Gerrit Updater [ 08/Mar/23 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50230
Subject: LU-16524 nodemap: filter out unknown records
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: cc28404fc05d36d7bdb55c032fd84adb84799511

Comment by Sebastien Buisson [ 08/Mar/23 ]

I have been testing this further, and found an interesting behavior. Without any LU-16524 patches applied, it is not possible for a squashed root to mount the client if the squash uid or gid does not exist on the server side, when SELinux is enabled on the client (either Permissive or Enforcing). This is because the client will issue a getxattr request for security.selinux, and on the server side user credentials are checked for a getxattr.

With LU-16524 patches applied (especially #50184), even when SELinux is not enabled on the client, it is not possible for a squashed root to mount the client if the squash uid or gid does not exist on the server side. This is because patch #50184 adds a user credentials check on getattr.

So I agree patch #50184 introduces a behavior change (only when SELinux is not enabled on the client), but I do not find it unreasonable to perform a user credentials check on getattr. Moreover, it is questionable to mount a client as a misconfigured root (a squashed root whose squashed uid or gid does not exist on the server side), given that no further operation is possible on the file system. I would tend to think this is a nodemap configuration error and it should not be supported.
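
The cases described in this comment can be summarised as a small decision function. This is only a sketch of the behavior as stated above, not Lustre code: the function name and its three boolean parameters are hypothetical, and it assumes that a squashed root whose squash uid and gid both exist on the server can always mount.

```python
def squashed_root_can_mount(selinux_enabled, patch_50184_applied, squash_id_on_server):
    """Model of whether a squashed root can mount, per the comment above."""
    # Assumption: with a valid squash uid/gid, every credentials check passes.
    if squash_id_on_server:
        return True
    # With SELinux enabled, the client issues a getxattr for security.selinux,
    # and the server checks user credentials for getxattr.
    if selinux_enabled:
        return False
    # Patch #50184 adds a user credentials check on getattr, which the mount
    # performs regardless of SELinux.
    if patch_50184_applied:
        return False
    return True


for selinux in (False, True):
    for patched in (False, True):
        ok = squashed_root_can_mount(selinux, patched, squash_id_on_server=False)
        print(f"selinux={selinux} patch_50184={patched} -> mount ok: {ok}")
```

The only combination that changes with the patch is SELinux disabled with a missing squash id, which matches the mount failure reported earlier in this ticket.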

Comment by Gerrit Updater [ 21/Mar/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49873/
Subject: LU-16524 nodemap: add rbac property to nodemap
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5e48ffca322c3c72d3b83b0719f245fc6f13c8e4

Comment by Gerrit Updater [ 21/Mar/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49907/
Subject: LU-16524 sec: enforce rbac roles
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 971e025f5fb77f4eaaa1e9070598dfa6292a9678

Comment by Gerrit Updater [ 21/Mar/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50184/
Subject: LU-16524 sec: add fscrypt_admin rbac role
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 22bef9b6c64ef394a2efb41ce1388be71300af0d

Comment by Gerrit Updater [ 11/Apr/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50230/
Subject: LU-16524 nodemap: filter out unknown records
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 170ddaa96bdddea32a7c48ddacefc53d961cc783

Comment by Andreas Dilger [ 24/Apr/23 ]

Sebastien, does it make sense for subdirectory mounts that the squashed root on the client is still able to chown/chmod/chgrp for files in the subdirectory tree IF they are for UID/GID/PROJID mapped in the nodemap (excluding root itself)? Similarly, it should only be possible for the squashed root to adjust quotas for UID/GID/PROJIDs that are mapped by the nodemap.

That would allow the root user on the client to do normal admin tasks for files in that project/container, without needing to grant them "real" root access to the filesystem itself (i.e. admin=1).

I think doing the mapped UID/GID/PROJID lookup is fast (this is already done for every RPC) so the only extra check would be whether it was the squashed root user on the client. Maybe an extra "squashed_admin=1" setting or similar?

Comment by Andreas Dilger [ 28/Apr/23 ]

Is there anything left to do on this ticket, or should it be marked Resolved/Fixed? My last comment/question about restricting RBAC admin operations to IDs within the nodemap could be addressed in a separate ticket, though it would be nice to also get this into the 2.16 release so that the behavior is consistent.

Comment by Sebastien Buisson [ 28/Apr/23 ]

Hi Andreas,

If I understand your comment in https://jira.whamcloud.com/browse/LU-16524?focusedCommentId=370402&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-370402 correctly, that would consist of extending the capabilities of a squashed (root) user so that it can still modify file permissions, owners and quotas, if they correspond to a mapped uid/gid/projid.

That seems to be quite the opposite of what we implemented with this ticket. The rbac roles are designed to limit the powers of a non-squashed root, for instance by preventing modifications of file permissions and owners (file_perms role), or quota modifications (quota_ops role).

I am not saying that extending the capabilities of a squashed (root) user would not be an interesting feature to have. But I think it is a different approach that should be tackled under a different ticket.
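
To make the restriction model concrete, here is a minimal Python sketch of role-based gating as described in this comment. The role names file_perms and quota_ops come from this ticket; the operation names, the ROLE_FOR_OPERATION table and the root_may helper are assumptions made for illustration and do not reflect the actual server-side checks.

```python
# Illustrative mapping from a privileged operation to the rbac role gating it.
# Role names are taken from the ticket; operation names are assumed.
ROLE_FOR_OPERATION = {
    "chown": "file_perms",
    "chgrp": "file_perms",
    "chmod": "file_perms",
    "setquota": "quota_ops",
}


def root_may(operation, rbac_roles):
    """Return True if a non-squashed root may perform the operation."""
    role = ROLE_FOR_OPERATION.get(operation)
    if role is None:
        # Operations not covered by any role stay allowed in this model.
        return True
    return role in rbac_roles


tenant_roles = {"file_perms"}
print(root_may("chown", tenant_roles))     # gated by file_perms
print(root_may("setquota", tenant_roles))  # gated by quota_ops
```

In this model, root on a nodemap holding only file_perms can change owners but cannot modify quotas, and any operation that no role covers remains permitted.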

Comment by Andreas Dilger [ 28/Apr/23 ]

You are right, this would be somewhat the opposite approach, with the benefit that it would "grant" select privileges to the tenant admin, starting from "nothing" that the regular user has, so would be more "fail safe". The current approach will take away privileges from a root user, but risks that something was missed, or is added in the future that does not add RBAC roles/checks and cannot be squashed/removed.

That said, it definitely belongs in a different ticket so that this one can be marked resolved.

Comment by Peter Jones [ 20/May/23 ]

Seems like this body of work has merged for 2.16

Generated at Sat Feb 10 03:27:45 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.