[LU-1778] Root Squash is not always properly enforced Created: 22/Aug/12 Updated: 28/Feb/23 Resolved: 09/May/14 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.1, Lustre 2.1.2 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alexandre Louvet | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 8532 |
| Description |
|
On a node with root_squash activated, if root tries to access the attributes of a file (fstat) which has not been previously accessed, the operation returns EPERM. As root: then, as user 'slurm': now, come back as user root and replay the 'ls' command: At this point, if you try to look into the file as root, you get EPERM. But if the file is opened by the user ('tail -f afile' for example), root gets access to the content of the file as well. As soon as the file is closed by the user, root loses access to the content (at least it can't open the file any more). Alex. |
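A minimal sketch of the sequence described above, assuming a filesystem named scratch1 mounted at /mnt/scratch1, a test user 'slurm', and root squashed to 65535:65535 (the paths, ids and file names are illustrative; the original transcripts were not preserved in this export):

[root@mds ~]# lctl conf_param scratch1.mdt.root_squash="65535:65535"
# on the client, as root, with a cold cache: attribute access is refused
[root@client ~]# echo 3 > /proc/sys/vm/drop_caches
[root@client ~]# stat /mnt/scratch1/test/afile       # permission denied
# as the owner, populate the client cache
[slurm@client ~]$ stat /mnt/scratch1/test/afile      # succeeds
# back as root on the same client: attributes are now served from the cache
[root@client ~]# ls -la /mnt/scratch1/test           # works despite root_squash
# while the owner keeps the file open ...
[slurm@client ~]$ tail -f /mnt/scratch1/test/afile &
# ... root can read its content too; once the owner closes it, access is lost again
[root@client ~]# cat /mnt/scratch1/test/afile        # succeeds while the file is held open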
| Comments |
| Comment by Peter Jones [ 23/Aug/12 ] |
|
Bob, could you please look into this one? Thanks. Peter |
| Comment by Diego Moreno (Inactive) [ 26/Sep/12 ] |
|
Hi, any news on this ticket? Do you need some more information? |
| Comment by Bob Glossman (Inactive) [ 01/Oct/12 ] |
|
I haven't been able to reproduce this failure in the current b2_1: [root@centos53 ~]# mount -t lustre centos53:/lustre /mnt/lustre [root@centos53 ~]# lctl get_param mdt/*/root_squash [root@centos53 ~]# cd /mnt/lustre Am I doing something incorrect in my reproduction attempt? Is there some other precondition needed to make this happen? |
| Comment by Alexandre Louvet [ 02/Oct/12 ] |
|
This is worse than my case ... 1/ as root has been remapped to something different from 0:0, I would expect that you would not be able to enter the bogl directory. That said, I did a new test on a vanilla 2.1.3 (i.e. the rpm downloaded from whamcloud, without recompilation) on top of an up-to-date CentOS 6.x to confirm that it fails with the latest available version. [root@server ~]# lctl get_param mdt/*/root_squash => set root_squash to an id which doesn't match my user id [root@server ~]# lctl conf_param scratch1.mdt.root_squash="65535:65535" On the client, running as a simple user now log as root on the client [root@client scratch1]# pwd => There is already something funny at this point. As root was mapped to 65535:65535, I expect not to be able to enter this directory (700) [it was also shown in your test]. Flushing the cache on the client (i.e. echo 3 > /proc/sys/vm/drop_caches) makes for a different situation. Root can enter the 'test' directory, but can't stat files: [root@client test]# ls -la I imagine this is due to the fact that the uid:gid translation is 'only' made on the MDT side and not on the client side, letting root access attributes from the client-side cache without problem. Am I right? Whatever the case, return as the test user and stat 'afile' again, switch back as root and run 'ls' once again: root again gets access to the attributes: [root@client test]# ls -la At this point root can't access the content of 'afile' unless an authorized user runs 'tail -f afile' and keeps it running [root@client test]# cat afile |
| Comment by Bob Glossman (Inactive) [ 02/Oct/12 ] |
|
In my case id 500 == bogl. With root squash set to 500 (bogl) root should be able to see into bogl owned dir and file, and it does. I will retry with setting root squash to some other id. |
| Comment by Bob Glossman (Inactive) [ 02/Oct/12 ] |
|
Have set root_squash to 65535:65535, shown by: [root@centos54 bogl]# lctl set_param mdt/*/root_squash=65535:65535 On the client, accessing as bogl, the tree looks like: [bogl@centos53 lustre-release]$ ll -R /mnt/lustre /mnt/lustre/bogl: Note that permissions on the dir and file are set for bogl only (id==500). Accessing as root, I consistently see ls and file content access denied for the dir and file: [root@centos53 ~]# ll -R /mnt/lustre I do see access being allowed for cd into the bogl-owned dir. A stat of the file is initially refused: [root@centos53 bogl]# cd /mnt/lustre/bogl Then after doing a stat of the file as bogl: [bogl@centos53 lustre-release]$ stat /mnt/lustre/bogl/file A later stat of the file as root is allowed: [root@centos53 bogl]# stat file I see no case where access to the file content as root is allowed: [root@centos53 bogl]# cat file This behavior looks consistent in all versions of 2.X right up to master. |
| Comment by Bob Glossman (Inactive) [ 02/Oct/12 ] |
|
Correction: on another retry I do see incorrect access to file content being allowed. If I do a stat and then a persistent access as bogl: [bogl@centos53 lustre-release]$ stat /mnt/lustre/bogl/file After that, a stat and access as root is allowed, at least for a while: [root@centos53 bogl]# stat file It seems to require both a (permitted) stat and file access as bogl before the access that should be forbidden as root gets allowed. |
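A short sketch of that retry, under the same assumptions as the earlier listings (bogl-owned /mnt/lustre/bogl/file, root squashed to 65535:65535); the exact commands and output were not preserved here, so this is a reconstruction rather than the original transcript:

# terminal 1, as bogl: stat the file, then hold it open
[bogl@centos53 ~]$ stat /mnt/lustre/bogl/file
[bogl@centos53 ~]$ tail -f /mnt/lustre/bogl/file
# terminal 2, as (squashed) root on the same client
[root@centos53 bogl]# stat file     # now allowed, answered from the client cache
[root@centos53 bogl]# cat file      # also allowed while bogl keeps the file open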
| Comment by Peter Jones [ 30/Oct/12 ] |
|
Niu is going to look into this one |
| Comment by Niu Yawei (Inactive) [ 30/Oct/12 ] |
|
Hi, Alex. As you mentioned, root_squash is just a server-side id remapping (like NFS root_squash); it doesn't affect the client cache, so this looks like expected behaviour to me. You need to make sure the cache is cleared before you expect root_squash to be enforced. (I think it's the same for NFS, isn't it?) Thanks. |
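For instance, dropping the client caches (as was done earlier in this ticket) forces the next access to go back to the MDT, where the squashing is applied; the mount point below is illustrative:

# flush the client's dentry/inode and page caches
[root@client ~]# echo 3 > /proc/sys/vm/drop_caches
# the next access as root triggers an MDT request again, so root_squash is enforced
[root@client ~]# stat /mnt/scratch1/test/afile      # refused once more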
| Comment by Alexandre Louvet [ 16/Nov/12 ] |
|
Hi Niu, I made some tests with NFS and it works as expected: root (under root_squash) never gets access to user data if rights for 'others' are not set. It does not depend on the activity of an authorized user on the same client. |
| Comment by Niu Yawei (Inactive) [ 28/Nov/12 ] |
I think the NFS client does not know whether the server is squashing root either, but there are several reasons I can think of which could make NFS root_squash much less affected by the client cache:
Lustre is a strong-cache-consistency filesystem; we can't afford the extra ACCESS RPC or cache revalidation that NFS does. Maybe we can make the client aware of the root_squash setting on the server and let the user configure whether they always want the access check done on the server side (sacrificing performance), but I'm not sure we have enough resources to implement that at the moment. Anyway, I think we should state the root_squash caching problem clearly in the manual. Alex, what do you think? Is this feature (enforcing root_squash regardless of caching) very important for you, or is improving the manual enough? Thanks. |
| Comment by Sebastien Buisson (Inactive) [ 29/Nov/12 ] |
|
Hi Niu, We do not ask that setting or unsetting What do you think? Sebastien. |
| Comment by Niu Yawei (Inactive) [ 29/Nov/12 ] |
|
Hi, Sebastien. Yes, I agree with you on this. Adding a permission checking hook for llite (and checking the squash setting there) and making llite aware of the root_squash setting could save the RPCs to the server. In my opinion, this is a feature enhancement rather than a bug. I'm glad to implement this when time is available, and if you want to propose a patch for this, I'm glad to help with the review. Thank you. |
| Comment by Sebastien Buisson (Inactive) [ 05/Dec/12 ] |
|
Hi, I would like to propose a patch to address this issue, so I carried out some tests to try to understand which functions are involved in getting file permissions and granting or denying file access. Here is what I did.
In the end, I could not figure out who is in charge of checking file permissions. TIA, |
| Comment by Niu Yawei (Inactive) [ 05/Dec/12 ] |
|
Hi, When there is no cache on the client, permission checking is done on the server side during the open RPC (the first two cases); when there is cache on the client (no open RPC is needed), the permission check is done on the client by the permission checking hook (ll_inode_permission, which is invoked by the kernel, see may_open()). |
| Comment by Sebastien Buisson (Inactive) [ 06/Dec/12 ] |
|
So, is it OK if I propose to modify ll_inode_permission() to add a check for some kind of root_squash parameter that would be fetched by the client at mount time and stored somewhere? |
| Comment by Niu Yawei (Inactive) [ 06/Dec/12 ] |
|
Hi, Sebastien, I think it's doable. The current root_squash option is stored in the MDT config log (because it's an MDS-only option); we could probably populate this option into the client config log as well, so the client can be notified whenever the option is changed. Thanks. |
| Comment by Sebastien Buisson (Inactive) [ 18/Dec/12 ] |
|
Hi, |
| Comment by Niu Yawei (Inactive) [ 18/Dec/12 ] |
|
Hi, Sebastien. Please look at mgs_write_log_param(): root_squash is now a PARAM_MDT param, which is stored in $FSNAME-mdt0001. You might want it to be stored in the client log as well ($FSNAME-client); I think a simple way is to have the administrator run two configure commands (see the sketch below). The other options related to root_squash should be treated carefully as well, such as nosquash_nids. Thanks. |
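The two commands themselves did not survive this export; a plausible form, assuming the parameter would simply be declared for both the MDT and the client sections of the configuration (the llite variant is hypothetical and would need to match whatever mgs_write_log_param() is taught to accept):

# recorded in the MDT config log ($FSNAME-mdt0001), as today
[root@mgs ~]# lctl conf_param scratch1.mdt.root_squash="65535:65535"
# hypothetical second command, recorded in the client config log ($FSNAME-client)
[root@mgs ~]# lctl conf_param scratch1.llite.root_squash="65535:65535"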
| Comment by Gregoire Pichon [ 30/Jan/13 ] |
|
I have posted a patch for b2_1 on gerrit: http://review.whamcloud.com/5212 |
| Comment by Gregoire Pichon [ 31/Jan/13 ] |
|
For information, here is the note from Andreas Dilger in the gerrit.
|
| Comment by Alexandre Louvet [ 31/Jan/13 ] |
|
This is also true for NFS, but that is not the problem. Lustre claims to support root_squash (at least there is a chapter in the documentation about this feature) and customers expect this functionality to prevent root from accessing files to which the root user does not have access. I agree that root can modify its credentials and access the file, but that is another story. The only real interest of this feature is to prevent root from making stupid mistakes that would damage the content of the filesystem, but the behaviour of the feature should be consistent over time and not change with the client state. Currently the root_squash behaviour is confusing and the request is simply to make it clean. |
| Comment by Gregoire Pichon [ 07/Feb/13 ] |
|
Excerpt from Andreas comment in the gerrit
Andreas, The current implementation of the root squash feature in Lustre is not working as expected by the customer, nor as specified in the "Using Root Squash" section of the Lustre Operations Manual. What do you propose to make progress on this issue? If you think this feature is senseless, then why not reduce its scope to security configurations only (MDT sec-level), or even remove the feature completely? My feeling is that we should be able to make it work properly. We could perform the root squashing on the client by overwriting the fsuid and fsgid of the task with the root_squash uid:gid specified on the MDS. These settings could be transmitted to the client either at mount time or each time file attributes are retrieved from the MDS (LDLM_INTENT_OPEN or LDLM_INTENT_GETATTR RPCs for instance). The patch I proposed last week is not suitable. Ok, let's find a better solution. |
| Comment by Andreas Dilger [ 08/Feb/13 ] |
|
Actually, the description in the user manual correctly describes how the code functions:
Like I wrote in the Gerrit comment, there is nothing root squash can do to prevent access to files when someone has root access on the client. In that case, the root user could "su - other_user" and immediately circumvent all of the checking that was added to squash the root user's access. The root squash feature is only there to prevent "root" on clients from being able to access and/or modify files owned by root on the filesystem. The same "su - other_user" hole is present for NFS, and the fact that "root" is denied direct access on NFS is like a sheet of paper protecting a bank vault. The OpenSFS UID/GID mapping and shared-key authentication features being developed by IU could allow for much more robust protection in the future. This would allow mapping users from specific nodes to one set of UIDs that don't overlap with UIDs from other nodes, and with the shared-key node authentication it would be impossible for even root to access files for UIDs that are not mapped to that cluster. If you are interested in following this design and development, please email me and I will provide meeting and list details. |
| Comment by Alexandre Louvet [ 09/Feb/13 ] |
|
Andreas, I think we are moving away from this ticket's objective. I do agree with all the points about the security limitations of the root_squash feature, but that is not the problem here. The problem is that the manual says access is only granted to the root user for objects to which it is allowed, and this is not always true. In the case where root tries to get read access to an object whose inode is already in the client cache, root_squash is not applied. The client code has no knowledge of root_squash and only applies traditional security checking. The result is that root gets access granted or denied depending on the cache content, which is very confusing for users. This is the only reason for this JIRA ticket. |
| Comment by Gregoire Pichon [ 13/Mar/13 ] |
|
I have posted a patch for master on gerrit: http://review.whamcloud.com/#change,5700 |
| Comment by Gregoire Pichon [ 25/Jul/13 ] |
|
Tests on patchsets 7 and 8 have made the client hang after conf-sanity test_43 (the one for root squash). I have been able to reproduce the hang (after 16 successful runs) and took a dump. It is available on ftp.whamcloud.com under /uploads/ Here is the information I have extracted from the dump.
The unmount command seems hung. The higher part of the stack is due to the dump signal.
crash> bt 2723
PID: 2723 TASK: ffff88046ab98040 CPU: 4 COMMAND: "umount"
#0 [ffff880028307e90] crash_nmi_callback at ffffffff8102d2c6
#1 [ffff880028307ea0] notifier_call_chain at ffffffff815131d5
#2 [ffff880028307ee0] atomic_notifier_call_chain at ffffffff8151323a
#3 [ffff880028307ef0] notify_die at ffffffff8109cbfe
#4 [ffff880028307f20] do_nmi at ffffffff81510e9b
#5 [ffff880028307f50] nmi at ffffffff81510760
[exception RIP: page_fault]
RIP: ffffffff815104b0 RSP: ffff880472c13bc0 RFLAGS: 00000082
RAX: ffffc9001dd57008 RBX: ffff880470b27e40 RCX: 000000000000000f
RDX: ffffc9001dd1d000 RSI: ffff880472c13c08 RDI: ffff880470b27e40
RBP: ffff880472c13c48 R8: 0000000000000000 R9: 00000000fffffffe
R10: 0000000000000001 R11: 5a5a5a5a5a5a5a5a R12: ffff880472c13c08 = struct cl_site *
R13: 00000000000000c4 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#6 [ffff880472c13bc0] page_fault at ffffffff815104b0
#7 [ffff880472c13bc8] cfs_hash_putref at ffffffffa04305c1 [libcfs]
#8 [ffff880472c13c50] lu_site_fini at ffffffffa0588841 [obdclass]
#9 [ffff880472c13c70] cl_site_fini at ffffffffa0591d0e [obdclass]
#10 [ffff880472c13c80] ccc_device_free at ffffffffa0e6c16a [lustre]
#11 [ffff880472c13cb0] lu_stack_fini at ffffffffa058b22e [obdclass]
#12 [ffff880472c13cf0] cl_stack_fini at ffffffffa059132e [obdclass]
#13 [ffff880472c13d00] cl_sb_fini at ffffffffa0e703bd [lustre]
#14 [ffff880472c13d40] client_common_put_super at ffffffffa0e353d4 [lustre]
#15 [ffff880472c13d70] ll_put_super at ffffffffa0e35ef9 [lustre]
#16 [ffff880472c13e30] generic_shutdown_super at ffffffff8118326b
#17 [ffff880472c13e50] kill_anon_super at ffffffff81183356
#18 [ffff880472c13e70] lustre_kill_super at ffffffffa057d37a [obdclass]
#19 [ffff880472c13e90] deactivate_super at ffffffff81183af7
#20 [ffff880472c13eb0] mntput_no_expire at ffffffff811a1b6f
#21 [ffff880472c13ee0] sys_umount at ffffffff811a25db
#22 [ffff880472c13f80] system_call_fastpath at ffffffff8100b072
RIP: 00007f0e6a971717 RSP: 00007fff17919878 RFLAGS: 00010206
RAX: 00000000000000a6 RBX: ffffffff8100b072 RCX: 0000000000000010
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f0e6c3cfb90
RBP: 00007f0e6c3cfb70 R8: 00007f0e6c3cfbb0 R9: 0000000000000000
R10: 00007fff179196a0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 00007f0e6c3cfbf0
ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b
ccc_device_free() is called on lu_device 0xffff880475ad06c0
crash> struct lu_device ffff880475ad06c0
struct lu_device {
ld_ref = {
counter = 1
},
ld_type = 0xffffffffa0ea22e0,
ld_ops = 0xffffffffa0e787a0,
ld_site = 0xffff880472cf05c0,
ld_proc_entry = 0x0,
ld_obd = 0x0,
ld_reference = {<No data fields>},
ld_linkage = {
next = 0xffff880472cf05f0,
prev = 0xffff880472cf05f0
}
}
ld_type->ldt_tags
crash> rd -8 ffffffffa0ea22e0
ffffffffa0ea22e0: 04 = LU_DEVICE_CL
ld_type->ldt_name
crash> rd ffffffffa0ea22e8
ffffffffa0ea22e8: ffffffffa0e7d09f = "vvp"
lu_site=ffff880472cf05c0
crash> struct lu_site ffff880472cf05c0
struct lu_site {
ls_obj_hash = 0xffff880470b27e40,
ls_purge_start = 0,
ls_top_dev = 0xffff880475ad06c0,
ls_bottom_dev = 0x0,
ls_linkage = {
next = 0xffff880472cf05e0,
prev = 0xffff880472cf05e0
},
ls_ld_linkage = {
next = 0xffff880475ad06f0,
prev = 0xffff880475ad06f0
},
ls_ld_lock = {
raw_lock = {
slock = 65537
}
},
ls_stats = 0xffff880470b279c0,
ld_seq_site = 0x0
}
crash> struct cfs_hash 0xffff880470b27e40
struct cfs_hash {
hs_lock = {
rw = {
raw_lock = {
lock = 0
}
},
spin = {
raw_lock = {
slock = 0
}
}
},
hs_ops = 0xffffffffa05edee0,
hs_lops = 0xffffffffa044e320,
hs_hops = 0xffffffffa044e400,
hs_buckets = 0xffff880471e4f000,
hs_count = {
counter = 0
},
hs_flags = 6184, = 0x1828 = CFS_HASH_SPIN_BKTLOCK | CFS_HASH_NO_ITEMREF | CFS_HASH_ASSERT_EMPTY | CFS_HASH_DEPTH
hs_extra_bytes = 48,
hs_iterating = 0 '\000',
hs_exiting = 1 '\001',
hs_cur_bits = 23 '\027',
hs_min_bits = 23 '\027',
hs_max_bits = 23 '\027',
hs_rehash_bits = 0 '\000',
hs_bkt_bits = 15 '\017',
hs_min_theta = 0,
hs_max_theta = 0,
hs_rehash_count = 0,
hs_iterators = 0,
hs_rehash_wi = {
wi_list = {
next = 0xffff880470b27e88,
prev = 0xffff880470b27e88
},
wi_action = 0xffffffffa04310f0 <cfs_hash_rehash_worker>,
wi_data = 0xffff880470b27e40,
wi_running = 0,
wi_scheduled = 0
},
hs_refcount = {
counter = 0
},
hs_rehash_buckets = 0x0,
hs_name = 0xffff880470b27ec0 "lu_site_vvp"
}
I am going to attach the log of the Maloo test that hung (Jul 19 10:12 PM). |
| Comment by Gregoire Pichon [ 25/Jul/13 ] |
|
client log from Maloo test on patchset 7 (Jul 19 10:12 PM) |
| Comment by Gregoire Pichon [ 04/Dec/13 ] |
|
I have posted another patch that adds a service to print a nidlist: http://review.whamcloud.com/#/c/8479/ . After the review of patchset 11 of patch #5700, it appears to be a requirement. |
| Comment by Cliff White (Inactive) [ 13/Dec/13 ] |
|
Thank you. Would it be possible for you to rebase this on current master? There are a few conflicts preventing merge. |
| Comment by Gregoire Pichon [ 11/Feb/14 ] |
|
Patch #8479 has been landed and then reverted due to a conflict with the GNIIPLND patch. I have posted a new version of the patch: http://review.whamcloud.com/9221 |
| Comment by Jodi Levi (Inactive) [ 22/Apr/14 ] |
|
Patch landed to Master |
| Comment by Gregoire Pichon [ 23/Apr/14 ] |
|
This ticket has not been fixed yet. |
| Comment by Peter Jones [ 09/May/14 ] |
|
Now really landed for 2.6. |
| Comment by Gregoire Pichon [ 18/Jun/14 ] |
|
I have backported the two patches to be integrated in 2.5 maintenance release. |
| Comment by Gregoire Pichon [ 01/Sep/14 ] |
|
The two patches above, #10743 and #10744, have been posted and have been ready for review since the end of June. |
| Comment by Gerrit Updater [ 01/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/10743/ |
| Comment by Gerrit Updater [ 01/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/10744/ |
| Comment by Gregoire Pichon [ 24/Aug/16 ] |
|
Closing, as the issue has been fixed (several months ago) in master and in the 2.5 maintenance release. |