[LU-10444] l_getidentity keeps remount /sys/kernel/debug and reverting permissions. Created: 29/Dec/17  Updated: 09/Feb/18  Resolved: 14/Jan/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.1
Fix Version/s: Lustre 2.11.0, Lustre 2.10.4

Type: Bug Priority: Major
Reporter: Mahmoud Hanafi Assignee: Oleg Drokin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We change the permissions of /sys/kernel/debug to 755. But it kept revering to 700.

Using systemtap script

#!/usr/bin/env stap

probe kernel.function("debug_mount") {
 printf("mounting ")
 printf("pid %i %s %s %s \n", pid(),execname(), cmdline_str(), caller())
 print_backtrace()
}

probe kernel.function("debugfs_remount") {
 printf("remounting ")
 printf("pid %i %s %s %s \n", pid(),execname(), cmdline_str(), caller())
 print_backtrace()
}


I tracked this down to l_getidentity. Specificly cfs_get_param_paths keeps remounting /sys/kernel/debug.

# mount -o remount,mode=755 /sys/kernel/debug
debugfs on /sys/kernel/debug type debugfs (rw,relatime,mode=755)

Here is the output from the systemtap

nbp1-mds ~ # stap -v debug_mount.stp 
Pass 1: parsed user script and 468 library scripts using 122940virt/39548res/3192shr/36448data kb, in 340usr/20sys/351real ms.
Pass 2: analyzed script: 2 probes, 22 functions, 5 embeds, 0 globals using 161632virt/79356res/4388shr/75140data kb, in 580usr/80sys/665real ms.
Pass 3: translated to C into "/tmp/stapYTzkIl/stap_6d34b35e04d11e38ae0c8b364ac253de_10487_src.c" using 162160virt/80148res/4692shr/75668data kb, in 340usr/10sys/353real ms.
Pass 4: compiled C into "stap_6d34b35e04d11e38ae0c8b364ac253de_10487.ko" in 5660usr/860sys/5857real ms.
Pass 5: starting run.
mounting pid 9739 l_getidentity "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" mount_fs 0xffffffff811fec69 
 0xffffffff81289120 : debug_mount+0x0/0x20 [kernel]
 0xffffffff811fec69 : mount_fs+0x39/0x1b0 [kernel]
 0xffffffff8121b5c7 : vfs_kern_mount+0x67/0x110 [kernel]
 0xffffffff8121dad3 : do_mount+0x233/0xaf0 [kernel]
 0xffffffff8121e716 : SyS_mount+0x96/0xf0 [kernel]
 0xffffffff81695b89 : system_call_fastpath+0x16/0x1b [kernel]
remounting pid 9739 l_getidentity "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" do_remount_sb 0xffffffff811fe7c2 
 0xffffffff81289330 : debugfs_remount+0x0/0x50 [kernel]
 0xffffffff811fe7c2 : do_remount_sb+0x72/0x200 [kernel]
 0xffffffff811feb07 : mount_single+0x57/0xc0 [kernel]
 0xffffffff81289138 : debug_mount+0x18/0x20 [kernel]
 0xffffffff811fec69 : mount_fs+0x39/0x1b0 [kernel]
 0xffffffff8121b5c7 : vfs_kern_mount+0x67/0x110 [kernel]
 0xffffffff8121dad3 : do_mount+0x233/0xaf0 [kernel]
 0xffffffff8121e716 : SyS_mount+0x96/0xf0 [kernel]
 0xffffffff81695b89 : system_call_fastpath+0x16/0x1b [kernel]
mounting pid 9779 l_getidentity "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" mount_fs 0xffffffff811fec69 
 0xffffffff81289120 : debug_mount+0x0/0x20 [kernel]
 0xffffffff811fec69 : mount_fs+0x39/0x1b0 [kernel]
 0xffffffff8121b5c7 : vfs_kern_mount+0x67/0x110 [kernel]
 0xffffffff8121dad3 : do_mount+0x233/0xaf0 [kernel]
 0xffffffff8121e716 : SyS_mount+0x96/0xf0 [kernel]
 0xffffffff81695b89 : system_call_fastpath+0x16/0x1b [kernel]
remounting pid 9779 l_getidentity "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" do_remount_sb 0xffffffff811fe7c2 
 0xffffffff81289330 : debugfs_remount+0x0/0x50 [kernel]
 0xffffffff811fe7c2 : do_remount_sb+0x72/0x200 [kernel]
 0xffffffff811feb07 : mount_single+0x57/0xc0 [kernel]
 0xffffffff81289138 : debug_mount+0x18/0x20 [kernel]
 0xffffffff811fec69 : mount_fs+0x39/0x1b0 [kernel]
 0xffffffff8121b5c7 : vfs_kern_mount+0x67/0x110 [kernel]
 0xffffffff8121dad3 : do_mount+0x233/0xaf0 [kernel]
 0xffffffff8121e716 : SyS_mount+0x96/0xf0 [kernel]
 0xffffffff81695b89 : system_call_fastpath+0x16/0x1b [kernel]


 Comments   
Comment by Oleg Drokin [ 29/Dec/17 ]

hm.. Yes, this is a bug in this patch https://review.whamcloud.com/25182.- we really should be checking if the fs is mounted before trying to mount it as the first step.

Let me see if I can make a simple patch.

Comment by Gerrit Updater [ 30/Dec/17 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: https://review.whamcloud.com/30675
Subject: LU-10444 utils: Don't remount debugfs every time
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 64e656deebdc977593c6d43e7a5ee50c045803dc

Comment by James A Simmons [ 02/Jan/18 ]

That is really strange. You would think it would return -EBUSY in that case instead of remounting.

Comment by Mahmoud Hanafi [ 02/Jan/18 ]

The only reason we needed to change  permissions on /sys/kernel/debug was, user level access need for  '/sys/kernel/debug/lustre/devices.' Why was this file moved from /proc/sys/fs/lustre/devices to /sys/kernel/debug/lustre/devices. Is this the correct place for this file? We think relocating this file should be considered.

-Mahmoud

 

Comment by Andreas Dilger [ 03/Jan/18 ]

James, is it possible to create the devices file with 755 permissions from the start?

Comment by James A Simmons [ 03/Jan/18 ]

Andreas yes you can change the permissions with the mount() function by using the last parameter and passing in the mode value. The question is this a good thing. The problem is so much devices but the debugfs mount point itself. Its mounted for root access only by default. Their is a reason debugfs is not accessible to non-root users. I do understand what the problem is now. I still think Oleg's approach is better.

Mahmoud the file was moved due to a requirement from the linux kernel maintainers. All special files in /proc are in the process of being  moved into /sysfs in the linux kernel. In the case of "devices" it breaks the one item per file rule for sysfs so it has to be placed into debugfs. Now what you are doing is not the best idea for security reasons but I see why you are doing it. The question is do non root users really/ need to access "devices" or the other case stat files in debugfs? Other than that all the other files look administrative.

Mahmoud are you using lctl device_list or directly accessing /proc. The reason I ask is that lctl device_list will attempt to access the debugfs file first and if it fails call an ioctl. If lctl device_list doesn't work for you then its really broken.

Mahmoud are you using lctl device_list or directly accessing /proc. The reason I ask is that lctl device_list will attempt to access the 

Comment by Mahmoud Hanafi [ 03/Jan/18 ]

We have nagios monitoring scripts that read the file directly. The script ran as a regular usr. I guess we could scan /proc/fs/lustre to get the same info.

Comment by James A Simmons [ 03/Jan/18 ]

Can you tell me if lctl device_list works for you? You shouldn't be reading procfs/sysfs/debugfs files directly

Comment by Oleg Drokin [ 03/Jan/18 ]

/proc/fs/lustre is going away, so don't depend on it being there for a long time.

The content is moving to /sys/fs/lustre for simple one file per value cases, everything else is removed or moved to debugfs.

Comment by Andreas Dilger [ 03/Jan/18 ]

Janes, I was thinking that it would be possible to change the permission of only the devices file at create time, like is possible for files in /proc and /sys. If the permission is for the whole filesystem then I agree it might be more problematic.

That said, read access shouldn't be a huge problem. What files are currently in debugfs?

Comment by Bob Glossman (Inactive) [ 03/Jan/18 ]

default permissions of root of mounted debugfs is hard coded to be DEBUGFS_DEFAULT_MODE (0700),

Comment by Oleg Drokin [ 03/Jan/18 ]

It's most definitely possible to change default permission of files in debugfs, but /sys/kernel/debug itself is 700 and we cannot really do anything about it from Lustre

Comment by Mahmoud Hanafi [ 03/Jan/18 ]

you can change the permissions via systemd.

/etc/systemd/system/sys-kernel-debug.mount.

# $id:$
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.

[Unit]
Description=Debug File System
Documentation=https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt
Documentation=http://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
DefaultDependencies=no
ConditionPathExists=/sys/kernel/debug
ConditionCapability=CAP_SYS_RAWIO
Before=sysinit.target

[Mount]
What=debugfs
Where=/sys/kernel/debug
Type=debugfs
Options=mode=755

From security point of view this could be an issue.

Comment by James A Simmons [ 03/Jan/18 ]

Well we could in theory remount it with 644 permission using the mount() function which will give us what we have today with proc but that requires root privileges to start with   Its kind of fluke we give normal users full read access to all tunables. I never heard of our users every looking at /proc entries. I fact I doubt they even know if the lctl get_param option exist. I did test lctl devices_list locally and found also the ioctl to get the obd device list also only works with root access.

So the reason for this ticket is that Mahmoud attempted to access the "devices" file as a non-root user. Is this kosher?

Comment by James A Simmons [ 03/Jan/18 ]

Mahmoud is the only reason for changing debugfs root permissions is the device file?

Comment by Mahmoud Hanafi [ 03/Jan/18 ]

Correct the only reason we needed to change permissions was the device file, because our tools broke when it was moved. For now I wrote a work around to scan the /proc/fs/lustre and build the device list.

I think the patch here does the job by persevering existing mount options.

Typically our monitoring tools run as non-root so having read access to /proc/ is critical

-Mahmoud

Comment by James A Simmons [ 03/Jan/18 ]

Can post which lustre proc files you monitor. I'm working on a patch to restore lctl device_list as non-root.

Comment by Andreas Dilger [ 04/Jan/18 ]

I don't think it is ok to allow read access only to specific files. That means the monitoring tools need to run as root, which some sites would prefer to avoid. Also, some lfs commands may need to be able to read the /proc files as well. I don't think there is a security issue for reading most of the files, but we used to have individual permissions for each file and that would need to be reviewed.

Comment by Gerrit Updater [ 14/Jan/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30675/
Subject: LU-10444 utils: Don't remount debugfs every time
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 767f03b161ae44bd9d33dae7e03e71e73852813f

Comment by Peter Jones [ 14/Jan/18 ]

Landed for 2.11

Comment by Gerrit Updater [ 17/Jan/18 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30900
Subject: LU-10444 utils: Don't remount debugfs every time
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 4a567792a13668d1872481c36f365912253f751e

Comment by Gerrit Updater [ 09/Feb/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30900/
Subject: LU-10444 utils: Don't remount debugfs every time
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 82bf22a4200ba657465302daf6a77b9ebd6b7853

Generated at Sat Feb 10 02:35:09 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.