[LU-11850] Relocating /proc/fs/lustre/ost to /sys/kernel/debug/lustre/ost prevents non-root access Created: 10/Jan/19  Updated: 26/Jan/24

Status: In Progress
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Upstream

Type: Bug Priority: Minor
Reporter: Mahmoud Hanafi Assignee: James A Simmons
Resolution: Unresolved Votes: 1
Labels: llnl

Issue Links:
Duplicate
is duplicated by LU-11988 Some lctl *_param values only readabl... Resolved
Related
is related to LU-8066 Move lustre procfs handling to sysfs ... Open
is related to LU-9680 Improve the user land to kernel space... In Progress
is related to LU-17471 Add symlink /proc/fs/lustre/osd-*/*/b... Open
is related to LU-11988 Some lctl *_param values only readabl... Resolved
is related to LU-12244 lfs check subcommand no longer works ... Resolved
is related to LU-12513 Address YAML limitations Resolved
is related to LU-12841 Add ability for lctl get_param to get... Open
is related to LU-17472 Create python wrappers for liblustreapi Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

For security reasons /sys/kernel/debug is restrict to root only so by relocating /proc/fs/lustre/ost & mdt to /sys/kenrnel/debug/lustre breaks many tools such as 'performance co pilot" that run as non-privilege users. We rely on such tools to collect lustre metric.

We could change the permissions on /sys/kernel/debug but that is not good security practice. Can there be a build option to selected the location?



 Comments   
Comment by Jay Lan (Inactive) [ 10/Jan/19 ]

I think the question we should ask is "should the information revealed in lustre/ost and lustre/mdt be kept away from non-privileged users?" If not, maybe we can reconsider the decision and move the information back to procfs? Or other sysfs location that do not require root privilege to read?

Comment by James A Simmons [ 10/Jan/19 ]

Proc is the worst place for this kind of information. Not only is frowned on by the kernel maintainers but if we every placed a large amount of data into an seq_file, which is used by procfs, the node would be brought to its knees by the cpu load. Actually this has been under discussion on the lustre-devel mailing list. If you are talking about "stats" I'm working a netlink implementation which is way more flexible and has better performance. Give me a few days to roll something out.

Comment by James A Simmons [ 11/Jan/19 ]

BTW here is an excellent article about a solution which I'm modeling my work after.

https://lwn.net/Articles/406975

The article cover why stats in proc is bad.

Comment by Peter Jones [ 11/Jan/19 ]

ok James

Comment by Andreas Dilger [ 13/Jan/19 ]

We could change the permissions on /sys/kernel/debug but that is not good security practice.

There is no need to change the permissions for the whole of /sys/kernel/debug to be world readable. Currently, it looks like /sys/kernel/debug is itself the only directory that blocks access. It would be possible in the short term to recursively change the permissions of this tree to remove world-readable permissions ("chmod -R go-rw /sys/kernel/debug") and then enable group access permissions for the monitoring tools to the Lustre tree after the filesystem is mounted ("chmod -R g+rX /sys/kernel/debug/lustre; chgrp -R collectd /sys/kernel/debug/lustre" or similar).

Comment by Jay Lan (Inactive) [ 14/Jan/19 ]

James, this is great! Performance is very important! Thanks!

Comment by Mahmoud Hanafi [ 15/Jan/19 ]

Why is /sys/kernel/debug/lustre not located at /sys/kernel/lustre?

Comment by James A Simmons [ 15/Jan/19 ]

Because all debugfs files go into /sys/kernel/debug. That is the mount point.

Comment by Mahmoud Hanafi [ 15/Jan/19 ]

But why are we considering /sys/kenrel/debug/lustre/ost/... part of "debugging"

 

Comment by James A Simmons [ 15/Jan/19 ]

The kernel has rules about what can be in sysfs. An excellent article covering these rules is here:

https://lwn.net/Articles/378884

Since Lustre has complex data files they are not allowed in sysfs. So the quick fix done for the linux client was moving it to debugfs . The point of this policy was due to proc becoming a dumpster. Now the dumpster is debugfs  Note I have been avoiding the move of several files like stats to debugfs for the OpenSFS tree.

No fear netlink will resolve these issues. I have a prototypes partially working. I just need to work out the nesting of data. I see its the ptlrpc service stats.

Comment by James A Simmons [ 30/Jan/19 ]

I managed to get the basics working using netlink with obd stats. Just need to figure out how to link into the ptlrpc service.

Comment by Gerrit Updater [ 14/Feb/19 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34256
Subject: LU-11850 obd: use netlink to get lustre stats
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6c74e4ed15ad654b4a20925bd36b8cc0e014d34c

Comment by James A Simmons [ 14/Feb/19 ]

Just pushed a prototype patch which I'm going to use to discsuss Netlink API with other developers. It does sort of work with just md_stats but more is needed.

Comment by James A Simmons [ 14/Oct/21 ]

So I have been doing research into the different stat collectors out their. From what I see you can configure them to collect the data from the lustre utilities instead of attempting to read from the debugfs files directly. For example for collectd you would use:

<Plugin exec>

    Exec "myuser:mygroup" "myprog"

   Exec "otheruser" "/path/to/another/binary" "arg0" "arg1"

   NotificationExec "user" "/usr/lib/collectd/exec/handle_notification"

</Plugin>

Looking at LMT and performance co pilot it looks to be the same case. If we can get are utilities to work without root access we should be in good shape.

 

Comment by Gerrit Updater [ 16/Aug/23 ]

"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51959
Subject: LU-11850 lov: migrate completely to lu_tgt_descs API
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8e49bf0a866c9214ac72bb85e2c49557615a3dd4

Comment by Jean-Thomas Acquaviva [ 23/Jan/24 ]

We would need to provide access to the statistics on the client side from non-root accesses. Such access is mandatory for many performance tools running in user space.

Probably the patch is of modest size, is it possible to have a hot-fix?

Comment by Andreas Dilger [ 23/Jan/24 ]

JT, the change of /sys/kernel/debug to root-only happened in the upstream kernel after Lustre started using it, so it would need a kernel patch on all clients (AFAIK), that I don't think anyone wants.

I haven't if there is an easy way to restructure the code back to using /proc/fs/lustre to make these stats available again, but that would probably be the least disruptive code change. The other option would be to add a dedicated "lparamfs" to hold all the Lustre stats so we don't have to deal with the kernel restrictions at all.

Comment by James A Simmons [ 23/Jan/24 ]

Hello. Since you brought this up I do have a patch - https://review.whamcloud.com/c/fs/lustre-release/+/34256. After reading this I tracked down the crash I was seeing which was due to a really large message. Stats can be really huge amount of data. I need to adjust the skb. I do need to do more testing for what global_match() can handle.

Comment by James A Simmons [ 24/Jan/24 ]

I updated patch 34256. Not perfect but is mostly works now. Give it a try. Note you will need the latest Lustre to make this work.

Comment by James A Simmons [ 24/Jan/24 ]

Almost done with the patch for stats. Only bug left is if you grab ALL stats it overflows the liblnetconfig library. I do want to move the internal storage of the stats structures as an Xarray instead of a generic_radix struct.

Generated at Sat Feb 10 02:47:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.