Relocating /proc/fs/lustre/ost to /sys/kernel/debug/lustre/ost prevents non-root access

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • Upstream
    • Lustre 2.12.0
    • 3

    Description

      For security reasons, /sys/kernel/debug is restricted to root only, so relocating /proc/fs/lustre/ost and mdt to /sys/kernel/debug/lustre breaks many tools, such as Performance Co-Pilot, that run as non-privileged users. We rely on such tools to collect Lustre metrics.

      We could change the permissions on /sys/kernel/debug, but that is not good security practice. Can there be a build option to select the location?
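
      To check which of the two locations named in this ticket a given account can actually read, a minimal sketch like the following may help. It only probes the paths mentioned above; the function name and the candidate list are illustrative, and which directories exist depends on the Lustre version and kernel in use.

```python
import os

# Candidate locations taken from this ticket. Whether they exist at all
# depends on the Lustre version and kernel (assumption, not verified here).
CANDIDATE_DIRS = (
    "/proc/fs/lustre/ost",
    "/sys/kernel/debug/lustre/ost",
)


def readable_stat_dirs(candidates=CANDIDATE_DIRS):
    """Return the candidate directories the current user can list and enter."""
    return [
        d for d in candidates
        if os.path.isdir(d) and os.access(d, os.R_OK | os.X_OK)
    ]


if __name__ == "__main__":
    found = readable_stat_dirs()
    if found:
        for path in found:
            print("readable:", path)
    else:
        print("no candidate directory is readable by uid", os.getuid())
```

      Run as the unprivileged account used by the monitoring tool, this shows whether that account can reach either tree.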

          Activity

            [LU-11850] Relocating /proc/fs/lustre/ost to /sys/kernel/debug/lustre/ost prevents non-root access

            simmonsja James A Simmons added a comment:

            I updated patch 34256. It is not perfect, but it mostly works now. Give it a try. Note that you will need the latest Lustre to make this work.

            simmonsja James A Simmons added a comment:

            Hello. Since you brought this up, I do have a patch: https://review.whamcloud.com/c/fs/lustre-release/+/34256. After reading this I tracked down the crash I was seeing, which was due to a really large message. Stats can be a really huge amount of data, so I need to adjust the skb. I also need to do more testing of what global_match() can handle.

            adilger Andreas Dilger added a comment:

            JT, the change of /sys/kernel/debug to root-only happened in the upstream kernel after Lustre started using it, so undoing it would need a kernel patch on all clients (AFAIK), which I don't think anyone wants.

            I haven't checked whether there is an easy way to restructure the code back to using /proc/fs/lustre to make these stats available again, but that would probably be the least disruptive code change. The other option would be to add a dedicated "lparamfs" to hold all the Lustre stats so we don't have to deal with the kernel restrictions at all.

            jtacquaviva Jean-Thomas Acquaviva added a comment:

            We would need to provide non-root access to the statistics on the client side. Such access is mandatory for many performance tools running in user space.

            The patch is probably of modest size. Is it possible to have a hot-fix?

            gerrit Gerrit Updater added a comment:

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51959
            Subject: LU-11850 lov: migrate completely to lu_tgt_descs API
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8e49bf0a866c9214ac72bb85e2c49557615a3dd4

            simmonsja James A Simmons added a comment:

            I have been doing research into the different stat collectors out there. From what I see, you can configure them to collect the data from the Lustre utilities instead of attempting to read from the debugfs files directly. For example, for collectd you would use:

            <Plugin exec>
                Exec "myuser:mygroup" "myprog"
                Exec "otheruser" "/path/to/another/binary" "arg0" "arg1"
                NotificationExec "user" "/usr/lib/collectd/exec/handle_notification"
            </Plugin>

            Looking at LMT and Performance Co-Pilot, it looks to be the same case. If we can get our utilities to work without root access, we should be in good shape.
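
            As an illustration of the collectd approach described in the comment above, the following is a minimal sketch of what a program run via Exec might do: call a Lustre utility, parse its output, and print values in collectd's PUTVAL text format. Several things here are assumptions rather than anything stated in this ticket: the `lctl get_param -n obdfilter.*.stats` invocation, the `name count samples [unit] ...` stats line layout, the `lustre-ost` plugin instance name, and all function names. Counters with the same name are naively summed across every matched target.

```python
import subprocess


def parse_lustre_stats(text):
    """Sum the sample counts per counter name.

    Assumes lines shaped like 'read_bytes 100 samples [bytes] min max sum';
    anything else (for example the snapshot_time line) is ignored.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "samples" and fields[1].isdigit():
            counters[fields[0]] = counters.get(fields[0], 0) + int(fields[1])
    return counters


def putval_lines(host, counters, interval, now):
    """Format counters as collectd exec-plugin PUTVAL lines."""
    return [
        f'PUTVAL "{host}/lustre-ost/counter-{name}" interval={interval} {now}:{value}'
        for name, value in sorted(counters.items())
    ]


def read_stats(param="obdfilter.*.stats"):
    """Run lctl (assumed to be on PATH) and return its raw output."""
    result = subprocess.run(
        ["lctl", "get_param", "-n", param],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

            A wrapper script would call read_stats() once per interval, pass the result through parse_lustre_stats() and putval_lines(), and print each line to stdout with flushing enabled. This only helps if the utility itself works for the unprivileged user named in the Exec line, which is the open question in this ticket.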

            simmonsja James A Simmons added a comment:

            I just pushed a prototype patch, which I'm going to use to discuss the Netlink API with other developers. It does sort of work with just md_stats, but more is needed.

            gerrit Gerrit Updater added a comment:

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34256
            Subject: LU-11850 obd: use netlink to get lustre stats
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6c74e4ed15ad654b4a20925bd36b8cc0e014d34c

            simmonsja James A Simmons added a comment:

            I managed to get the basics working using netlink with obd stats. I just need to figure out how to link into the ptlrpc service.
            simmonsja James A Simmons added a comment (edited):

            The kernel has rules about what can be in sysfs. An excellent article covering these rules is here:

            https://lwn.net/Articles/378884

            Since Lustre has complex data files, they are not allowed in sysfs. So the quick fix done for the Linux client was moving them to debugfs. This policy came about because proc was becoming a dumpster. Now the dumpster is debugfs. Note that I have been avoiding the move of several files, like stats, to debugfs for the OpenSFS tree.

            No fear, netlink will resolve these issues. I have a prototype partially working. I just need to work out the nesting of data. I see it's the ptlrpc service stats.

            mhanafi Mahmoud Hanafi added a comment:

            But why are we considering /sys/kernel/debug/lustre/ost/... part of "debugging"?

            People

              simmonsja James A Simmons
              mhanafi Mahmoud Hanafi
              Votes: 1
              Watchers: 17