
Relocating /proc/fs/lustre/ost to /sys/kernel/debug/lustre/ost prevents non-root access

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • Upstream
    • Lustre 2.12.0
    • 3

    Description

      For security reasons, /sys/kernel/debug is restricted to root only, so relocating /proc/fs/lustre/ost and mdt to /sys/kernel/debug/lustre breaks many tools, such as Performance Co-Pilot, that run as non-privileged users. We rely on such tools to collect Lustre metrics.

      We could change the permissions on /sys/kernel/debug, but that is not good security practice. Can there be a build option to select the location?
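A quick way to confirm the symptom on a given node is to check, as the unprivileged user the monitoring tool runs as, whether the two locations named above are readable. This is only a minimal sketch: the helper names are hypothetical, and the paths are the ones quoted in this report, which may not both exist on every Lustre version.

```python
import os


def readable_by_current_user(path):
    """Return True if the calling user can read `path`.

    For a directory, search (execute) permission is also required,
    otherwise its entries cannot be opened.
    """
    mode = os.R_OK | os.X_OK if os.path.isdir(path) else os.R_OK
    return os.access(path, mode)


def report(paths):
    """Map each path to whether the current user can read it."""
    return {path: readable_by_current_user(path) for path in paths}


# Paths taken from this report; adjust them for the node being checked.
LUSTRE_STAT_ROOTS = ["/proc/fs/lustre", "/sys/kernel/debug/lustre"]
```

Running `report(LUSTRE_STAT_ROOTS)` as the monitoring user should show `False` for the debugfs path on an affected system. A missing path also reports `False`, so the result does not by itself distinguish "absent" from "forbidden".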

          Activity

            [LU-11850] Relocating /proc/fs/lustre/ost to /sys/kernel/debug/lustre/ost prevents non-root access

            gerrit Gerrit Updater added a comment:

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/34256/
            Subject: LU-11850 obd: use netlink to get lustre stats
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5803284ac3a5d477df9afffe48ff35f08d67da1a

            jtacquaviva Jean-Thomas Acquaviva added a comment:

            Hi James,

            We conducted some tests, and it seems that still only root can access the stats. Please find a log attached.

            log_stats.txt
            simmonsja James A Simmons added a comment (edited):

            Some timing numbers for lctl get_param ..stats

            root (proc file):

            real    0m0.002s
            user    0m0.000s
            sys     0m0.002s

            normal user (netlink):

            real    0m0.006s
            user    0m0.003s
            sys     0m0.003s

            gerrit Gerrit Updater added a comment:

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53994
            Subject: LU-11850 obd: debug failure
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ef85e2655bf3d838970c7473c38849330802ec2c

            simmonsja James A Simmons added a comment:

            I am almost done with the patch for stats. The only bug left is that grabbing ALL stats overflows the liblnetconfig library. I also want to move the internal storage of the stats structures to an Xarray instead of a generic_radix struct.

            simmonsja James A Simmons added a comment:

            I updated patch 34256. It is not perfect, but it mostly works now. Give it a try. Note that you will need the latest Lustre to make this work.

            simmonsja James A Simmons added a comment:

            Hello. Since you brought this up, I do have a patch (https://review.whamcloud.com/c/fs/lustre-release/+/34256). After reading this I tracked down the crash I was seeing, which was due to a really large message. Stats can be a really huge amount of data, so I need to adjust the skb. I also need to do more testing of what global_match() can handle.

            adilger Andreas Dilger added a comment:

            JT, the change of /sys/kernel/debug to root-only happened in the upstream kernel after Lustre started using it, so reverting it would need a kernel patch on all clients (AFAIK), which I don't think anyone wants.

            I haven't checked whether there is an easy way to restructure the code back to using /proc/fs/lustre to make these stats available again, but that would probably be the least disruptive code change. The other option would be to add a dedicated "lparamfs" to hold all the Lustre stats so we don't have to deal with the kernel restrictions at all.

            jtacquaviva Jean-Thomas Acquaviva added a comment:

            We need the client-side statistics to be accessible to non-root users. Such access is mandatory for many performance tools running in user space.

            The patch is probably of modest size; would it be possible to have a hot-fix?

            gerrit Gerrit Updater added a comment:

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51959
            Subject: LU-11850 lov: migrate completely to lu_tgt_descs API
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8e49bf0a866c9214ac72bb85e2c49557615a3dd4

            simmonsja James A Simmons added a comment:

            I have been researching the different stat collectors out there. From what I see, you can configure them to collect the data from the Lustre utilities instead of attempting to read the debugfs files directly. For example, for collectd you would use:

            <Plugin exec>
                Exec "myuser:mygroup" "myprog"
                Exec "otheruser" "/path/to/another/binary" "arg0" "arg1"
                NotificationExec "user" "/usr/lib/collectd/exec/handle_notification"
            </Plugin>

            Looking at LMT and Performance Co-Pilot, the same appears to be true. If we can get our utilities to work without root access, we should be in good shape.
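As an illustration of the "myprog" slot in the collectd configuration above, the following is a rough sketch of a helper that an Exec line could launch. It assumes that `lctl get_param` is usable by the unprivileged user (which is what the netlink work in this ticket is meant to enable), that the stats text uses the common `<name> <count> samples [<unit>] ...` layout, and that the `lustre` plugin name and `counter` type suit the local collectd setup. All function names here are hypothetical.

```python
import subprocess


def parse_stats(text):
    """Parse Lustre 'stats' text into {counter_name: sample_count}.

    Assumes lines of the form '<name> <count> samples [<unit>] ...'.
    The snapshot_time line and anything else that does not match is skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "samples" and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def putval_lines(counters, host, interval, instance="client"):
    """Format counters as PUTVAL commands for the collectd exec plugin."""
    return [
        f'PUTVAL "{host}/lustre-{instance}/counter-{name}" '
        f"interval={interval} N:{value}"
        for name, value in sorted(counters.items())
    ]


def read_stats(param):
    """Run `lctl get_param -n <param>` and return its standard output."""
    result = subprocess.run(
        ["lctl", "get_param", "-n", param],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A wrapper script would call `read_stats()` in a loop, print the lines from `putval_lines()` to standard output, and flush after each batch. collectd exports `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` to exec scripts, which can supply `host` and `interval`. A wildcard parameter that matches several devices returns concatenated output, so identical counter names would overwrite each other in this simple parser.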

            People

              simmonsja James A Simmons
              mhanafi Mahmoud Hanafi
              Votes: 1
              Watchers: 17

              Dates

                Created:
                Updated: