[LU-12244] lfs check subcommand no longer works as non-root user Created: 29/Apr/19 Updated: 15/Jul/19 Resolved: 03/Jul/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Cameron Harr | Assignee: | Peter Jones |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
Client/Server: lustre-2.10.6_2.chaos-1.ch6.x86_64 |
||
| Severity: | 3 | ||||||||||||
| Description |
|
With the Lustre 2.10 client, non-root users appear to no longer be able to run "lfs check <servers|osts|mds>". An strace of the command shows a "Permission denied" error accessing /sys/kernel/debug/lustre/devices, resulting in the following error at runtime:
error: check: mds status failed
Our Operations staff without root access need "lfs check ..." functionality to monitor and fix file system issues, so fixing this issue would be helpful for us.
statfs("/sys/kernel/debug/", {f_type=DEBUGFS_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0
stat("/sys/fs/lnet/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
stat("/sys/fs/lustre/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
stat("/sys/kernel/debug/lnet/devices", 0x7fffffff8740) = -1 EACCES (Permission denied)
stat("/sys/kernel/debug/lustre/devices", 0x7fffffff8740) = -1 EACCES (Permission denied)
stat("/proc/fs/lnet/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
stat("/proc/fs/lustre/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
stat("/proc/sys/lnet/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
stat("/proc/sys/lustre/devices", 0x7fffffff8760) = -1 ENOENT (No such file or directory)
write(2, "error: check: mds status failed\n", 32error: check: mds status failed
) = 32
exit_group(2) = ?
+++ exited with 2 +++
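To reproduce the probe sequence from the strace without running lfs, a minimal sketch like the following may help. It stats the same candidate paths in the same order and reports the errno name for each. The helper names (`classify_paths`, `blocked_by_permissions`) are hypothetical and not part of any Lustre tool. The path list is copied from the trace above and may differ on other Lustre versions.

```python
import errno
import os

# Candidate locations copied from the strace above, in the order they were probed.
CANDIDATE_PATHS = [
    "/sys/fs/lnet/devices",
    "/sys/fs/lustre/devices",
    "/sys/kernel/debug/lnet/devices",
    "/sys/kernel/debug/lustre/devices",
    "/proc/fs/lnet/devices",
    "/proc/fs/lustre/devices",
    "/proc/sys/lnet/devices",
    "/proc/sys/lustre/devices",
]


def classify_paths(paths, stat_fn=os.stat):
    """Return (path, status) pairs; status is 'ok' or an errno name such as 'EACCES'."""
    results = []
    for path in paths:
        try:
            stat_fn(path)
        except OSError as exc:
            # Map the numeric errno to its symbolic name, as strace prints it.
            results.append((path, errno.errorcode.get(exc.errno, str(exc.errno))))
        else:
            results.append((path, "ok"))
    return results


def blocked_by_permissions(results):
    """Paths that exist but are hidden from the caller (EACCES rather than ENOENT)."""
    return [path for path, status in results if status == "EACCES"]


if __name__ == "__main__":
    for path, status in classify_paths(CANDIDATE_PATHS):
        print(f"{status:8} {path}")
```

Run as the affected non-root user, any path reported as EACCES rather than ENOENT is one the tool could have used if permissions allowed it.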
|
| Comments |
| Comment by James A Simmons [ 30/Apr/19 ] |
|
We have fixes for that which landed to newer lustre versions. You need patch: |
| Comment by Andreas Dilger [ 30/Apr/19 ] |
|
The move of the /proc/fs/lustre/devices file from procfs to debugfs was done as part of patch https://review.whamcloud.com/23428 "LU-8066 obdclass: move lustre sysctl to sysfs" landed for 2.9.56, so it has been in all 2.10 releases. I suspect the reason it is a problem now is that RedHat has backported a change from newer kernels to their kernel that makes debugfs root-only. |
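One way to check whether a given client is affected is to look at the permission bits on the debugfs mount point. The sketch below assumes that "root-only" shows up as a mode with no group or other bits set (for example 0700) on /sys/kernel/debug. That is an assumption about how the restriction appears, not something confirmed in this ticket. The function names are hypothetical.

```python
import os
import stat


def is_root_only(mode):
    """True when neither group nor others have any permission bits set."""
    return (stat.S_IMODE(mode) & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def debugfs_root_only(path="/sys/kernel/debug"):
    """Check the mount point itself; assumes debugfs is mounted at `path`."""
    return is_root_only(os.stat(path).st_mode)
```

If `debugfs_root_only()` returns True, a non-root process cannot traverse into /sys/kernel/debug/lustre, which would match the EACCES results in the description.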
| Comment by Andreas Dilger [ 30/Apr/19 ] |
|
I've cherry-picked the patch to b2_10: |
| Comment by Cameron Harr [ 30/Apr/19 ] |
|
Thank you both. My search for related tickets missed LU-11850, which is very similar. Sparing Redhat from some blame, we only recently started rolling out 2.10 clients (from 2.8) so this is only affecting us now. |
| Comment by James A Simmons [ 01/May/19 ] |
|
LU-11850 only impacts 2.12 LTS users. |
| Comment by Cameron Harr [ 01/May/19 ] |
|
... And I had searched only on 2.10. Thanks. |
| Comment by Andreas Dilger [ 01/May/19 ] |
|
Cameron, did you try out the patch for b2_10? Did it solve your problem? It only affects the userspace tools on the client, so you wouldn't need to upgrade all of the kernel modules or take an outage to install it. |
| Comment by Olaf Faaland [ 01/May/19 ] |
|
Andreas, we haven't tried it, but we can do so today and post back. |
| Comment by Olaf Faaland [ 01/May/19 ] |
|
Yep, that patch worked without any other patches or modification. |
| Comment by Olaf Faaland [ 01/May/19 ] |
|
For our own recordkeeping, our local ticket: TOSS-4503 |
| Comment by Peter Jones [ 03/Jun/19 ] |
|
I believe that this issue is fixed in both 2.10.8 and 2.12.2, so the ticket can now be considered RESOLVED. |
| Comment by Olaf Faaland [ 10/Jun/19 ] |
|
Peter,
[faaland1@hefe branch:2.10.6-llnl lustre-210] $git lg wcrev/b2_12 | grep LU-8066 | grep iterate
[faaland1@hefe branch:2.10.6-llnl lustre-210] $git lg wcrev/b2_10 | grep LU-8066 | grep iterate
* e55cd4f LU-8066 utils: have llapi_target_iterate use sysfs tree
https://review.whamcloud.com/#/c/34781/
Has a +2 but has not landed due to undiagnosed test failures, last activity May 8. |
| Comment by Peter Jones [ 10/Jun/19 ] |
|
You are correct - sorry. |
| Comment by Peter Jones [ 01/Jul/19 ] |
|
I think that once the current b2_12-next branch lands, this really will be complete. |
| Comment by Peter Jones [ 03/Jul/19 ] |
|
Ok - now this is fixed on b2_12 |