[LU-9067] lctl dl command fails on el6 Created: 30/Jan/17  Updated: 01/Mar/17  Resolved: 01/Mar/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.10.0

Type: Bug Priority: Minor
Reporter: Bob Glossman (Inactive) Assignee: James A Simmons
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-8066 Move lustre procfs handling to sysfs ... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This problem has been seen on el6.8. The command 'lctl dl' fails.

This appears to be due to the "devices" entry used by the command being missing.
In older el6.x versions it is /proc/fs/lustre/devices.
In some newer distros, for example el7.3, it is /sys/kernel/debug/lustre/devices.

There isn't any lustre "devices" file anywhere in /proc or /sys. Don't know exactly why not.
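
A quick way to confirm the symptom is to probe both locations named above. The following is only a sketch: the function name and the optional root-prefix argument are hypothetical (the prefix exists purely so the check can be exercised against a scratch directory), and only the two paths mentioned in this ticket are tried.

```shell
# Print the first Lustre "devices" file that exists; return 1 if none does.
# $1: optional root prefix (hypothetical, for testing against a scratch tree)
find_lustre_devices() {
    root="${1:-}"
    # Candidate locations taken from the description above.
    for path in /proc/fs/lustre/devices /sys/kernel/debug/lustre/devices; do
        if [ -e "${root}${path}" ]; then
            printf '%s\n' "${root}${path}"
            return 0
        fi
    done
    return 1
}
```

On an affected el6.8 node, `find_lustre_devices` would be expected to return 1 without printing anything.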



 Comments   
Comment by Andreas Dilger [ 30/Jan/17 ]

What version of Lustre is this? It must either be the upstream kernel client, or something from recent master, for it to be in /sys/*, but they are using an old (pre 2.8) version of lctl. That was added in patch http://review.whamcloud.com/17468 "LU-5030 util: migrate liblustreapi to use cfs_get_paths()", which was landed as commit v2_7_65_0-23-g8813fdf.

Comment by Bob Glossman (Inactive) [ 30/Jan/17 ]

this is the current tip of master. tag 2.9.52

The lctl command is definitely looking in all the new locations, including /sys.
The problem is that in el6.8 there doesn't seem to be any "devices" file anywhere.

Comment by Andreas Dilger [ 30/Jan/17 ]

Could be fallout from patch https://review.whamcloud.com/23427 "LU-8066 obd: Add debugfs root" since part of the description is:

Move /proc/fs/lustre/devices to debugfs. The devices file prints out
status information about all obd devices in the system in human
readable form.

Comment by Bob Glossman (Inactive) [ 30/Jan/17 ]

Andreas,
It could very well be true that https://review.whamcloud.com/23427 breaks something. don't understand though why it works right in el7 but not in el6

Comment by Bob Glossman (Inactive) [ 30/Jan/17 ]

failure isn't only on prereleases
Seen with latest master on el6.8 too.
Strongly suspect it's broken on any el6, making it a pretty serious regression.

Same symptom, no lustre "devices" file created anywhere in /sys or /proc.

Comment by Bob Glossman (Inactive) [ 30/Jan/17 ]

fwiw on failing systems there is a /sys/kernel/debug dir, there isn't any /sys/kernel/debug/lustre dir. Don't know if that's a useful clue or not.

Comment by Dmitry Eremin (Inactive) [ 31/Jan/17 ]

Bob,
probably the debugfs is not mounted by default on RHEL 6.x. So, please use the following command to mount it.

mount -t debugfs none /sys/kernel/debug

You can add an equivalent /etc/fstab line to automatically mount it.
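
To avoid mounting debugfs twice, the mount can be guarded by a check of the kernel's mount table. This is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, ...); the function name and the second argument are hypothetical and exist only for illustration and testing.

```shell
# Return 0 if a debugfs filesystem is mounted on the given mount point.
# $1: mount point (default /sys/kernel/debug)
# $2: mount table to read (default /proc/mounts; hypothetical test hook)
debugfs_mounted() {
    mnt="${1:-/sys/kernel/debug}"
    table="${2:-/proc/mounts}"
    # Field 2 is the mount point, field 3 the filesystem type.
    awk -v mnt="$mnt" \
        '$2 == mnt && $3 == "debugfs" { found = 1 } END { exit !found }' \
        "$table"
}
```

With this in place, `debugfs_mounted || mount -t debugfs none /sys/kernel/debug` (run as root) only issues the mount when it is actually needed.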

Comment by Bob Glossman (Inactive) [ 31/Jan/17 ]

Dmitry,
You are correct. debugfs isn't mounted by default on el6. doing the manual mount command you suggest fixes the problem.

This begs some questions though:
1) debugfs is mounted by default in el7. However it isn't in /etc/fstab there. How is it done in that case? Should el6 do the same?

2) if mount of debugfs is now a requirement for lustre to operate correctly how do we ensure that it is always done on all distros?

Comment by Andreas Dilger [ 31/Jan/17 ]

I agree with Bob - if this changes the behavior out of the box then it will be problematic for users. Either we need to automatically mount debugfs in lctl if "dl" (or other commands that need debugfs content) is used, revert the patch moving this over to debugfs until the problem is fixed, or figure out some other way to handle this. I'm surprised that this didn't cause any test failures during landing, but I guess that means there is no test that checks the output of "lctl dl" (yet).

In ancient days we used to get the "dl" content via ioctl(), but for large filesystems that caused problems due to the size of the output.

Comment by James A Simmons [ 31/Jan/17 ]

Ugh. I would suggest that we call mount() in lctl.c to handle this. Too many patches have already landed for this to be reverted. Let me patch it up. We might need to back port this to earlier lustre versions as well.

Comment by Joseph Gmitter (Inactive) [ 31/Jan/17 ]

Hi James,

Assigning to you per your commentary above.

Thanks.
Joe

Comment by Gerrit Updater [ 31/Jan/17 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/25182
Subject: LU-9067 utils: ensure debugfs is mounted
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8aac94c0e186bc55b8987df68343af0f56efa48d

Comment by Bob Glossman (Inactive) [ 31/Jan/17 ]

found the answer to my question 1) above.
On el7 automount of debugfs is done by systemd, and controlled by the config file /usr/lib/systemd/system/sys-kernel-debug.mount

For el6 the same thing could probably be done in an init.d file for debugfs

Comment by Bob Glossman (Inactive) [ 31/Jan/17 ]

James,
Not sure I like your solution. The patch seems to work, but by calling mount() directly debugfs isn't added to mtab. Therefore it doesn't show up as visibly mounted in the 'mount' command. It does appear in /proc/mounts.

Comment by Bob Glossman (Inactive) [ 31/Jan/17 ]

I note that on el6 mount for /proc is accomplished with a line in /etc/fstab.
Seems reasonable to do the same for debugfs.
Maybe a little applet done during install of the lustre.rpm? only needed for el6.

Comment by James A Simmons [ 31/Jan/17 ]

The init.d solution only handles node bring up. What happens if for some reason debugfs is umounted long after the node has been up? That is why I did my approach.

Comment by Bob Glossman (Inactive) [ 31/Jan/17 ]

afaik nothing prevents a root privileged user from unmounting /proc either.
My feeling is if a superuser shoots themselves in the foot it serves them right.

I just prefer an approach that is less intrusive than doing something extra on nearly every lfs or lctl invocation.

I'm thinking of maybe altering /etc/fstab at installation time, and doing it only on el6 and only if not already done. If a debugfs line is added then also do a 'mount debugfs' right then. It then is mounted at boot time ever after. No impact on lustre libs or utils. Just a suggestion.
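
The install-time idea above could look roughly like the following. This is a sketch only, not something from the patch: the function name is hypothetical, the el6-only check is left out, and the appended line is the conventional fstab form of the manual mount command. Existing lines are never rewritten, so local site edits are preserved.

```shell
# Append a debugfs entry to an fstab file unless one is already present.
# $1: fstab path (default /etc/fstab; parameter is a hypothetical test hook)
ensure_debugfs_fstab() {
    fstab="${1:-/etc/fstab}"
    # Ignore comment lines; field 3 of an fstab entry is the filesystem type.
    if awk '$1 !~ /^#/ && $3 == "debugfs" { found = 1 } END { exit !found }' \
        "$fstab"; then
        return 0
    fi
    # Make sure the new entry starts on its own line.
    if [ -s "$fstab" ] && [ -n "$(tail -c 1 "$fstab")" ]; then
        printf '\n' >> "$fstab"
    fi
    printf '%s\n' 'debugfs /sys/kernel/debug debugfs defaults 0 0' >> "$fstab"
}
```

Calling it repeatedly is harmless, since the entry is only added once. An install script would follow a successful append with `mount /sys/kernel/debug` so the filesystem is available before the next reboot.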

Comment by James A Simmons [ 01/Feb/17 ]

True, an admin can umount /proc too. Okay, let's go with the special script at startup. I can test it on a RHEL6 client for you.

Comment by James A Simmons [ 02/Feb/17 ]

Bob have you come up with a boot script yet?

Comment by Bob Glossman (Inactive) [ 02/Feb/17 ]

James,
I thought you were going to do it. I only made a suggestion for an approach.

My thinking is no boot script is needed. Once /etc/fstab is modified debugfs would always get mounted at boot time by current existing scripts.

Comment by James A Simmons [ 02/Feb/17 ]

Okay. I misunderstood. I updated the lnet init.d startup script for RHEL6.8 to mount debugfs. Can you try the last version of my previous patch?

Comment by Bob Glossman (Inactive) [ 02/Feb/17 ]

James,
Your change to the lnet script works, but I see 2 problems with it:

1) not everybody uses the lnet script to startup lustre
2) if a permanent change to /etc/fstab has been done by any method then debugfs is already mounted. This leads to errors like:

# service lnet start
mount: none already mounted or /sys/kernel/debug busy
mount: according to mtab, debugfs is already mounted on /sys/kernel/debug
LNET configured

Comment by James A Simmons [ 02/Feb/17 ]

Does anyone use any of those scripts? It's just that we cannot replace fstab when installing lustre. fstab can be very site specific. Somewhere, somehow, debugfs has to be mounted. Any suggestions besides the whole idea of nuking a site's fstab file? Also we need debugfs available in the case of routers, which will only have lnet installed. Perhaps my libcfs code is the best option.

Comment by Bob Glossman (Inactive) [ 02/Feb/17 ]

I favor the idea of editing /etc/fstab on the fly at install time, adding a line for debugfs. Do it only on el6, do it only if such a line is not already there. This preserves any local site fstab edits or changes. Just not sure how to accomplish that.

I don't favor complete replacement of /etc/fstab ever.

Comment by James A Simmons [ 02/Feb/17 ]

So most people don't use the provided startup scripts that come with lustre?

Comment by Bob Glossman (Inactive) [ 02/Feb/17 ]

I believe actual practice in real installations varies quite a bit. I have seen many sites that don't use them at all. Personally I don't in my own test setups. I think they were initially done as examples, not as required must use features. They have existed for a long time.

afaik, the most common use of the lnet startup script is on routers. It's an easy way to get the needed kernel modules loaded reliably at boot time. On routers there are typically no mount or other lustre activities that would get modules loaded otherwise.

Comment by James A Simmons [ 02/Feb/17 ]

Looking at our own systems we manage fstab with puppet so any changes at install time will be stomped on soon after. I don't think modifying fstab is going to work. If people don't use the startup script then we are going to have to go with the libcfs library mounting debugfs for us. We just need to do it one time. 

Comment by Bob Glossman (Inactive) [ 02/Feb/17 ]

not sure what "manage fstab with puppet" means. if you have external methods to change and maintain fstab, how do you do other site specific changes, for example adding nfs client mounts? maybe in such cases a debugfs mount can be added by an admin.

Comment by James A Simmons [ 03/Feb/17 ]

Does SLES11 have this issue also? I see on my Cray system it's mounted but I wonder in general. I have an idea!!! What about calling sys_mount when the libcfs module loads? We can make it conditional only for RHEL6 and that way it only would happen at module load. Does that sound reasonable?

Comment by Bob Glossman (Inactive) [ 03/Feb/17 ]

not an issue on SLES11 or SLES12. debugfs mounted there. el6 is the only context I can find where it's not mounted by default.

In sles11 it's mounted via fstab.
In sles12 it's mounted via systemd, just like el7.

Comment by Bob Glossman (Inactive) [ 03/Feb/17 ]

James,
Back to my original criticism of your original fix on your latest rev:

The patch seems to work, but by calling mount() directly debugfs isn't added to mtab. Therefore it doesn't show up as visibly mounted in the 'mount' command. It does appear in /proc/mounts.

I see nothing that restricts it to el6 or happening only at module load time, as mentioned in your comment (above).

Comment by James A Simmons [ 03/Feb/17 ]

Newer distros symlink mtab to /proc/mounts but that is not the case for RHEL6. Luckily there is a function to add entries to mtab.

Comment by Bob Glossman (Inactive) [ 03/Feb/17 ]

yes, el6 is old school. Still maintains mtab as a real, separate file. Not linked to /proc/mounts. All distros used to be that way.

Comment by James A Simmons [ 06/Feb/17 ]

I updated the patch so if mtab is a real file it will add a debugfs entry.
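
The "is mtab a real file" decision can be illustrated from the shell. This is a sketch of the check only, not the code in the patch; the function name and its path argument are hypothetical. In C, glibc's addmntent(3) is one function that can append an entry to such a file, though this ticket does not say which call the patch uses.

```shell
# Return 0 if the mount table is a regular file that has to be updated
# separately; return 1 if it is a symlink (e.g. to /proc/mounts) or missing.
# $1: path to check (default /etc/mtab; parameter is a hypothetical test hook)
mtab_is_regular_file() {
    mtab="${1:-/etc/mtab}"
    # -f follows symlinks, so -L is needed to rule them out.
    [ -f "$mtab" ] && [ ! -L "$mtab" ]
}
```

On el6 this would be expected to succeed, which is the case where a direct mount() call leaves the 'mount' command's output out of sync with /proc/mounts.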

Comment by Bob Glossman (Inactive) [ 27/Feb/17 ]

more on master:
https://testing.hpdd.intel.com/test_sets/ad39ea70-fc88-11e6-b542-5254006e85c2
https://testing.hpdd.intel.com/test_sets/06f1473c-fd17-11e6-a77a-5254006e85c2

I think this problem is blocking all el6 tests on master atm.
The only reason it isn't causing more fallout is the fact that el7 is the default test distro for master. Not a problem on el7.

Comment by James A Simmons [ 27/Feb/17 ]

Should be landing very soon

Comment by Gerrit Updater [ 01/Mar/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25182/
Subject: LU-9067 utils: ensure debugfs is mounted
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e53bbbc510f9ac96f2556131c405c7e5c749cc27

Comment by James A Simmons [ 01/Mar/17 ]

Patch has landed

Generated at Sat Feb 10 02:22:58 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.