[LU-11074] Invalid argument reading file caps Created: 07/Jun/18  Updated: 03/Aug/18  Resolved: 18/Jul/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.4
Fix Version/s: Lustre 2.12.0, Lustre 2.10.5

Type: Bug Priority: Minor
Reporter: SC Admin (Inactive) Assignee: John Hammond
Resolution: Fixed Votes: 0
Labels: None
Environment:

centos 7.5, x86_64, OPA, zfs 0.7.9


Issue Links:
Related
is related to LU-11107 getxattr() returns 0 length values fo... Resolved
is related to LU-11123 LustreError in ll_xattr_list() server... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

2.10.4 client seems to have introduced a regression from 2.10.3.

we now see messages like these from clients:

Jun  7 06:33:32 john73 kernel: Invalid argument reading file caps for /home/fstars/dwf_prepipe/dwf_prepipe_processccd.py
Jun  7 10:55:40 bryan8 kernel: Invalid argument reading file caps for /bin/date
Jun  7 11:05:29 john75 kernel: Invalid argument reading file caps for /usr/bin/basename
Jun  7 11:51:29 john97 kernel: Invalid argument reading file caps for /usr/bin/id
Jun  7 11:51:29 john97 kernel: Invalid argument reading file caps for /apps/lmod/lmod/lmod/libexec/addto

the upshot of which is that those files then can't be exec'd by the kernel.

all our servers are now centos 7.4 and 2.10.4 + LU10988 lfsck patch, zfs 0.7.9.
we have 4 lustre filesystems in the cluster and this 'file caps' issue happens on them all, more often on the root filesystem because there are more exe's there.

for some files it seems to happen on all clients and be persistent eg. all the 2.10.4 client nodes see this

[root@john72 ~]# g++
-bash: /usr/bin/g++: Invalid argument
[root@john72 ~]# dmesg | tail -1
[616489.562465] Invalid argument reading file caps for /usr/bin/g++

and for other files it's transient. eg. the exe's on the nodes listed above all work again now

[root@john97 ~]# /usr/bin/id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)

g++ is interesting because it's hard-linked 4 times (to c++, ...), which might be part of why it persists. copying each of c++, g++, etc. to a separate (non-hardlinked) file is a workaround and lets them be exec'd again, but that doesn't explain all the other files that sometimes work and sometimes don't.
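
the workaround, roughly, per hard-linked name (paths and names illustrative):

for f in g++ c++; do
    cp -p /usr/bin/$f /usr/bin/$f.tmp   # independent copy with its own inode (link count 1)
    mv /usr/bin/$f.tmp /usr/bin/$f      # swap the copy in over the hard-linked name
done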

apart from things like g++, the problem is rare, less than once per client per day.

as a workaround (so we can get all clients onto the more secure centos7.5) we'd like to run 2.10.3 on centos7.5 for a while, but it doesn't seem to work (it appears to mount, but then ls says 'not a directory'). I don't suppose there's a patch or two that'll let 2.10.3 be functional on centos7.5? thanks.

cheers,
robin



 Comments   
Comment by John Hammond [ 07/Jun/18 ]

Hi Robin,

Are you using any Linux Security Modules? Could you enable full debugging, clear the debug log, reproduce this, dump the log and attach? (You may need to increase the debug_mb parameter to get a full capture.)
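
For reference, a capture sequence along these lines should work; the debug_mb value is only illustrative, pick one large enough to cover the reproduction window:

lctl set_param debug=-1          # enable all debug flags
lctl set_param debug_mb=1024     # enlarge the debug buffer (in MB)
lctl clear                       # drop anything already in the log
# ... reproduce the failure here ...
lctl dk /tmp/lustre-debug.log    # dump the kernel debug log to a file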

Comment by SC Admin (Inactive) [ 07/Jun/18 ]

Hey John,

no, not using any LSM.

I'll gather the debug logs for eg. g++ when a node is clear of jobs, otherwise there'll be lots of noise.

cheers,
robin

Comment by John Hammond [ 08/Jun/18 ]

Which 7.5 kernel are you using?

Comment by Peter Jones [ 08/Jun/18 ]

Robin

Any idea how long it will take to get the debug logs?

Peter

Comment by SC Admin (Inactive) [ 08/Jun/18 ]

we're using the 862.3.2 kernel, the latest AFAIK.

I'm being hesitant about debug logs 'cos I'm not 100% convinced it's a lustre bug. we definitely don't see this issue with rhel7.4 + 2.10.3, but the complication is that we use overlayfs over our root lustre filesystem.

overlayfs changed a lot between 7.4 and 7.5 and I've re-patched it etc, but it might still be an overlayfs bug, or an overlayfs interaction with lustre that's now different, rather than a pure lustre bug.

the thing that indicates it's maybe a real lustre issue is that we see the 'file caps' problem on all filesystems - /home, /apps, /fred(dagg) - and not just on /images (which is the only one with overlayfs over it).

AFAIK the only thing these 4 filesystems share is the root inode, which is on overlayfs. it seems really unlikely that the node is healthy for all accesses via the root inode/dentry while at the same time seeing 'file caps' fail on one of the pure lustre filesystems, but I wanted to try a few things first, eg. patching the rhel 7.5 kernel with a bunch of stable capabilities namespace backports that rhel seems to have omitted... unfortunately that didn't fix it.

the g++ 'file caps' bug (the one that's trivial to reproduce) doesn't happen if I go directly to lustre, so there's definitely something wrong with overlayfs. I was sure I'd tried this before making this bug report, but I guess not.
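
ie. something like this (mount points hypothetical):

getcap /images/usr/bin/g++        # via the overlayfs root: Invalid argument
getcap /mnt/images/usr/bin/g++    # the same file straight off lustre: no error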

however, g++ failing via overlayfs and working via lustre doesn't explain the much rarer failures direct to lustre on the other 3 filesystems (+/- that shared root inode). but I can't reproduce those at will - they're rare. so I don't see how I can get you a debug trace for those.

I can't figure out from 'git log v2_10_3..v2_10_4' on b2_10 which patch(es) make the lustre client work with rhel7.5's kernel. if there are one or two you can point me at, that would help.
the reason I ask is that if 2.10.3 is busted with rhel7.5 too, then it's a rhel7.5 kernel issue and nothing to do with lustre.

cheers,
robin

Comment by Andreas Dilger [ 11/Jun/18 ]

If you can't find which patch is the source of the problem, I'd suggest using git bisect with your "good" reproducer (possibly running it multiple times to ensure you don't get a false pass) to isolate the issue to a single patch. That will allow us to identify which patch introduced the problem and possibly see how it is interacting badly with overlayfs.
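
Roughly, using the tags you mentioned:

cd lustre-release
git bisect start v2_10_4 v2_10_3   # the bad tag first, then the known-good tag
# at each step: rebuild and reinstall the client, run the reproducer a few
# times, then mark the revision
git bisect good                    # or 'git bisect bad'
git bisect reset                   # restore the original HEAD when done

If the reproducer can be wrapped in a script that exits non-zero on failure, 'git bisect run ./check.sh' (check.sh being a hypothetical wrapper) automates the stepping.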

Comment by Peter Jones [ 11/Jun/18 ]

Lai

Can you please investigate?

Thanks

Peter

Comment by Peter Jones [ 11/Jun/18 ]

Sorry - Lai, I intended that comment for another ticket

Comment by SC Admin (Inactive) [ 11/Jun/18 ]

Hi,

thanks for the activity on the bug, it is much appreciated. but unless you have a solid suspicion of what's wrong, please don't work on this for now.

I built 2.10.4 for centos7.4 on the weekend and have been rebooting clients into it since.

hopefully I can work out from that if 'file caps' is a lustre 2.10.4 issue or a rhel7.5 kernel + overlayfs issue.

sorry, I should have thought of doing that before...

cheers,
robin

Comment by SC Admin (Inactive) [ 27/Jun/18 ]

Hi,

I've finally had some time to look into this again. seems there's a regression with Lustre on the rhel/centos 7.5 kernel.

the rhel/centos 7.4 kernel is fine, but the 7.5 kernel breaks Lustre when getting file capabilities from files with lots of hard links.

a reproducer is:

# echo blah > a
# getcap a
# for f in {b..f}; do ln a $f; done
# getcap a
Failed to get capabilities of file `a' (Invalid argument)
# cat /sys/fs/lustre/version 
2.10.4
# uname -a
Linux john5 3.10.0-862.3.3.el7.x86_64 #1 SMP Fri Jun 15 04:15:27 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

our 'real world' example is a g++ exe on Lustre with 4 hard links which always fails 'getcap', but the above reproducer (on a different Lustre fs with more MDTs) required more than 4 hard links to see the same problem.

I went out to >200 hard links with the same reproducer on Lustre 2.10.4 and the centos 7.4 kernel, and it was fine.
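
fwiw the failure should also be visible directly on the xattr that file caps are stored in (getcap is essentially a getxattr of security.capability), eg.:

getfattr -n security.capability a             # read the raw xattr; fails the same way
strace -e trace=getxattr,lgetxattr getcap a   # shows the failing syscall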

cheers,
robin

Comment by SC Admin (Inactive) [ 28/Jun/18 ]

Hi,

in case it wasn't clear, there's no overlayfs involved in the above reproducer at all - only Lustre. the node was booted into a server ramdisk image to do the testing.

the reproducer is super-simple, but please let me know if you want me to gather debug logs from eg. 7.4 kernel + 2.10.4 and 7.5 kernel + 2.10.4 anyway. not hard for me to do.

cheers,
robin

Comment by John Hammond [ 28/Jun/18 ]

Hi Robin,

OK, thank you for your reproducer. It's reproducing the issue for me as well. There appear to be a few bugs here. I have a fix for one of them at https://review.whamcloud.com/32739. I believe this change will give you a workaround for the file caps issue. I am testing it locally now as well as looking at fixes for the other bugs.

Comment by SC Admin (Inactive) [ 29/Jun/18 ]

Hi John,

yeah, that seems to work for g++ with the 862.3.3 kernel. thanks, nicely done.

I'll roll it out onto a few nodes, keep an eye on them, and see if it's also fixed the sporadic 'file caps' failures we were seeing.

cheers,
robin

Comment by SC Admin (Inactive) [ 30/Jun/18 ]

Hi John,

after booting a few nodes into this, I'm still seeing the occasional 'file caps' failure, so yeah, you're right - there are more bugs in this area somewhere.

cheers,
robin

Comment by John Hammond [ 02/Jul/18 ]

Yes, I believe that LU-11107 is the real issue. https://review.whamcloud.com/32739 should just reduce your chances of hitting it.

Comment by Gerrit Updater [ 18/Jul/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32739/
Subject: LU-11074 mdc: set correct body eadatasize for getxattr()
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dea1cde92014545d97406bf8adba20840abdb1a9

Comment by Peter Jones [ 18/Jul/18 ]

Landed for 2.12

Comment by SC Admin (Inactive) [ 30/Jul/18 ]

just to follow up, this and LU-11107 have fixed the issue for us.
thanks!

cheers,
robin

Comment by Gerrit Updater [ 30/Jul/18 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/32901
Subject: LU-11074 mdc: set correct body eadatasize for getxattr()
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: c8bf7d0fb95618a06a493228707cd1e830da78f8

Comment by Gerrit Updater [ 03/Aug/18 ]

John L. Hammond (jhammond@whamcloud.com) merged in patch https://review.whamcloud.com/32901/
Subject: LU-11074 mdc: set correct body eadatasize for getxattr()
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: f99f9345e46b5b19a8dca2aae4d348c99d8e2481
