[LU-11594] sanity test_103a: FAIL: permissions failed Created: 01/Nov/18  Updated: 26/Aug/19  Resolved: 26/Aug/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Upstream, Lustre 2.12.0, Lustre 2.12.1, Lustre 2.12.2, Lustre 2.12.3
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Major
Reporter: Jian Yu Assignee: James A Simmons
Resolution: Fixed Votes: 1
Labels: arm, sles12, ubuntu16
Environment:

Lustre Build: https://build.whamcloud.com/job/lustre-master/3811
Distro/Arch: RHEL7.5/aarch64 (client), RHEL7.5/x86_64 (server)


Issue Links:
Related
is related to LU-10334 Ubuntu1604 client sanity-103a: FAIL: ... Resolved
is related to LU-12269 Support RHEL 8.0 Resolved
is related to LU-12657 sanity/103 and sanityn/25 fail with 4... Resolved
is related to LU-12511 Prepare lustre for adoption into the ... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

sanity test 103a failed as follows:

performing permissions...
[12] $ id -u -- ok
[19] $ mkdir d -- ok
[20] $ cd d -- ok
[21] $ umask 027 -- ok
[22] $ touch f -- ok
[23] $ ls -l f | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
[30] $ echo root > f -- ok
[32] $ su daemon -- ok
[33] $ echo daemon >> f -- ok
[36] $ su -- ok
[42] $ chown bin:bin f -- ok
[43] $ ls -l f | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
[45] $ su bin -- ok
[46] $ echo bin >> f -- ok
[52] $ su daemon -- ok
[53] $ cat f -- ok
[57] $ echo daemon >> f -- ok
[64] $ su bin -- ok
[65] $ setfacl -m u:daemon:rw f -- ok
[66] $ getfacl --omit-header f -- ok
[77] $ su daemon -- ok
[78] $ echo daemon >> f -- ok
[79] $ cat f -- ok
[88] $ su bin -- ok
[89] $ chmod g-w f -- ok
[90] $ getfacl --omit-header f -- ok
[98] $ su daemon -- ok
[99] $ echo daemon >> f -- ok
[108] $ su bin -- ok
[109] $ setfacl -m u:daemon:r,g:daemon:rw-,o::rw- f -- ok
[111] $ su daemon -- ok
[112] $ echo daemon >> f -- ok
[119] $ su bin -- ok
[120] $ setfacl -x u:daemon f -- ok
[122] $ su daemon -- ok
[123] $ echo daemon2 >> f -- ok
[124] $ cat f -- ok
[134] $ su bin -- ok
[135] $ setfacl -m g:daemon:r f -- ok
[137] $ su daemon -- ok
[138] $ echo daemon3 >> f -- ok
[145] $ su bin -- ok
[146] $ setfacl -x g:daemon f -- ok
[148] $ su daemon -- ok
[149] $ echo daemon4 >> f -- ok
[156] $ su -- ok
[157] $ chgrp root f -- ok
[159] $ su daemon -- ok
[160] $ echo daemon5 >> f -- ok
[161] $ cat f -- ok
[172] $ su -- ok
[173] $ setfacl -m g:bin:r,g:daemon:w f -- ok
[175] $ su daemon -- ok
[176] $ : < f -- ok
[177] $ : > f -- ok
[178] $ : <> f -- ok
[186] $ su -- ok
[187] $ mkdir -m 750 e -- ok
[188] $ touch e/h -- ok
[190] $ su bin -- ok
[191] $ shopt -s nullglob ; echo e/* -- ok
[194] $ echo i > e/i -- ok
[197] $ su -- ok
[198] $ setfacl -m u:bin:rx e -- ok
[200] $ su bin -- ok
[201] $ echo e/* -- failed
e/h                                   ? e/*                                    
[208] $ touch e/i 2>&1 | sed -e "s/touch .*e\/i.*:/touch \'e\/i\':/" -- ok
[211] $ su -- ok
[212] $ setfacl -m u:bin:rwx e -- ok
[214] $ su bin -- ok
[215] $ echo i > e/i -- failed
~                                     ? e/i: Permission denied                 
[220] $ su -- ok
[221] $ touch g -- ok
[222] $ ln -s g l -- ok
[223] $ setfacl -m u:bin:rw l -- ok
[224] $ ls -l g | awk -- '{ print $1, $3, $4 }' -- ok
[234] $ mknod -m 0660 hdt b 91 64 -- ok
[235] $ mknod -m 0660 null c 1 3 -- ok
[236] $ mkfifo -m 0660 fifo -- ok
[238] $ su bin -- ok
[239] $ : < hdt -- ok
[241] $ : < null -- ok
[243] $ : < fifo -- ok
[246] $ su -- ok
[247] $ setfacl -m u:bin:rw hdt null fifo -- ok
[249] $ su bin -- ok
[250] $ : < hdt -- failed
hdt: No such device or address        ? hdt: Permission denied                 
[252] $ : < null -- failed
~                                     ? null: Permission denied                
[253] $ ( echo blah > fifo & ) ; cat fifo -- failed
blah                                  ? fifo: Permission denied                
~                                     ? cat: fifo: Permission denied           
[261] $ su -- ok
[262] $ mkdir -m 600 x -- ok
[263] $ chown daemon:daemon x -- ok
[264] $ echo j > x/j -- ok
[265] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
[268] $ setfacl -m u:daemon:r x -- ok
[270] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
[274] $ echo k > x/k -- ok
[277] $ chmod 750 x -- ok
[282] $ su -- ok
[283] $ cd .. -- ok
[284] $ rm -rf d -- ok
101 commands (96 passed, 5 failed)
 sanity test_103a: @@@@@@ FAIL: permissions failed 

Maloo report: https://testing.whamcloud.com/test_sets/88bbf5c2-d9d0-11e8-b46b-52540065bddc
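
To pick the five failures out of a long log like the one above, a small filter can help. This is a minimal sketch, not part of the Lustre test suite: the function name failed_commands and the log file name are hypothetical, and it assumes each command line has the form "[N] $ cmd -- ok|failed" and that each mismatch line after a failed command contains a "?" separator (apparently expected output on the left, actual output on the right).

```shell
#!/bin/sh
# Hypothetical helper: print each failed command from a permissions test log
# read on stdin, followed by the mismatch lines printed right after it.
failed_commands() {
    awk '
        # A command line that failed: print it and start showing mismatch lines.
        /^\[[0-9]+\] \$ .* -- failed$/ { print; show = 1; next }
        # Any other command line ends the current mismatch block.
        /^\[[0-9]+\] \$ /              { show = 0; next }
        # Mismatch lines carry a "?" between the two columns.
        show && /\?/                   { print "    " $0 }
    '
}
```

Usage, assuming the log was saved to a file named test_103a.log (an illustrative name): failed_commands < test_103a.log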



 Comments   
Comment by Jian Yu [ 01/Nov/18 ]

sanityn test 25a failed with a similar issue:

== sanityn test 25a: change ACL on one mountpoint be seen on another ================================= 06:11:56 (1540534316)
running as uid/gid/euid/egid 500/500/500/500, groups:
 [checkstat] [-v] [/mnt/lustre2/d25a.sanityn/f1]
running as uid/gid/euid/egid 500/500/500/500, groups:
 [checkstat] [-v] [/mnt/lustre2/d25a.sanityn/f1]
 sanityn test_25a: @@@@@@ FAIL: checkstat /mnt/lustre2/d25a.sanityn/f1 #2 

Maloo report: https://testing.whamcloud.com/test_sets/8b2f4282-d9d0-11e8-b46b-52540065bddc

Comment by Peter Jones [ 01/Nov/18 ]

John

Could you please investigate?

Thanks

Peter

Comment by James A Simmons [ 13/Nov/18 ]

This same bug exists for the Linux Lustre client as well as on Ubuntu18.

Comment by Gerrit Updater [ 14/Nov/18 ]

Jian Yu (yujian@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33654
Subject: LU-11594 tests: disable sanity test 103a for ARM
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 92a287c5fc315c4f8b16bf15dd4c6141838e0032

Comment by James Nunez (Inactive) [ 15/Dec/18 ]

There's a comment that sanityn test 25a is suffering from the same problem. Should we disable sanityn tests 25a and 25b also?

https://testing.whamcloud.com/test_sets/f1121dd4-fdef-11e8-b837-52540065bddc

Comment by Jian Yu [ 11/Feb/19 ]

The same failure occurred on SLES12 SP4 (kernel version 4.12.14-95.6.1) client:
https://testing.whamcloud.com/test_sets/853ddbdc-2cb8-11e9-a994-52540065bddc (sanity test 103a)
https://testing.whamcloud.com/test_sets/8c01d450-2cb8-11e9-a994-52540065bddc (sanityn test 25a and 25b)

Comment by Sebastien Buisson [ 07/May/19 ]

FYI, Maloo may refer to this ticket when sanity test_103a fails with SELinux enabled on the client. The SELinux case is a different problem, caused by an issue with the test itself, which I am going to investigate under LU-12267.

Comment by James A Simmons [ 07/May/19 ]

Ubuntu16 also sees an error with this test, but for a slightly different reason.

Comment by Chris Horn [ 13/May/19 ]

I'm hitting this with SLES15 client/server.

Comment by James A Simmons [ 16/May/19 ]

Peter, I can take this ticket since it impacts any newer kernel. Also, I think the LU-12267 patch will fix this.

Comment by Chris Horn [ 16/May/19 ]

James, FYI, I just tried the LU-12267 patch and test_103a still fails for me.

Comment by Jian Yu [ 06/Jun/19 ]

The same failure occurred on RHEL 8.0 (kernel version 4.18.0-80.el8.x86_64) client:
https://testing.whamcloud.com/test_sets/6a330bd4-8837-11e9-be83-52540065bddc

Comment by Peter Jones [ 09/Aug/19 ]

@James is this still something that you intend to work on?

Comment by James A Simmons [ 09/Aug/19 ]

Yes. Sorry, I have been busy with other things. I did look at this before, and it might be a big change: most file systems handle ACLs through the kernel's default ACL handler, which wraps set_acl() and get_acl(). The default ACL handler in newer kernels most likely handles all the permission issues for us. I need to look into it in more detail.

Comment by James A Simmons [ 11/Aug/19 ]

LU-12657 has a potential fix.

Comment by Gerrit Updater [ 21/Aug/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34819/
Subject: LU-11594 test: re-enable sanity test 103a for ARM
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 12d7b7af5e397368c32bc4b82609e37afa0c0a26

Comment by Andreas Dilger [ 26/Aug/19 ]

We shouldn't be resolving issues with the always_except label still on them. As in this case, that often means that the subtest was not removed from the ALWAYS_EXCEPT list in the test.
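
A check along these lines could catch a subtest that is still excluded before its ticket is resolved. This is a minimal sketch under stated assumptions: the list contents below are made up for illustration, the helper name is_excepted is hypothetical, and the real ALWAYS_EXCEPT list in the Lustre test scripts is not shown in this ticket.

```shell
#!/bin/sh
# Hypothetical stand-in for a test script's exclusion list; the values are
# illustrative only and do not reflect the real Lustre list.
ALWAYS_EXCEPT="42a 103a 407"

# Return 0 if the given subtest name is still present in ALWAYS_EXCEPT.
is_excepted() {
    for t in $ALWAYS_EXCEPT; do
        [ "$t" = "$1" ] && return 0
    done
    return 1
}

if is_excepted 103a; then
    echo "103a is still excluded; remove it before resolving the ticket"
fi
```

The comparison is an exact match on whitespace-separated entries, so 103 does not match 103a.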
