[LU-5641] sanity.sh test_103a "acl test" fails on el7 Created: 18/Sep/14  Updated: 07/Dec/21  Resolved: 12/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Critical
Reporter: Bob Glossman (Inactive) Assignee: Bob Glossman (Inactive)
Resolution: Fixed Votes: 0
Labels: MB
Environment:

el7


Issue Links:
Related
is related to LU-5689 need correct group definitions in el7... Closed
is related to LU-15259 SLES15.2 sanity test_103a test_125 te... Resolved
Severity: 3
Rank (Obsolete): 15802

 Description   

Test 103a fails when the client node is running el7. Some of the permissions subtests of 103a assume the user 'daemon' is a member of the group 'bin'. In a default install of el7 this isn't true. If I manually add 'daemon' to the group 'bin' by editing the /etc/group file on the client node(s) used for the test, then test 103a passes 100%.

I think it may be a TEI issue to ensure that the expected, assumed user/group setup is done on el7 test installs.



 Comments   
Comment by Andreas Dilger [ 19/Sep/14 ]

It would be nice to have an actual error message in the bug to search for.

Comment by Bob Glossman (Inactive) [ 19/Sep/14 ]

errors seen:

  .
  .
  .
performing permissions...
[12] $ id -u -- ok
  .
  .
  .
[52] $ su daemon -- ok
[53] $ cat f -- failed
root                                  ? cat: f: Permission denied              
bin                                   ? ~                                      
[57] $ echo daemon >> f -- ok
  .
  .
  .
[173] $ setfacl -m g:bin:r,g:daemon:w f -- ok
[175] $ su daemon -- ok
[176] $ : < f -- failed
~                                     ? f: Permission denied                   
[177] $ : > f -- ok
[178] $ : <> f -- ok
  .
  .
  .
[283] $ cd .. -- ok
[284] $ rm -rf d -- ok
101 commands (99 passed, 2 failed)
 sanity test_103a: @@@@@@ FAIL: permissions failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4571:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:4602:error()
  = /usr/lib64/lustre/tests/sanity.sh:6874:test_103a()
  = /usr/lib64/lustre/tests/test-framework.sh:4849:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:4884:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4703:run_test()
  = /usr/lib64/lustre/tests/sanity.sh:6910:main()
Dumping lctl log to /tmp/test_logs/2014-09-19/111136/sanity.test_103a.*.1411150356.log
FAIL 103a (14s)
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
== sanity test complete, duration 21 sec ============================================================= 11:12:38 (1411150358)
sanity: FAIL: test_103a permissions failed
Comment by Andreas Dilger [ 19/Sep/14 ]

The problem is that lustre/tests/acl/{misc,permission}.test expect that daemon is in the bin group, and that is not easily changed in the test itself (the requirements are described at the top of these files). On my RHEL6 /etc/group the system accounts look like:

bin:x:1:root,bin,daemon
daemon:x:2:root,bin,daemon
sys:x:3:root,bin,adm
adm:x:4:root,adm,daemon

It would be useful to see what the /etc/group file looks like for RHEL7, so that it might be possible to just replace "bin" and/or "daemon" in the expect output and it will continue to work for both RHEL6 and RHEL7 and SLES11.
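For comparing the relevant entries across distros, a minimal sketch of a lookup helper follows. The function name group_members is hypothetical and is not part of the Lustre test framework. It reads only the explicit member list (field 4) of a group(5)-format file, so it will not report a user whose membership comes solely from the primary GID in /etc/passwd.

```shell
#!/bin/sh
# group_members GROUP [FILE]
# Hypothetical helper: print the comma-separated explicit member list of
# GROUP from a group(5)-format file (defaults to /etc/group).
# Prints nothing if the group is not found or has no listed members.
group_members() {
    awk -F: -v g="$1" '$1 == g { print $4 }' "${2:-/etc/group}"
}
```

Running `group_members bin` and `group_members daemon` on an el6 node and an el7 node would show whether 'daemon' is listed under 'bin' on each.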

Probably the best option is to download the "acl" source RPM from RHEL7 to get an updated copy of the scripts and expect output from the "test" directory. It might be necessary to put this into a separate subdirectory (e.g. lustre/tests/acl-rhel7) but would probably improve our test coverage.

Comment by Bob Glossman (Inactive) [ 19/Sep/14 ]

Yes, I said that was the problem in my initial description. By default in el7 /etc/group, user 'daemon' is not a member of group 'bin'.

There's another option. The command 'gpasswd -a daemon bin' can be used to add user 'daemon' to group 'bin' sometime before running the test. Doing so would make the environment conform to the expectations of the test. This command is available everywhere, in all distros and versions that I have polled. It is safe to use, as it appears to be a no-op when 'daemon' is already a member of the group. The only problem I see is who does this and on what nodes. I had suggested something like this be done during provisioning of el7 nodes in the TEI framework. It seems comparable to operations they already do, like adding extra user & group entries into our test environment.
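The gpasswd approach could be wrapped so that it only modifies the node when needed. This is a minimal sketch, not the change in the proposed patch. The function names user_in_group and ensure_user_in_group are hypothetical, and the gpasswd call is assumed to run as root.

```shell
#!/bin/sh
# user_in_group USER GROUP
# Hypothetical helper: succeed if GROUP is among the groups reported
# for USER by 'id -nG' (primary and supplementary groups).
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# ensure_user_in_group USER GROUP
# Hypothetical helper: add USER to GROUP with 'gpasswd -a' only when
# USER is not already a member. The gpasswd call needs root privileges.
ensure_user_in_group() {
    user_in_group "$1" "$2" || gpasswd -a "$1" "$2"
}
```

Calling `ensure_user_in_group daemon bin` as root before the test run would leave nodes that already satisfy the assumption untouched.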

Comment by Bob Glossman (Inactive) [ 24/Sep/14 ]

proposed fix:
http://review.whamcloud.com/12044

Comment by Andreas Dilger [ 05/Nov/14 ]

It looks like sanity.sh test_103a passed for RHEL 7 according to:
https://testing.hpdd.intel.com/test_sets/bc4bc48a-4f69-11e4-9892-5254006e85c2

Comment by Bob Glossman (Inactive) [ 18/Nov/14 ]

Need to reopen this to do more fixes for el7 servers. The previous fix for this problem only fixed it for el7 clients.

Comment by Gerrit Updater [ 18/Nov/14 ]

Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/12762
Subject: LU-5641 tests: ensure user daemon is in group bin on mds
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bfcc1c2f167427688f645c1a26f0abe2d036a5eb

Comment by Bob Glossman (Inactive) [ 18/Nov/14 ]

additional fix for el7 servers:
http://review.whamcloud.com/12762

Comment by Gerrit Updater [ 23/Nov/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12762/
Subject: LU-5641 tests: ensure user daemon is in group bin on mds
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 31832fab6d6da74e627133139b0c59c1dc07e62c

Comment by Jodi Levi (Inactive) [ 12/Jan/15 ]

Patch landed to Master.
