[LU-14294] parallel-scale-nfsv4 fails to start with “setup nfs failed!” for RHEL8.3 Created: 05/Jan/21 Updated: 02/Aug/23 Resolved: 19/May/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.15.1, Lustre 2.15.3 |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | Alex Deiter |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | rhel8, rhel8.3 |
| Environment: | RHEL8.3 server |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The parallel-scale-nfsv4 test suite fails during NFS setup, so no tests are run. We are seeing this on RHEL8.3 servers. A recent failure at https://testing.whamcloud.com/test_sets/d76032dc-6074-406f-824c-a7f3676496cb shows:
CMD: trevis-202vm4 { [[ -e /etc/SuSE-release ]] &&
service nfsserver restart; } ||
service nfs restart ||
service nfs-server restart
trevis-202vm4: Redirecting to /bin/systemctl restart nfs.service
trevis-202vm4: Failed to restart nfs.service: Unit nfs.service not found.
trevis-202vm4: Redirecting to /bin/systemctl restart nfs-server.service
trevis-202vm4: Job for nfs-server.service canceled.
pdsh@trevis-202vm1: trevis-202vm4: ssh exited with exit code 1
parallel-scale-nfsv4 : @@@@@@ FAIL: setup nfs failed!
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6273:error()
= /usr/lib64/lustre/tests/parallel-scale-nfs.sh:68:main()
So far, every time we have seen this failure, node-provisioning/lustre-initialization took place right before parallel-scale-nfsv4 was run.
Logs for failures |
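The restart command in the log is a fallback chain: on SuSE hosts it tries nfsserver, and otherwise it tries nfs and then nfs-server. As a rough sketch only, the same selection can be made explicit by checking which unit actually exists before restarting anything. The helper name `pick_nfs_unit` and the unit list passed to it are hypothetical and are not part of test-framework.sh.

```shell
#!/bin/bash
# Hypothetical helper, not taken from test-framework.sh.
# Prints the first candidate NFS service name whose unit appears in the
# list of available unit names given as arguments; fails if none match.
pick_nfs_unit() {
    local candidate unit
    # Same order as the fallback chain in the log above.
    for candidate in nfsserver nfs nfs-server; do
        for unit in "$@"; do
            if [[ "$unit" == "$candidate.service" ]]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Illustrative unit list: nfs.service is absent, nfs-server.service exists,
# as on the RHEL8.3 host in the log.
pick_nfs_unit rpcbind.service nfs-server.service
```

On a systemd host the argument list could come from the first column of `systemctl list-unit-files --type=service --no-legend`. Note that in the failing run the chain did reach nfs-server.service; the restart job itself was canceled, so choosing the right unit name would not by itself avoid the failure.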
| Comments |
| Comment by Andreas Dilger [ 05/Jan/21 ] |
|
Have the NFS tools RPMs been installed? |
| Comment by James Nunez (Inactive) [ 05/Jan/21 ] |
|
parallel-scale-nfsv3 runs before parallel-scale-nfsv4. In fact, parallel-scale-nfsv3 runs and hangs, which causes the cluster to run node-provisioning/lustre-initialization. parallel-scale-nfsv3 does start the NFS servers. In the suite_log for parallel-scale-nfsv3, at https://testing.whamcloud.com/test_sets/bc5183ad-2cad-4b97-aba4-604b73b9765f, the NFS server starts:
CMD: trevis-202vm4 { [[ -e /etc/SuSE-release ]] &&
service nfsserver restart; } ||
service nfs restart ||
service nfs-server restart
trevis-202vm4: Redirecting to /bin/systemctl restart nfs.service
trevis-202vm4: Failed to restart nfs.service: Unit nfs.service not found.
trevis-202vm4: Redirecting to /bin/systemctl restart nfs-server.service
CMD: trevis-202vm1.trevis.whamcloud.com,trevis-202vm2 chkconfig --list rpcidmapd 2>/dev/null |
grep -q rpcidmapd && service rpcidmapd restart ||
true
Mounting NFS clients (version 3)...
Looking at the MDS (vm4) console log, we see acknowledgment from NFSD before parallel-scale-nfsv3 starts running tests:
[64667.180020] Lustre: DEBUG MARKER: { [[ -e /etc/SuSE-release ]] &&
[64667.180020] service nfsserver restart; } ||
[64667.180020] service nfs restart ||
[64667.180020] service nfs-server restart
[64667.719483] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[64668.019298] NFSD: Using nfsdcld client tracking operations.
[64668.020325] NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000098)
[64671.281631] Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Before parallel-scale-nfsv4 starts, we don't see the same acknowledgment from NFSD:
[ 344.752738] Lustre: DEBUG MARKER: { [[ -e /etc/SuSE-release ]] &&
[ 344.752738] service nfsserver restart; } ||
[ 344.752738] service nfs restart ||
[ 344.752738] service nfs-server restart
[ 345.178077] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 345.638306] Lustre: DEBUG MARKER: /usr/sbin/lctl mark parallel-scale-nfsv4 : @@@@@@ FAIL: setup nfs failed!
[ 346.014296] Lustre: DEBUG MARKER: parallel-scale-nfsv4 : @@@@@@ FAIL: setup nfs failed!
So, the NFS RPMs were installed on the servers. |
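The comparison of the two console logs above can be expressed as a small check: after "Installing knfsd", an "NFSD:" line should appear before the next DEBUG MARKER. The following is a minimal sketch of that check, assuming the console log is available as text on stdin; the function name `nfsd_acknowledged` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: reads a console log on stdin and succeeds only if an
# "NFSD:" line follows "Installing knfsd" before the next DEBUG MARKER line.
nfsd_acknowledged() {
    awk '
        /Installing knfsd/        { loading = 1; next }
        loading && /NFSD:/        { found = 1; exit }
        loading && /DEBUG MARKER/ { exit }
        END                       { exit(found ? 0 : 1) }
    '
}

# Excerpt of the parallel-scale-nfsv4 console log quoted above.
log='[  345.178077] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[  345.638306] Lustre: DEBUG MARKER: /usr/sbin/lctl mark parallel-scale-nfsv4 : @@@@@@ FAIL: setup nfs failed!'

if nfsd_acknowledged <<<"$log"; then
    echo "NFSD acknowledged startup"
else
    echo "no NFSD acknowledgment after knfsd was loaded"
fi
```

Run against the parallel-scale-nfsv3 excerpt instead, the same function succeeds, because the "NFSD: Using nfsdcld client tracking operations." line follows the knfsd message.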
| Comment by Sarah Liu [ 23/Mar/22 ] |
|
+1: also seen in interop testing between master (el8.5) and a 2.12 client (el7.9), during nfsv3 testing |
| Comment by Gerrit Updater [ 07/Nov/22 ] |
|
"Alex Deiter <alex.deiter@gmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49062 |
| Comment by Gerrit Updater [ 19/May/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49062/ |
| Comment by Peter Jones [ 19/May/23 ] |
|
Landed for 2.16 |
| Comment by Gerrit Updater [ 12/Jun/23 ] |
|
"Alex Deiter <alex.deiter@gmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51283 |
| Comment by Gerrit Updater [ 02/Aug/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51283/ |