[LU-16307] sanity-sec: test_31: export for 10.240.26.216@tcp on MGS should not exist Created: 10/Nov/22  Updated: 29/Jan/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: James A Simmons
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10003 lnetctl error "cannot add network: in... Reopened
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for James Simmons <uja.ornl@gmail.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/9226f4b3-5eba-49de-b55c-934d02bd243f

sanity-sec test_31 failed with:

CMD: onyx-82vm1.onyx.whamcloud.com mkdir -p /mnt/lustre
CMD: onyx-82vm1.onyx.whamcloud.com mount -t lustre -o user_xattr,flock,network=tcp999 10.240.26.219@tcp999:/lustre /mnt/lustre
mount.lustre: mount 10.240.26.219@tcp999:/lustre at /mnt/lustre failed: Invalid argument
This may have multiple causes.
Is 'lustre' the correct filesystem name?
Are the mount options correct?
Check the syslog for more info.
missing mandatory parameters in NI config: 'interface'
Starting client: onyx-82vm1.onyx.whamcloud.com:  -o user_xattr,flock,network=tcp999 10.240.26.219@tcp999:/lustre /mnt/lustre
CMD: onyx-82vm1.onyx.whamcloud.com mkdir -p /mnt/lustre
CMD: onyx-82vm1.onyx.whamcloud.com mount -t lustre -o user_xattr,flock,network=tcp999 10.240.26.219@tcp999:/lustre /mnt/lustre
CMD: onyx-82vm4 lctl get_param -n *.MGS*.exports.'10.240.26.216@tcp'.uuid 2>/dev/null | grep -q -
 sanity-sec test_31: @@@@@@ FAIL: export for 10.240.26.216@tcp on MGS should not exist 
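
The last command in the log above asks the MGS whether an export still exists for the client's original NID. As far as can be read from the command line, `lctl get_param -n` prints the export's UUID when the export is present, and `grep -q -` succeeds when that output contains a dash, as a UUID would. A minimal sketch of that logic follows. `export_uuid_present` is a hypothetical helper name (not part of sanity-sec.sh), and it takes the `get_param` output as an argument so it can be tried without a Lustre node.

```shell
# Hypothetical helper, not part of sanity-sec.sh.
# $1: text printed by
#     lctl get_param -n *.MGS*.exports.'<nid>'.uuid 2>/dev/null
# Returns 0 if the text contains a dash (a UUID was presumably printed),
# non-zero if the text is empty or contains no dash.
export_uuid_present() {
    printf '%s\n' "$1" | grep -q -e '-'
}

# test_31 expects the export to be gone, so a match means failure.
# The UUID below is an illustrative placeholder, not taken from the log.
if export_uuid_present "00000000-0000-0000-0000-000000000000"; then
    echo "export still exists"
fi
```

With empty input, which is what the redirected `2>/dev/null` leaves when the parameter does not exist, the helper returns non-zero and the test condition would be satisfied.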

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-sec test_31 - export for



 Comments   
Comment by Andreas Dilger [ 10/Nov/22 ]

James, when using the "[Raise bug...]" option in Maloo, please do so on the subtest (i.e. under test_31) so that Maloo can generate a proper matching string for the failure and include additional information in the ticket (e.g. the error messages). The current bug summary, "sanity-sec", isn't very useful as it stands. It is also important to have a good bug summary in the initial submission, since Maloo caches the initial summary line and does not update it if it is changed later.

Comment by Andreas Dilger [ 10/Nov/22 ]

It looks like this failure started on 2022-11-08. There were a number of patches landed on that date:

ed71af843d LU-16280 tests: conf-sanity/117 call setup when FS is not mounted
9a0a89520e LU-16251 obdclass: fill jobid in a safe way
b104c0a277 LU-14377 tests: make parallel-scale/rr_alloc less strict
38a52f2fc3 LU-16243 tests: Specify ping source in test_218
d55fe25a95 LU-16215 kfilnd: Use immediate for routed GETs
78c681d9f4 LU-16207 build: add rpm-build BuildRequires for SLES15 SP3
95f7ef6094 LU-16111 build: Fix include of stddef.h
0171801df5 LU-8151 obd: Show correct shadow mountpoints for server
26e765e0d2 LU-15011 tests: additional checks for pool spilling
41610e6207 LU-8837 mgc: move server-only code out of mgc_request.c
f32bdde057 LU-8837 lustre: make ldlm and target file lists
ef90a02d12 LU-13135 quota: improve checks in OSDs to ignore quota
896cd5b7bc LU-10391 lnet: fix build issue when IPv6 is disabled.
122644ae19 LU-15847 target: report multiple transno to debug log
cf121b1668 LU-16203 llog: skip bad records in llog
53b3d1de99 LU-16152 utils: fix integer overflow in cYAML parser
db0fb8f2b6 LU-10391 lnet: allow ping packet to contain large nids
5243630b09 LU-15947 obdclass: improve precision of wakeups for mod_rpcs
8f8f6e2f36 LU-10003 lnet: use Netlink to support old and new NI APIs.
85941b9fb9 LU-15117 ofd: no lock for dt_bufs_get() in read path

The patch that seems the most likely to cause this problem is:

8f8f6e2f36 LU-10003 lnet: use Netlink to support old and new NI APIs.

since it also failed in this same way once on 2022-10-11 before it landed.
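
Because the mount error complains about the NI config, one quick way to shortlist suspects from a list like the one above is to filter the landed subjects by component. This is only a sketch: it assumes the one-line format `<hash> <ticket> <component>: <subject>` used in the list, and `filter_component` is a hypothetical name. It does not replace the evidence cited above (the same failure on 2022-10-11 before landing).

```shell
# Hypothetical helper: print only the commits whose component field
# (third whitespace-separated field, including its trailing colon)
# equals "$1:". Reads `git log --oneline`-style lines on stdin.
filter_component() {
    awk -v comp="$1:" '$3 == comp'
}

# A few subjects copied from the list above; only the lnet: ones are printed.
filter_component lnet <<'EOF'
d55fe25a95 LU-16215 kfilnd: Use immediate for routed GETs
896cd5b7bc LU-10391 lnet: fix build issue when IPv6 is disabled.
db0fb8f2b6 LU-10391 lnet: allow ping packet to contain large nids
8f8f6e2f36 LU-10003 lnet: use Netlink to support old and new NI APIs.
EOF
```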

Comment by Andreas Dilger [ 10/Nov/22 ]

James, it looks like this failure relates to your netlink patch, per my previous comment.

Comment by Andreas Dilger [ 10/Nov/22 ]

It also looks like this is causing a 100% test failure for any patches including this change. The few passes since 2022-11-08 have been based on commit 293844d132 "LU-16222 kernel: RHEL 8.7 client and server support", or an earlier parent.

As such, I'm going to push a revert patch so that it can start testing, but you are free to work on a fix in parallel if you have an idea of how to fix it.

Comment by Gerrit Updater [ 10/Nov/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49111
Subject: LU-16307: Revert "LU-10003 lnet: use Netlink to support old and new NI APIs."
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 15f994bbdc7e4068cd752772bc40f5d426cfadab

Comment by James A Simmons [ 10/Nov/22 ]

Why didn't this show up before? Let me see if I can reproduce it and get a fix out.

Comment by Gerrit Updater [ 10/Nov/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49127
Subject: LU-16307 tests: run sanity-sec test_31 in sanity-lnet
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3232417e9a8a63676be8a1dc0e04f34346b3b24d

Comment by Andreas Dilger [ 11/Nov/22 ]

"Why didn't this show up before? Let me see if I can reproduce it and get a fix out."

Because the test failure was in sanity-sec.sh test_31, but "Test-Parameters: trivial" does not run sanity-sec. This subtest is doing funky things with lnetctl, so it really should be run whenever LNet changes are made. I've pushed patch 49127 so that this one subtest is always run when sanity-lnet.sh is run.
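
As an illustration of the gap described above, a submitter could check whether a commit message asks for a given suite before pushing. The sketch below is hypothetical and not part of the Lustre tree: `requests_suite` is an invented name, and it assumes that extra suites are requested with a `testlist=` field on a `Test-Parameters:` line, which is an assumption about the commit message convention rather than something stated in this ticket.

```shell
# Hypothetical pre-push check, not part of the Lustre tree.
# $1: commit message text
# $2: test suite name, e.g. sanity-sec
# Returns 0 if a Test-Parameters line has a testlist= field naming the suite
# (alone or inside a comma-separated list), non-zero otherwise.
requests_suite() {
    printf '%s\n' "$1" |
        grep '^Test-Parameters:' |
        grep -Eq "testlist=([^ ]*,)?$2(,| |\$)"
}

# Usage with an illustrative commit message.
msg='LU-00000 lnet: example subject

Test-Parameters: trivial'
if ! requests_suite "$msg" sanity-sec; then
    echo "sanity-sec is not requested by this commit message"
fi
```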

Comment by James A Simmons [ 11/Nov/22 ]

Also, I found that this test is not run when testing from the source tree or when everything runs on the local node. This explains why Oleg never saw an issue.

I found the source of the bug. I was a bit too aggressive in testing conditions in lnetctl. Testing the fix now.

Comment by Andreas Dilger [ 11/Nov/22 ]

It looks like the lack of local testing is because "lnet/utils" is not added to the PATH when run from the local tree. That should also be fixed in your patch, so this test is run more regularly.
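
A sketch of the kind of PATH fix suggested above, under stated assumptions: `prepend_path` is a hypothetical helper, and the snippet assumes that `LUSTRE` points at the lustre/ directory of a build tree, so that lnetctl would be found in `$LUSTRE/../lnet/utils`. The actual change in the test framework may look different.

```shell
# Hypothetical helper: prepend directory $1 to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$1:$PATH" ;;      # otherwise put it in front
    esac
}

# Assumption: LUSTRE is the lustre/ directory of the build tree.
LUSTRE=${LUSTRE:-$PWD/lustre}
prepend_path "$LUSTRE/../lnet/utils"
export PATH
```

Guarding against duplicates keeps PATH from growing when the setup code is sourced more than once in the same shell.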

Comment by Gerrit Updater [ 11/Nov/22 ]

"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49129
Subject: LU-16307 util: fix lnetctl bugs that break sanity-sec
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6b68cfd683e9c067c3ee6f395be54ff663952246

Comment by Gerrit Updater [ 11/Nov/22 ]

"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49143
Subject: LU-16307 test: expand sanity-lnet 301 test
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7394af8e41f95fa2de17f9d55b9b450477301907

Comment by Gerrit Updater [ 15/Nov/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49129/
Subject: LU-16307 util: fix lnetctl bugs that break sanity-sec
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5055381d9e766c0353d3881119a300b6fa60c10b

Comment by Gerrit Updater [ 16/Dec/22 ]

"Cyril Bordage <cbordage@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49425
Subject: LU-16307 tests: run sanity-sec test_31 in sanity-lnet
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a9877fe045daf923a9d7b56cc5589a9cc5409836

Comment by Serguei Smirnov [ 31/May/23 ]

+1 on master: https://testing.whamcloud.com/test_sets/d3ab6e4e-d9cc-4cfd-819d-8dc068bb9ad5

Comment by Gerrit Updater [ 25/Jan/24 ]

"Sebastien Buisson <sbuisson@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53818
Subject: LU-16307 tests: fix sanity-sec test_31
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f4a96799159fd662855542d471197ac4060d3295

Generated at Sat Feb 10 03:25:50 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.