[LU-10207] sanity-sec test_18: mgs and c0 admin_nodemap mismatch, 10 attempts
Created: 07/Nov/17  Updated: 24/Jan/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Casper
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

onyx, full
servers: sles12sp3, ldiskfs, branch master, v2.10.54.50, b3664
clients: sles12sp3, branch master, v2.10.54.50, b3664


Issue Links:
Related
is related to LU-10692 sanity-sec test_21: mgs and c1 admin_... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

session: https://testing.hpdd.intel.com/test_sessions/bc37e02d-7553-4aed-ad2c-43af808f0201
test set: https://testing.hpdd.intel.com/test_sets/0faad7aa-c164-11e7-88ab-52540065bddc

From test_log:

CMD: onyx-30vm10 /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
On onyx-30vm10 10.2.5.230, c0.admin_nodemap = nodemap.c0.admin_nodemap=1
CMD: onyx-30vm10 /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
On onyx-30vm10 10.2.5.230, c0.admin_nodemap = nodemap.c0.admin_nodemap=1
CMD: onyx-30vm10 /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
On onyx-30vm10 10.2.5.230, c0.admin_nodemap = nodemap.c0.admin_nodemap=1
MGS
nodemap.c0.admin_nodemap=0
OTHER - IP: 10.2.5.230
nodemap.c0.admin_nodemap=1
 sanity-sec test_18: @@@@@@ FAIL: mgs and c0 admin_nodemap mismatch, 10 attempts 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5289:error()
  = /usr/lib64/lustre/tests/sanity-sec.sh:948:wait_nm_sync()
  = /usr/lib64/lustre/tests/sanity-sec.sh:1024:fops_test_setup()
  = /usr/lib64/lustre/tests/sanity-sec.sh:1300:test_fops()
  = /usr/lib64/lustre/tests/sanity-sec.sh:1424:test_18()
  = /usr/lib64/lustre/tests/test-framework.sh:5565:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:5604:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:5451:run_test()
  = /usr/lib64/lustre/tests/sanity-sec.sh:1427:main()
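
For readers unfamiliar with the failure mode, the log above suggests a bounded poll: the value reported by the MGS is compared with the value reported by another node, and the test gives up after a fixed number of attempts. The following is only a minimal sketch of that pattern, not the actual wait_nm_sync() from sanity-sec.sh. The names get_value and wait_for_match are hypothetical, and get_value is a placeholder that returns the values seen in this log rather than querying a real node.

```shell
# Hypothetical stand-in for querying a node; a real harness would run
# something like `lctl get_param` on the target instead.
# $1 = node label, $2 = parameter name. Values mirror the log above.
get_value() {
    case "$1" in
        mgs) echo "nodemap.$2=0" ;;
        *)   echo "nodemap.$2=1" ;;
    esac
}

# wait_for_match PARAM NODE MAX_ATTEMPTS [DELAY_SECONDS]
# Returns 0 as soon as NODE reports the same value as the MGS,
# or 1 if they still differ after MAX_ATTEMPTS comparisons.
wait_for_match() {
    local param="$1"
    local node="$2"
    local max="$3"
    local delay="${4:-1}"
    local attempt=1
    while [ "$attempt" -le "$max" ]; do
        if [ "$(get_value mgs "$param")" = "$(get_value "$node" "$param")" ]; then
            return 0
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    echo "mgs and $node disagree on $param after $max attempts" >&2
    return 1
}
```

With the placeholder values, `wait_for_match c0.admin_nodemap onyx-30vm10 10` returns non-zero, which corresponds to the mismatch reported in this ticket (MGS at 0, the other node at 1).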


 Comments   
Comment by James Nunez (Inactive) [ 21/Jan/18 ]

This test started failing on 2017-09-28. So far it fails only on SLES12 SP2 and SLES12 SP3, and only on the following branches/builds:
b2_10 #62
b2_10 (patchless) #7, 8, 20, 26
master #3650, 3664, 3681, 3697

Comment by James Nunez (Inactive) [ 15/Mar/18 ]

We're seeing the same error in test 17 during SLES12 SP2 and SP3 testing. Since tests 17 and 18 are very similar, I'm attributing the sanity-sec test_17 failures to this ticket.

Here are some logs for the test 17 failure:

https://testing.hpdd.intel.com/test_sets/ca56bef4-2747-11e8-b3c6-52540065bddc

https://testing.hpdd.intel.com/test_sets/f4b0ed66-f30b-11e7-854b-52540065bddc

Generated at Sat Feb 10 02:33:00 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.