[LU-8398] conf-sanity test_32a: test_32a failed with 1
Created: 14/Jul/16 Updated: 28/Apr/20 Resolved: 28/Apr/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0, Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/4f228b3e-4975-11e6-9f8e-5254006e85c2.

The sub-test test_32a failed with the following error:

test_32a failed with 1

while this fail is in the same test as persistent mount opts:
Parameters: lov.stripecount=0 lov.stripesize=1048576 mdt.identity_upcall=/usr/sbin/l_getidentity sys.timeout=20
exiting before disk write.
IOC_LIBCFS_GET_NI error 22: Invalid argument

and

CMD: onyx-64 mount -t lustre -o exclude=t32fs-OST0000 t32fs-mdt1/mdt1 /tmp/t32/mnt/mdt
onyx-64: mount.lustre: mount t32fs-mdt1/mdt1 at /tmp/t32/mnt/mdt failed: No such file or directory
onyx-64: Is the MGS specification correct?
onyx-64: Is the filesystem name correct?
onyx-64: If upgrading, is the copied client log valid? (see upgrade docs)
CMD: onyx-64 losetup -a
conf-sanity test_32a: @@@@@@ FAIL: Mounting the MDT
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4713:error_noexit()
= /usr/lib64/lustre/tests/conf-sanity.sh:1730:t32_test()
= /usr/lib64/lustre/tests/conf-sanity.sh:2066:test_32a()
= /usr/lib64/lustre/tests/test-framework.sh:4991:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5028:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4893:run_test()
= /usr/lib64/lustre/tests/conf-sanity.sh:2070:main()

and

CMD: onyx-64 zpool destroy t32fs-ost1
conf-sanity test_32a: @@@@@@ FAIL: test_32a failed with 1
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4713:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4744:error()
= /usr/lib64/lustre/tests/test-framework.sh:4991:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5028:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4893:run_test()
= /usr/lib64/lustre/tests/conf-sanity.sh:2070:main()

Info required for matching: conf-sanity 32a |
| Comments |
| Comment by Jian Yu [ 19/Aug/16 ] |
|
More failure instances on master branch: |
| Comment by Jian Yu [ 19/Aug/16 ] |
|
Console log on MDS:

Lustre: DEBUG MARKER: mount -t lustre -o exclude=t32fs-OST0000 t32fs-mdt1/mdt1 /tmp/t32/mnt/mdt
Lustre: MGS: Connection restored to MGC192.168.5.144@o2ib_0 (at 0@lo)
Lustre: Skipped 32 previous similar messages
LustreError: 126628:0:(ldlm_lib.c:459:client_obd_setup()) can't add initial connection
LustreError: 126628:0:(osp_dev.c:1150:osp_init0()) t32fs-MDT0001-osp-MDT0000: can't setup obd: rc = -2
LustreError: 126628:0:(obd_config.c:578:class_setup()) setup t32fs-MDT0001-osp-MDT0000 failed (-2)
LustreError: 126628:0:(obd_config.c:1671:class_config_llog_handler()) MGC192.168.5.144@o2ib: cfg command failed: rc = -2
Lustre: cmd=cf003 0:t32fs-MDT0001-osp-MDT0000 1:t32fs-MDT0001_UUID 2:10.100.4.87@tcp
LustreError: 15c-8: MGC192.168.5.144@o2ib: The configuration from log 't32fs-MDT0000' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 126533:0:(obd_mount_server.c:1352:server_start_targets()) failed to start server t32fs-MDT0000: -2
LustreError: 126533:0:(obd_mount_server.c:1844:server_fill_super()) Unable to start targets: -2
Lustre: Failing over t32fs-MDT0000
LustreError: 126533:0:(obd_mount.c:1453:lustre_fill_super()) Unable to mount (-2)
Lustre: DEBUG MARKER: losetup -a
Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_32a: @@@@@@ FAIL: Mounting the MDT

All of the failures occurred on onyx-[64-67] test nodes with IB network. |
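One possible reading of the console log above: the failing cf003 record points t32fs-MDT0001-osp-MDT0000 at 10.100.4.87@tcp, while the MGC on this node is addressed over o2ib. The Python sketch below pulls those NIDs out of a console log and flags cf003 records whose network differs from every MGC network seen. It is an illustration only: the function names and regular expressions are assumptions written against the lines quoted above, not part of the Lustre test framework, and a network mismatch is a hypothesis here, not a confirmed root cause.

```python
import re

# Patterns written against the console log lines quoted in this ticket.
# Both are illustrative assumptions, not an official log format definition.
CFG_RE = re.compile(r"cmd=cf003\s+0:(\S+)\s+1:(\S+)\s+2:(\S+)")
MGC_RE = re.compile(r"MGC(\d+(?:\.\d+){3})@([a-z0-9]+)")


def nid_network(nid):
    """Return the part of a NID after '@', e.g. 'tcp' for '10.100.4.87@tcp'."""
    return nid.split("@", 1)[1]


def find_network_mismatches(log_text):
    """Return (mgc_networks, mismatches).

    mismatches lists (device, nid) for each cf003 record whose NID network
    is not among the networks seen in MGC names. The comparison is a plain
    string match, so 'tcp' and 'tcp0' would be reported as different.
    """
    mgc_networks = {m.group(2) for m in MGC_RE.finditer(log_text)}
    mismatches = []
    for m in CFG_RE.finditer(log_text):
        device, _uuid, nid = m.groups()
        if nid_network(nid) not in mgc_networks:
            mismatches.append((device, nid))
    return mgc_networks, mismatches


# Two lines copied from the console log in this ticket.
sample = (
    "Lustre: MGS: Connection restored to MGC192.168.5.144@o2ib_0 (at 0@lo)\n"
    "Lustre: cmd=cf003 0:t32fs-MDT0001-osp-MDT0000 "
    "1:t32fs-MDT0001_UUID 2:10.100.4.87@tcp\n"
)
nets, bad = find_network_mismatches(sample)
print(nets, bad)
```

On the two sample lines, the only MGC network found is o2ib and the single cf003 record is flagged, since its NID is on tcp.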
| Comment by Jian Yu [ 19/Aug/16 ] |
|
The failure occurred before in |
| Comment by nasf (Inactive) [ 13/Sep/16 ] |
|
Another failure instance on master: |
| Comment by Gu Zheng (Inactive) [ 10/Oct/16 ] |
|
Similar instance on master, but different error number. |
| Comment by Niu Yawei (Inactive) [ 26/Oct/16 ] |
|
The network interface on onyx-[64-67] is IB, but I'm not sure why http://review.whamcloud.com/#/c/6197/ didn't make it work. Perhaps there are multiple types of interfaces on onyx-[64-67], and the test script can't handle that case well? |
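A quick way to check the "multiple interface types" guess on a node would be to group its NIDs by network. The Python sketch below assumes its input is text in the one-NID-per-line form printed by lctl list_nids. The function name is hypothetical, and the tcp NID in the sample is made up for illustration; it is not taken from the onyx nodes.

```python
from collections import defaultdict


def group_nids_by_network(list_nids_output):
    """Group NIDs (one per line, 'address@network') by their network part."""
    groups = defaultdict(list)
    for line in list_nids_output.splitlines():
        nid = line.strip()
        if "@" not in nid:
            continue  # skip blank or unrelated lines
        address, network = nid.split("@", 1)
        groups[network].append(address)
    return dict(groups)


# First NID is the MGS address from this ticket; the second is invented.
sample = "192.168.5.144@o2ib\n10.2.4.144@tcp\n"
groups = group_nids_by_network(sample)
if len(groups) > 1:
    print("node has NIDs on several networks:", sorted(groups))
```

More than one key in the result would mean the node is reachable on several LNet networks, which is the situation the comment above suspects the test script does not handle.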
| Comment by Bob Glossman (Inactive) [ 01/Dec/16 ] |
|
more on master: |
| Comment by James Casper [ 02/Feb/17 ] |
|
In master branch, v2.9.52, b3499, the conf-sanity test_32a failure also caused 11 subsequent subtest failures (after 32a). |
| Comment by ZhangWei [ 13/Jul/17 ] |
|
I found an issue (https://jira.hpdd.intel.com/browse/LU-9760) that seems very similar to this one. Can someone help with it? |
| Comment by Minh Diep [ 01/Feb/18 ] |
|
+1 on master: |
| Comment by Andreas Dilger [ 28/Apr/20 ] |
|
Closing old issue that has not been reported in a long time. |