Description
replay-single test_70b fails with two error messages. First:
replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-31vm1.onyx.hpdd.intel.com,onyx-31vm2!
and later:
replay-single test_70b: @@@@@@ FAIL: rundbench load on onyx-31vm1.onyx.hpdd.intel.com,onyx-31vm2 failed!
Looking at the suite_log, we see:
CMD: onyx-31vm1.onyx.hpdd.intel.com,onyx-31vm2 killall -0 dbench
onyx-31vm1: [3] open ./clients/client0 failed for handle 16385 (No such file or directory)
onyx-31vm1: (4) ERROR: handle 16385 was not found
onyx-31vm1: Child failed with status 1
onyx-31vm1: dbench: no process found
onyx-31vm1: dbench: no process found
replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-31vm1.onyx.hpdd.intel.com,onyx-31vm2!
The only thing that looks suspicious in the console logs is on the MDS1, 3 console:
[ 5354.241985] Lustre: DEBUG MARKER: Started rundbench load pid=3403
...
[ 5354.488828] LustreError: 12371:0:(osd_oi.c:978:osd_idc_find_or_init()) lustre-MDT0000: can't lookup: rc = -2
[ 5354.753146] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-31vm1.onyx.hpdd.intel.com,onyx-31vm2!
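When triaging other runs of this failure, it can help to pull the LustreError entries out of a console log and decode their return codes. The following is a minimal sketch, not part of the Lustre test framework: the function name `find_lustre_errors` and the regular expression are assumptions based only on the single console line quoted above, and other LustreError formats may not match. It relies on the kernel convention that `rc` is a negated errno value, so `rc = -2` corresponds to ENOENT, which agrees with the "No such file or directory" message reported by dbench on the client.

```python
import errno
import re

# Assumed line shape, taken from the one console entry quoted in this ticket:
# LustreError: <pid>:<n>:(<file>:<line>:<func>()) <message>: rc = <code>
LUSTRE_ERROR = re.compile(
    r"LustreError: \d+:\d+:\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)"
    r" (?P<msg>.*?): rc = (?P<rc>-?\d+)"
)


def find_lustre_errors(console_text):
    """Return one dict per LustreError entry found in console_text."""
    errors = []
    for m in LUSTRE_ERROR.finditer(console_text):
        rc = int(m["rc"])
        errors.append({
            "file": m["file"],
            "line": int(m["line"]),
            "func": m["func"],
            "msg": m["msg"],
            "rc": rc,
            # Map the negated errno to its symbolic name, if it has one.
            "errno_name": errno.errorcode.get(-rc),
        })
    return errors


sample = (
    "[ 5354.488828] LustreError: 12371:0:(osd_oi.c:978:osd_idc_find_or_init()) "
    "lustre-MDT0000: can't lookup: rc = -2"
)
for err in find_lustre_errors(sample):
    print(err["func"], err["rc"], err["errno_name"])
```

Run against a full console log, this lists every function that logged an error alongside the decoded errno, which makes it easier to compare the MDS-side errors across the failing sessions listed below.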
This test has failed in this way many times, so far only in full test sessions with DNE and ZFS configured:
2.10.57 el7 build 3703 – https://testing.hpdd.intel.com/test_sets/46a0b60a-078f-11e8-bd00-52540065bddc
2.10.57 el7 build 3702 – https://testing.hpdd.intel.com/test_sets/13cdeb9e-0352-11e8-a10a-52540065bddc
2.10.57 el7 build 3700 – https://testing.hpdd.intel.com/test_sets/fa0a850e-014f-11e8-a6ad-52540065bddc
2.10.57 el7 build 3697 – https://testing.hpdd.intel.com/test_sets/ebd4b25e-fd83-11e7-a7cd-52540065bddc
2.10.57 el7 patchless build 59 – https://testing.hpdd.intel.com/test_sets/dee6191a-ffaf-11e7-a6ad-52540065bddc
2.10.57 el7 patchless build 58 – https://testing.hpdd.intel.com/test_sets/16fa9310-fe7c-11e7-a6ad-52540065bddc
2.10.56 el7 build 3693 – https://testing.hpdd.intel.com/test_sets/d309f58a-f77b-11e7-bd00-52540065bddc
2.10.56 el7 patchless build 53 – https://testing.hpdd.intel.com/test_sets/38f48bae-f636-11e7-94c7-52540065bddc
2.10.56 el7 patchless build 50 – https://testing.hpdd.intel.com/test_sets/c46aeb7c-f228-11e7-8c43-52540065bddc
2.10.56 el7 build 3685 – https://testing.hpdd.intel.com/test_sets/6c00afc0-e7c0-11e7-8027-52540065bddc
2.10.56 el7 patchless build 44 – https://testing.hpdd.intel.com/test_sets/53f8d684-e674-11e7-a066-52540065bddc
Issue Links
- is duplicated by
  - LU-14791 replay-single: rundbench load on trevis-66vm1.trevis.whamcloud.com,trevis-66vm2 failed! (Resolved)
  - LU-14813 replay-single: test_70b dbench failed (Resolved)
- is related to
  - LU-16336 LFSCK should fix inconsistencies caused by recovery abort (Open)
  - LU-16065 replay-single test_81a: rm remote dir failed (Open)
- is related to
  - LU-15624 replay-single and ost-pools failed: rm: cannot remove 'd70b.replay-single': Directory not empty (Open)