[LU-11128] replay-single test timeout - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.12.0
Affects Version/s: Lustre 2.12.0
Labels: None

Severity: 3

Description

This issue was created by maloo for bobijam <bobijam@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/5c95f0b2-8186-11e8-b441-52540065bddc

test_115 failed with the following error:

Timeout occurred after 216 mins, last suite running was replay-single, restarting cluster to continue tests

The MDS dmesg keeps showing the following error messages during several tests, and the tests take far too long to complete.

[ 2545.541360] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2545.571570] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.210@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2545.618732] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2545.618926] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 2545.619112] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
...
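
To gauge how often each client retries against the unavailable target, the messages above can be tallied per target and peer NID. The following is a minimal sketch, not part of the Lustre test framework: the function name `summarize_connect_failures` and the regular expression are assumptions based only on the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Assumed pattern, derived from the "137-5" dmesg lines quoted in this issue.
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LustreError: 137-5: (?P<target>\S+): "
    r"not available for connect from (?P<nid>\S+) \(no target\)"
)


def summarize_connect_failures(lines):
    """Count 'not available for connect' messages per (target, NID).

    Returns (counts, first_ts, last_ts), where the timestamps are the
    earliest and latest dmesg timestamps among the matching lines, or
    None when nothing matched.
    """
    counts = Counter()
    first = last = None
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # unrelated dmesg line
        counts[(m.group("target"), m.group("nid"))] += 1
        ts = float(m.group("ts"))
        first = ts if first is None else min(first, ts)
        last = ts if last is None else max(last, ts)
    return counts, first, last


# The five lines quoted in the description, truncated after "(no target)".
SAMPLE = [
    "[ 2545.541360] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target).",
    "[ 2545.571570] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.210@tcp (no target).",
    "[ 2545.618732] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target).",
    "[ 2545.618926] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target).",
    "[ 2545.619112] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.5.212@tcp (no target).",
]

if __name__ == "__main__":
    counts, first, last = summarize_connect_failures(SAMPLE)
    for (target, nid), n in counts.most_common():
        print(f"{target} <- {nid}: {n}")
    print(f"window: {first} .. {last}")
```

Feeding a full MDS dmesg capture through the same function would show whether one NID dominates the reconnect attempts or whether all clients retry at a similar rate.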

Another hit occurred at https://testing.whamcloud.com/test_sets/08372d04-8188-11e8-97ff-52540065bddc

test_80c failed with: 'Timeout occurred after 159 mins, last suite running was replay-single, restarting cluster to continue tests'

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
replay-single test_115 - Timeout occurred after 216 mins, last suite running was replay-single, restarting cluster to continue tests

Issue Links

is duplicated by

LU-11126 replay-single test 89: import.c:1668:ptlrpc_disconnect_idle_interpret()) ASSERTION( imp->imp_state == LUSTRE_IMP_CONNECTING ) failed: (Resolved)

LU-11183 sanity test 244 hangs with no information in the logs (Resolved)

is related to

LU-11362 sanity test_156: timeout loop in ptlrpc_check_set() (Open)

LU-11269 ptlrpc_set_add_req()) ASSERTION( req->rq_import->imp_state != LUSTRE_IMP_IDLE ) failed (Resolved)

LU-11405 add a test for idle connection feature (Closed)

is related to

LU-7236 OST connect and disconnect on demand (Resolved)

People

Assignee: Alex Zhuravlev

Reporter: Maloo

Votes: 0

Watchers: 5

Dates

Created: 08/Jul/18 1:33 PM

Updated: 28/May/19 5:29 PM

Resolved: 02/Oct/18 9:55 PM