[LU-14667] sanity test_27J: Timeout occurred after 102 mins, last suite running was sanity
Created: 04/May/21  Updated: 28/Apr/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Chris Horn <hornc@cray.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/9324493f-fd08-4ddf-a0ca-114daef69104

test_27J failed with the following error:

Timeout occurred after 102 mins, last suite running was sanity

Very similar to the other timeout issues. No apparent problem with the patch under test.

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_27J - Timeout occurred after 102 mins, last suite running was sanity



 Comments   
Comment by Andreas Dilger [ 04/May/21 ]

This may have been caused by a hardware problem on the test node (ARM), which AFAIK is known to be a bit flaky. The test logs look like an abrupt node halt, after which the node continually dumps stacks in the core kernel during reboot:

[ 2541.276200] Lustre: DEBUG MARKER: == sanity test 27J: basic ops on file with foreign LOV =============================================== 19:37:49 (1620070669)

<ConMan> Console [onyx-24vm1] disconnected from <onyx-24:6000>.

There haven't been any other failures in this particular test in many weeks, and these patches don't appear to be related to the test that failed.

Generated at Sat Feb 10 03:11:44 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.