data corruption in check_set (LU-1039)

[LU-1791] sanity.sh test_224b takes too long to run
Created: 27/Aug/12  Updated: 10/Mar/18  Resolved: 10/Mar/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0, Lustre 2.4.0
Fix Version/s: None

Type: Technical task
Priority: Minor
Reporter: Andreas Dilger
Assignee: WC Triage
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-614 Speed up test scripts Resolved
Rank (Obsolete): 5674

 Description   

The sanity.sh test_224b() subtest that was added for LU-1039 takes over 15 minutes to complete on its own (e.g. https://maloo.whamcloud.com/sub_tests/3d7bc580-ef9c-11e1-bdf7-52540035b04c). This needs to be improved so that it stops wasting so much testing time (3-4h or more each day across all of the test nodes).

This appears to be because the test does not time out until AT max (900s) is hit.

We need a patch that reduces AT_max just for this test so that it can complete more quickly.
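
A rough sketch of the save/lower/restore pattern such a patch might use is shown here. It is not the actual fix. The helpers get_at_max, set_at_max, run_with_at_max and slow_subtest are hypothetical names, the value 20 is an arbitrary illustration, and a temporary file stands in for the real tunable so the sketch runs without Lustre. On a real OSS the accessors would presumably wrap "lctl get_param -n at_max" and "lctl set_param at_max=N".

```shell
#!/bin/bash
# Sketch only: lower a timeout tunable for one subtest, then restore it.
# PARAM_FILE is a stand-in for the real at_max tunable (assumption), so
# that the example can run on a machine without Lustre installed.
PARAM_FILE=${PARAM_FILE:-$(mktemp)}
[ -s "$PARAM_FILE" ] || echo 900 > "$PARAM_FILE"

# Hypothetical accessors for the tunable.
get_at_max() { cat "$PARAM_FILE"; }
set_at_max() { echo "$1" > "$PARAM_FILE"; }

# Run a command with at_max temporarily set to $1, then put the old
# value back whether or not the command succeeded.
run_with_at_max() {
    local new_max=$1; shift
    local old_max rc=0
    old_max=$(get_at_max)
    set_at_max "$new_max"
    "$@" || rc=$?
    set_at_max "$old_max"
    return $rc
}

# Stand-in for the slow subtest body: report the value in effect.
slow_subtest() { echo "running with at_max=$(get_at_max)"; }

run_with_at_max 20 slow_subtest
echo "restored at_max=$(get_at_max)"
```

Restoring the saved value inside the wrapper keeps the lower limit from leaking into later subtests, which matters because the goal is to shorten only this one test.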



 Comments   
Comment by Andreas Dilger [ 27/Aug/12 ]

Shadow, the test you added in commit c9590221dc43dd5e7a7ede389f0a7d9cf566e5bf takes a very long time to run, 939s in one recent case. Could you please look at this test and find a way to speed it up, without removing the functionality? I think reducing AT_max on the OSS for just this test should be enough, but since you wrote the test I want to make sure that this will not avoid the problem that it was trying to trigger.

Comment by Andreas Dilger [ 10/Mar/18 ]

Recent test runs are in the 3s range.
