[LU-12261] sanity test 43b fails with 'expected error, got success' Created: 01/May/19  Updated: 11/Jun/19  Resolved: 02/May/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Critical
Reporter: James Nunez (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

ARM clients


Issue Links:
Related
is related to LU-12195 sanity/43 and sanityn/14 fail on loca... Resolved
Severity: 3

 Description   

sanity test_43b fails with 'expected error, got success', primarily in testing with ARM architecture clients. The test started failing on 2019-04-30.

Looking at one of the recent failures (logs at https://testing.whamcloud.com/test_sets/d1626222-6b7d-11e9-aeec-52540065bddc), we see the following in the suite_log:

== sanity test 43b: truncate of file being executed should return -ETXTBSY =========================== 17:02:16 (1556643736)
 sanity test_43b: @@@@@@ FAIL: expected error, got success 
/usr/lib64/lustre/tests/sanity.sh: line 4172: /mnt/lustre/d43b.sanity/sleep: Text file busy

There are no errors in any of the node console logs.

There are several instances of this failure. Logs for a few are at:
https://testing.whamcloud.com/test_sets/ea9ca12c-6b72-11e9-aeec-52540065bddc
https://testing.whamcloud.com/test_sets/1c74df3c-6b8e-11e9-8bb1-52540065bddc

There are also a small number of x86_64 client failures:
https://testing.whamcloud.com/test_sets/376809d8-6b78-11e9-a6f2-52540065bddc



 Comments   
Comment by James A Simmons [ 01/May/19 ]

That is bizarre. truncate is returning ETXTBSY as it should, but the test doesn't see that result.

Comment by Patrick Farrell (Inactive) [ 01/May/19 ]

Culprit is almost certainly:

commit 9a1f327a76f72c7713e53d8b354ff7f0e32be870
Author: Alex Zhuravlev <bzzz@whamcloud.com>
Date: Fri Apr 19 15:01:12 2019 +0300

LU-12195 tests: use sleep instead of wrapped multiop

in sanity/43* and sanity/14* tests as multiop is not a binary,
but libtool-wrapped script. the tests fail when started from a
build tree.

Change-Id: Iaec3433f03aab23583052373e5f0252d9eac7f04
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34721
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh

Comment by Gerrit Updater [ 01/May/19 ]

Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34791
Subject: LU-12261 tests: Race between exec and truncate
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cd40a13a6e7a2efa4f967b597ed6c5aa0b7d7435
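
For readers unfamiliar with this kind of race, here is a minimal sketch of how a test can avoid it. It is not the code from the patch above; the helper name wait_for_exec, the retry count, and the temporary directory are all hypothetical. It assumes Linux (/proc/<pid>/exe), GNU coreutils (truncate, fractional sleep), and a temporary directory that permits execution. The idea is that a backgrounded command is first forked and only later exec'ed, so a truncate issued immediately after "&" may run before the binary is actually executing. Polling until the child's executable image is the copied binary closes that window.

```shell
#!/bin/bash
# Sketch only: NOT the code from the LU-12261 patch.
# Assumes Linux /proc, GNU coreutils, and an exec-capable temp directory.

# Hypothetical helper: wait until PID is really executing FILE, i.e. the
# forked child has completed execve() and no longer runs the shell's image.
wait_for_exec() {
    local pid=$1 file=$2 tries=${3:-50}
    while [ "$tries" -gt 0 ]; do
        # -ef is true when both paths resolve to the same device and inode
        [ "/proc/$pid/exe" -ef "$file" ] && return 0
        # Give up early if the child is already gone
        kill -0 "$pid" 2>/dev/null || return 1
        sleep 0.1
        tries=$((tries - 1))
    done
    return 1
}

workdir=$(mktemp -d)
cp "$(command -v sleep)" "$workdir/sleep"

# Start the copied binary in the background; exec happens some time later.
"$workdir/sleep" 30 &
pid=$!

if wait_for_exec "$pid" "$workdir/sleep"; then
    # The binary is now being executed, so opening it for write is
    # expected to fail with ETXTBSY ("Text file busy").
    if truncate -s 0 "$workdir/sleep" 2>/dev/null; then
        result="truncate succeeded"
    else
        result="truncate failed"
    fi
else
    result="exec not observed"
fi

# Clean up the background process and the temporary copy.
kill "$pid" 2>/dev/null
wait "$pid" 2>/dev/null
rm -rf "$workdir"
echo "$result"
```

Without the wait_for_exec step, the truncate can win the race: it then succeeds, and the exec of the copied binary fails instead, which would match the "Text file busy" message reported by the shell in the suite_log above.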

Comment by James A Simmons [ 01/May/19 ]

Thank you Patrick for fixing this.

Comment by Gerrit Updater [ 02/May/19 ]

Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/34791/
Subject: LU-12261 tests: Race between exec and truncate
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c64855fca1504bddcb0fc7ad7316d8d6b20a9c6f

Comment by Gerrit Updater [ 11/Jun/19 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/35189
Subject: LU-12261 tests: Race between exec and truncate
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 23f71de911fd2a8e9004322a772529da5a64a47f

Generated at Sat Feb 10 02:51:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.