[LU-1528] Test set with a failing subtest is flagged as passed in maloo Created: 23/Mar/12  Updated: 18/Jul/12  Resolved: 18/Jul/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: Lustre 2.3.0

Type: Bug Priority: Blocker
Reporter: Mike Stok (Inactive) Assignee: Minh Diep
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Project: Test Infrastructure
Rank (Obsolete): 2203

 Description   

https://maloo.whamcloud.com/test_sets/5dca0c26-7543-11e1-bfc7-5254004bbbd3

The test set shows up as passing in maloo, yet it clearly has a failed subtest. As this was the only test set in the test session, the test session also erroneously showed up as passed.



 Comments   
Comment by Mike Stok (Inactive) [ 23/Mar/12 ]
[root@maloo current]# RAILS_ENV=production bundle exec rails c
Loading production environment (Rails 3.1.4)
irb(main):001:0> test_set = TestSet.find('5dca0c26-7543-11e1-bfc7-5254004bbbd3')
  TestSet Load (8.5ms)  SELECT `test_sets`.* FROM `test_sets` WHERE `test_sets`.`id` = ? ORDER BY test_sets.submission DESC LIMIT 1  [["id", "5dca0c26-7543-11e1-bfc7-5254004bbbd3"]]
=> #<TestSet id: "5dca0c26-7543-11e1-bfc7-5254004bbbd3", test_session_id: "5b0bedce-7543-11e1-bfc7-5254004bbbd3", version: 2, created_at: "2012-03-23 23:53:19", updated_at: "2012-03-23 23:53:19", duration: 1801, submission: "2012-03-23 22:46:02", status: "PASS", test_set_script_id: "f60ca966-8b52-11e0-aab9-52540025f9af", sub_tests_count: 1, sub_tests_passed_count: 0, sub_tests_skipped_count: 0, sub_tests_failed_count: 1>
irb(main):002:0> 

Note that the status is PASS, yet the sub_tests_failed_count is 1.

Looking at the uploaded results.yml we see (leading line numbers added):

  1 TestGroup:
  2     test_group: acc-sm-client-13vm1
  3     testhost: client-13vm1
  4     submission: Fri Mar 23 15:46:02 PDT 2012
  5     user_name: root
  6 
  7 Tests:
  8 -
  9         name: posix
 10         description: auster posix
 11         submission: Fri Mar 23 15:46:02 PDT 2012
 12         report_version: 2
 13         SubTests:
 14         -
 15             name: test_1
 16             status: FAIL
 17             duration: 1793
 18             return_code: 0
 19             error: "Run\ POSIX\ testsuite\ on\ /mnt/lustre\ failed"
 20         duration: 1801
 21         status: PASS

Maloo seems to have mechanically recorded the FAIL from line 16 for the subtest and the PASS from line 21 for the test.

We could make maloo check for this inconsistency, but the underlying problem seems to be that the test harness is reporting the wrong overall status.
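
A minimal sketch of what such a check in maloo could look like, in plain Ruby. The method name `effective_status` and the rule that any failed subtest overrides the reported status are assumptions for illustration and are not taken from the maloo source. The counter name mirrors the `sub_tests_failed_count` attribute in the console output above.

```ruby
# Hypothetical helper, not part of maloo: combine the status reported by the
# harness with the failed-subtest counter stored on the test set.
def effective_status(reported_status, sub_tests_failed_count:)
  # Assumed rule: a single failed subtest forces the whole test set to FAIL,
  # regardless of what the harness wrote as the overall status.
  return "FAIL" if sub_tests_failed_count > 0

  reported_status
end

# Values taken from the TestSet record shown in this ticket.
puts effective_status("PASS", sub_tests_failed_count: 1)
```

With the values from this test set (status PASS, one failed subtest) the helper returns FAIL. This only papers over the problem on the maloo side; the harness would still be writing an inconsistent results.yml.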

Comment by Mike Stok (Inactive) [ 30/Mar/12 ]

There's a similar problem with vbr in failover.

Comment by Chris Gearing (Inactive) [ 30/Mar/12 ]

Really a sub task of LU-1192 but I don't want to make this an LU issue.

Comment by Minh Diep [ 06/Apr/12 ]

The log above is from the posix test, which hasn't landed in lustre yet. We'll revisit this when we hit it again in autotest.

Comment by Andreas Dilger [ 14/Jun/12 ]

This is definitely being seen in real testing. I found the following two cases when searching for lnet-selftest results, and it was obvious that "0/1" tests passed should not be considered a "PASS" result. The test results show the subtest reporting failure:

https://maloo.whamcloud.com/test_sets/5b53e0a0-b479-11e1-bdae-52540035b04c
https://maloo.whamcloud.com/test_sets/42ad0e1a-b3ce-11e1-8808-52540035b04c

This is fairly serious (hence increased priority), because it casts doubt on all of our pass results.

Comment by Minh Diep [ 14/Jun/12 ]

I tracked the second report from above and here is the result from yaml file

-
        name: lnet-selftest
        description: auster lnet-selftest
        submission: Mon Jun 11 06:49:27 PDT 2012
        report_version: 2
        SubTests:
        -
            name: test_smoke
            status: FAIL
            duration: 345
            return_code: 254
            error: "test_smoke\ returned\ 254"
        duration: 357
        status: PASS

The overall test status should be FAIL. Note that the second subtest neither started nor completed.
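
The inconsistency in the yaml above can be detected mechanically. The following Ruby sketch is only an illustration: the function name `inconsistent_tests` is hypothetical, the embedded yaml is an abridged copy of the fragment in this comment wrapped in the `Tests:` key seen in the earlier results.yml, and it is not the change made in the patch for this ticket.

```ruby
require "yaml"

# Abridged copy of the lnet-selftest results from this ticket.
RESULTS = <<~YAML
  Tests:
  -
      name: lnet-selftest
      report_version: 2
      SubTests:
      -
          name: test_smoke
          status: FAIL
          duration: 345
          return_code: 254
      duration: 357
      status: PASS
YAML

# Hypothetical check: return the names of tests whose overall status is PASS
# even though at least one of their subtests reported FAIL.
def inconsistent_tests(yaml_text)
  doc = YAML.safe_load(yaml_text)
  (doc["Tests"] || []).select do |test|
    test["status"] == "PASS" &&
      (test["SubTests"] || []).any? { |sub| sub["status"] == "FAIL" }
  end.map { |test| test["name"] }
end

p inconsistent_tests(RESULTS)  # prints ["lnet-selftest"]
```

A check like this could run before upload, so that a PASS with a failing subtest is caught at the source rather than in maloo.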

Comment by Minh Diep [ 15/Jun/12 ]

Moved to LU since this is likely to be a test-framework issue.

Comment by Minh Diep [ 27/Jun/12 ]

patch http://review.whamcloud.com/#change,3112

Comment by Minh Diep [ 18/Jul/12 ]

landed in master

Generated at Sat Feb 10 01:17:26 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.