[LU-11637] Maloo should restart the sessions automatically if tests failed with known issue Created: 07/Nov/18  Updated: 19/Apr/22  Resolved: 19/Apr/22

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: Elena Gryaznova Assignee: James Nunez (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None

Rank (Obsolete): 9223372036854775807

 Description   

It seems reasonable to add the following Maloo functionality: automatically restart sessions that failed due to known issues. If all failures in a session are associated with LU tickets, it would be more efficient for Maloo to restart the session itself instead of waiting for the patch developer's attention.
We have this functionality in our own automation, and it saves developers time.
Thanks.



 Comments   
Comment by Elena Gryaznova [ 07/Nov/18 ]

The following improvement is even more desirable:
Maloo should set +1 if the only failures are due to known issues and all of them are associated with LU tickets.
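
To make the proposed rule concrete, here is a minimal sketch of the decision logic in Python. It is not Maloo code: the SubtestResult type, its fields, and the decide_action function are hypothetical names chosen for illustration, and how Maloo actually stores results or ticket associations is not described in this ticket.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubtestResult:
    """Hypothetical record of one subtest outcome (not a Maloo type)."""
    name: str
    passed: bool
    linked_ticket: Optional[str] = None  # e.g. "LU-10279"; None if no ticket is associated


def decide_action(results: List[SubtestResult]) -> str:
    """Return an illustrative action for a finished test session.

    "pass"            - no subtest failed
    "known_failures"  - every failure is associated with an LU ticket, so the
                        session could be restarted (or given +1) automatically
    "needs_attention" - at least one failure has no associated ticket
    """
    failures = [r for r in results if not r.passed]
    if not failures:
        return "pass"
    if all(r.linked_ticket for r in failures):
        return "known_failures"
    return "needs_attention"
```

Under this sketch, a session whose only failure is linked to a ticket would return "known_failures", while a single unlinked failure is enough to return "needs_attention" and leave the session for a developer to review.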

Comment by Alexander Boyko [ 29/Nov/18 ]

I agree with Elena; it would make the process more efficient. @James Nunez, any thoughts?

Comment by Sergey Cheremencev [ 29/Nov/18 ]

I also ran into this issue.

Example from https://review.whamcloud.com/#/c/33391/3.

sanityn_101a is marked as failed in https://testing.whamcloud.com/test_sessions/5c08b8e3-4908-408d-b015-cedd83a40854 even though it matches LU-10279.

Comment by Sergey Cheremencev [ 04/Dec/18 ]

More examples:

https://testing.whamcloud.com/test_sets/a5ef946e-ecdb-11e8-adf2-52540065bddc

Test 80f failed with the known failure LU-11366 but wasn't marked as known; I had to associate the bug by hand.

Another strange thing is that test suites are marked as FAIL even though they have no failed tests.
For example: insanity, replay-ost-single, sanity-quota, sanity-flr and sanity-pfl here https://testing.whamcloud.com/test_sessions/e8ac067a-da35-4aea-896b-e40f5f4b3029

According to the link above, replay-ost-single never started: its "subtests passed" field is 0/0.

Comment by Cory Spitz [ 20/Oct/21 ]

jamesanunez, do you have an estimate of how difficult it would be to implement this enhancement? Is it possible to bang it out in short order (now, to help prepare for 2.15.0)?

Comment by Cory Spitz [ 18/Nov/21 ]

jamesanunez, I think that this could help everyone a lot. I don't really know how difficult it would be, though. Do you think it is worth it? pjones, perhaps we can chat about it at an upcoming LWG call?

Comment by James Nunez (Inactive) [ 19/Apr/22 ]

We talked about this during a recent OpenSFS LWG call and everyone agreed that we should not implement automatic restart of test sessions.

Generated at Sat Feb 10 02:45:37 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.