[LU-16473] track when failloc was set but not triggered on unset Created: 13/Jan/23  Updated: 13/Jan/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Oleg Drokin
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

There's a somewhat common test problem where a failloc is set but never triggers, so the desired action does not happen. This leads to cryptic errors or, sometimes, false successes.

We already have functionality to see whether a failloc was triggered; it is used to implement the ONCE flag in particular.

The idea is to export this to userspace somehow on unset. Then, as a first step, the final unsetting at the end of a test run would print a warning in the test log (perhaps aggregated across multiple nodes?) if nothing was triggered.
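To make the proposal concrete, here is a rough userspace model of the bookkeeping in Python. It is only a sketch under stated assumptions, not the kernel implementation: the class `FailLocModel`, the helper `untriggered_warnings`, the flag values, and the warning text are all hypothetical names invented for illustration. The aggregation rule (warn only when a failloc fired on none of the nodes where it was set) is just one possible answer to the open question above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical flag bits for this model only; not copied from the Lustre source.
FAIL_ONCE = 0x80000000  # fire at most once
FAILED = 0x40000000     # recorded once the fail location has fired
ID_MASK = 0x0000FFFF    # low bits carry the fail location id


@dataclass
class FailLocModel:
    """Per-node model of a failloc that remembers whether it ever fired."""
    node: str
    value: int = 0

    def set(self, value: int) -> None:
        # Setting a new value clears any previous "fired" record.
        self.value = value & ~FAILED

    def check(self, fail_id: int) -> bool:
        """Return True if the code path tagged fail_id should inject a failure."""
        if self.value == 0:
            return False
        if (self.value & ID_MASK) != (fail_id & ID_MASK):
            return False
        if (self.value & FAIL_ONCE) and (self.value & FAILED):
            return False  # ONCE was requested and it already fired
        self.value |= FAILED
        return True

    def unset(self) -> Tuple[int, bool]:
        """Clear the failloc and report (old id, whether it ever fired)."""
        old_id = self.value & ID_MASK
        fired = bool(self.value & FAILED)
        self.value = 0
        return old_id, fired


def untriggered_warnings(results: Iterable[Tuple[str, int, bool]]) -> List[str]:
    """Aggregate (node, fail_id, fired) tuples; warn for ids that fired nowhere."""
    fired_anywhere: Dict[int, bool] = {}
    nodes: Dict[int, List[str]] = {}
    for node, fail_id, fired in results:
        if fail_id == 0:
            continue  # nothing was set on this node
        fired_anywhere[fail_id] = fired_anywhere.get(fail_id, False) or fired
        nodes.setdefault(fail_id, []).append(node)
    return [
        f"WARNING: failloc {fid:#x} was set on {', '.join(nodes[fid])} "
        "but never triggered"
        for fid, hit in fired_anywhere.items()
        if not hit
    ]
```

In a test framework, the final unset step would collect the `(node, id, fired)` tuples from every node and write the resulting lines to the test log.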

Then tests like https://testing.whamcloud.com/test_sets/89df8dad-7b3f-414a-9c2a-7328807139c2 would have a much more visible message about what happened, making such failures much faster to diagnose.

Other improved uses of this functionality may come up later.

