[LU-11127] sanity-flr test_34b: @@@@@@ FAIL: can't put import for osc into FULL state after 40 sec, have REPLAY_WAIT Created: 07/Jul/18 Updated: 06/Aug/18 Resolved: 06/Aug/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mikhail Pershin | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
sanity-flr.sh test_34b fails in about 10% of runs, always with the same error, as shown below:
[12008.812898] Lustre: DEBUG MARKER: trevis-17vm1.trevis.whamcloud.com: executing wait_import_state FULL osc.lustre-OST0001-osc-ffff926024c37800.ost_server_uuid 40
[12049.499056] Lustre: DEBUG MARKER: /usr/sbin/lctl mark rpc test_34b: @@@@@@ FAIL: can't put import for osc.lustre-OST0001-osc-ffff926024c37800.ost_server_uuid into FULL state after 40 sec, have REPLAY_WAIT
examples: |
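For readers unfamiliar with the check that times out in the log above, it follows a poll-until-state-or-timeout pattern. The following is a minimal sketch of that pattern only, not the actual wait_import_state code from the Lustre test framework. The function name poll_for_state and its command argument are hypothetical; in the real test the state would be read from the import's ost_server_uuid parameter named in the log.

```shell
# Hypothetical sketch of a poll-with-timeout check; NOT the real
# wait_import_state implementation from the Lustre test framework.
# Usage: poll_for_state EXPECTED TIMEOUT_SEC CMD [ARGS...]
# CMD is any command that prints the current state on stdout.
poll_for_state() {
    local expected=$1
    local timeout=$2
    shift 2
    local elapsed=0
    local state

    while :; do
        # Run the caller-supplied command to read the current state.
        state=$("$@")
        if [ "$state" = "$expected" ]; then
            echo "reached $expected after $elapsed sec"
            return 0
        fi
        # Give up once the time budget is exhausted.
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "can't reach $expected after $timeout sec, have $state"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

With a state source that stays in REPLAY_WAIT for longer than the timeout, this sketch returns non-zero and reports the last state seen, which mirrors the shape of the failure message in the log.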
| Comments |
| Comment by James Nunez (Inactive) [ 13/Jul/18 ] |
|
This test started failing on July 2, 2018 and is failing only in DNE testing. |
| Comment by Mikhail Pershin [ 22/Jul/18 ] |
|
It is failing in a lot of runs: Failure Rate: 22.41% of most recent 58 runs, 42 skipped (all branches) |
| Comment by Andreas Dilger [ 01/Aug/18 ] |
|
There isn't anything in test_34b() that seems any different than test_34a(), but 34a has not had any failures. This might relate to some kind of problem with stopping and restarting the OSTs twice in a row quickly under DNE (e.g. MDT reconnections slow, or the previous restart has reset at_min)? One option would be to increase the minimum time that _wait_osc_import_state() waits with multiple MDTs by some amount to compensate. |
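The option suggested above, waiting longer when there are multiple MDTs, could be sketched roughly as follows. This is an illustration only and not necessarily what the merged patch does. The helper name scaled_wait_time and the 10-second increment per additional MDT are assumptions made for the example; they do not come from the test framework.

```shell
# Hypothetical helper: extend a base wait time when more than one MDT exists.
# The default of 10 extra seconds per additional MDT is an illustrative
# assumption, not a value taken from the Lustre test framework.
# Usage: scaled_wait_time BASE_SEC [MDT_COUNT] [EXTRA_SEC_PER_MDT]
scaled_wait_time() {
    local base=$1
    local mdt_count=${2:-1}
    local extra_per_mdt=${3:-10}

    if [ "$mdt_count" -le 1 ]; then
        # Single MDT: keep the base timeout unchanged.
        echo "$base"
    else
        # Add one increment for every MDT beyond the first.
        echo $((base + (mdt_count - 1) * extra_per_mdt))
    fi
}

scaled_wait_time 40 1   # single MDT: base timeout unchanged
scaled_wait_time 40 4   # four MDTs: base plus three extra increments
```

Scaling the wait this way would leave single-MDT runs unaffected while giving DNE configurations more time for the import to leave REPLAY_WAIT.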
| Comment by Gerrit Updater [ 01/Aug/18 ] |
|
James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/32917 |
| Comment by James Nunez (Inactive) [ 01/Aug/18 ] |
|
Uploaded patch https://review.whamcloud.com/32917 to stop running test 34b in case we need to employ these drastic measures! |
| Comment by Gerrit Updater [ 02/Aug/18 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: https://review.whamcloud.com/32922 |
| Comment by Gerrit Updater [ 06/Aug/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32922/ |
| Comment by Peter Jones [ 06/Aug/18 ] |
|
Looks like we don't! |