[LU-5761] replay-single test_89: @@@@@@ FAIL: 2560 blocks leaked Created: 17/Oct/14  Updated: 27/Aug/19  Resolved: 17/Mar/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0, Lustre 2.11.0
Fix Version/s: Lustre 2.11.0

Type: Bug Priority: Critical
Reporter: Andreas Dilger Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: zfs

Issue Links:
Duplicate
is duplicated by LU-9891 replay-ost-single test_7: 15995648 > ... Resolved
Related
is related to LU-1867 replay-single test_89: @@@@@@ FAIL: 4... Resolved
is related to LU-10052 replay-single test_20b fails with 'af... Resolved
is related to LU-9590 replay-single test cases 61d 89 73b r... Resolved
is related to LU-8672 missing error handling in replay-sing... Resolved
Severity: 3
Rank (Obsolete): 16168

 Description   

Hit this problem in a Maloo test run on the latest master branch:
https://maloo.whamcloud.com/test_sets/07148716-fae1-11e1-a03c-52540035b04c

It is similar to ORI-412, reported on Orion.

Test logs of test_89 attached.



 Comments   
Comment by Andreas Dilger [ 17/Oct/14 ]

Opened this as a separate, ZFS-specific version of LU-1867.

Comment by James Nunez (Inactive) [ 08/May/15 ]

I've run this test a few times recently and it has passed. Is this still an issue?

The following are replay-single results with ZFS in which test 89 passed:
2015-03-05 14:14:05 - https://testing.hpdd.intel.com/test_sets/27f9c510-c350-11e4-be04-5254006e85c2
2015-04-22 02:47:39 - https://testing.hpdd.intel.com/test_sets/ab761c78-e8ed-11e4-9e6e-5254006e85c2
2015-04-24 20:29:25 - https://testing.hpdd.intel.com/test_sets/adb73a66-eb32-11e4-9620-5254006e85c2

I'm wondering if we should re-enable this test to get more test results and see if this test still fails.

Comment by Andreas Dilger [ 19/May/15 ]

James, could you please submit a patch to re-enable this test, and have it run the test a bunch of times to see if it is passing? I suspect that this is only an intermittent failure, so we don't want to enable it with only a single passing test run. See LU-1867 for more details.

Comment by James Nunez (Inactive) [ 28/May/15 ]

I've run this test several times and it still fails occasionally. Here's one failure:
https://testing.hpdd.intel.com/test_sets/7a0438b0-053e-11e5-89f6-5254006e85c2

Comment by Jian Yu [ 21/Nov/17 ]

More failure instances on master branch:

https://testing.hpdd.intel.com/test_sets/24875632-ce5a-11e7-9c63-52540065bddc

https://testing.hpdd.intel.com/test_sets/95511308-cbfd-11e7-8027-52540065bddc

Comment by Jinshan Xiong (Inactive) [ 25/Nov/17 ]

https://testing.hpdd.intel.com/test_sets/a6ab66ac-d1ad-11e7-9c63-52540065bddc

Comment by Andreas Dilger [ 26/Nov/17 ]

This test was just removed from ALWAYS_EXCEPT via patch https://review.whamcloud.com/27404 "LU-9590 tests: remove replay-single tests from ALWAYS_EXCEPT" on Oct 16th.

Comment by Gerrit Updater [ 27/Nov/17 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/30252
Subject: LU-5761 tests: re-add test_89 to replay-single ALWAYS_EXCEPT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9c7cf4f79173951b1ea037a8f93e6fe5fe051190

Comment by Andreas Dilger [ 28/Nov/17 ]

It may well be that this test cannot pass as-is with ZFS, since the COW nature of ZFS means that there may be blocks that are not released for some time. Either we need to bump up the limit for ZFS (which makes the test much less useful), or wait longer for old TXGs to be released and free up the space.
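
The second option above (wait longer for space to be released before declaring a leak) could be sketched roughly as follows. This is only an illustration in Python, not the actual replay-single test code or any Lustre test-framework API: the function name, its parameters, and the `get_used_blocks` callback are all hypothetical, and the timeout values are arbitrary assumptions. Note that the patch that eventually landed is described in this ticket as using fs_log_size(), which is not what this sketch shows.

```python
import time
from typing import Callable


def wait_for_blocks_released(
    get_used_blocks: Callable[[], int],   # hypothetical: returns current used-block count
    baseline_blocks: int,                 # used blocks measured before the test
    max_leaked_blocks: int,               # tolerated difference from the baseline
    timeout_s: float = 60.0,              # assumed upper bound on how long to wait
    interval_s: float = 1.0,              # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until usage is within max_leaked_blocks of the baseline.

    Returns the last observed leaked-block count, which may still exceed
    the limit if the timeout expires first.
    """
    max_polls = max(1, int(timeout_s / interval_s))
    leaked = get_used_blocks() - baseline_blocks
    for _ in range(max_polls):
        if leaked <= max_leaked_blocks:
            break
        # Give the backend time to release blocks that are still held.
        sleep(interval_s)
        leaked = get_used_blocks() - baseline_blocks
    return leaked
```

A caller would compare the returned value against the limit and fail the test only if the blocks are still unaccounted for after the wait, rather than failing on the first measurement.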

Comment by Gerrit Updater [ 01/Dec/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30252/
Subject: LU-5761 tests: re-add test_89 to replay-single ALWAYS_EXCEPT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 18ebd8cbe9dabe0f8c2da36fe6d146d977b500ae

Comment by Gerrit Updater [ 01/Feb/18 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/31120
Subject: LU-5761 tests: remove test_89 from replay-single ALWAYS_EXCEPT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: fb1dcb3d7253c4daf0a588f8e7e955e9e3f1cd2d

Comment by Bob Glossman (Inactive) [ 13/Mar/18 ]

Another failure on master:

https://testing.hpdd.intel.com/test_sets/c722f698-265d-11e8-9e0e-52540065bddc

Comment by Gerrit Updater [ 17/Mar/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31120/
Subject: LU-5761 tests: fix test_89 to use fs_log_size()
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e609d30003560ba534e3da42d3240a081b00143e

Comment by Peter Jones [ 17/Mar/18 ]

Landed for 2.11
