[LU-3144] failover: replay-ost-single test_6: FAIL: 13732304 > 13697488 + logsize 50 Created: 10/Apr/13  Updated: 22/Jan/18  Resolved: 22/Jan/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.4.1, Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.11.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Jian Yu Assignee: Nathaniel Clark
Resolution: Duplicate Votes: 0
Labels: ldiskfs, yuc2
Environment:

Lustre Branch: master
Lustre Build: http://build.whamcloud.com/job/lustre-master/1381/
Distro/Arch: RHEL6.3/x86_64
Test Group: failover
FAILURE_MODE=HARD


Issue Links:
Duplicate
duplicates LU-3053 Test failure on test suite replay-ost... Resolved
duplicates LU-9891 replay-ost-single test_7: 15995648 > ... Resolved
Severity: 3
Rank (Obsolete): 7630

 Description   

The replay-ost-single test_6 failed as follows:

Waiting for orphan cleanup...
CMD: client-32vm3 /usr/sbin/lctl list_param osp.*osc*.old_sync_processed 2> /dev/null
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0002-osc-MDT0000.old_sync_processed
osp.lustre-OST0003-osc-MDT0000.old_sync_processed
osp.lustre-OST0004-osc-MDT0000.old_sync_processed
osp.lustre-OST0005-osc-MDT0000.old_sync_processed
osp.lustre-OST0006-osc-MDT0000.old_sync_processed
CMD: client-32vm3 /usr/sbin/lctl get_param -n osp.*osc*.old_sync_processed
Waiting for local destroys to complete
before: 13732304 after: 13697488
 replay-ost-single test_6: @@@@@@ FAIL: 13732304 > 13697488 + logsize 50

Maloo report: https://maloo.whamcloud.com/test_sets/8611adc4-a1bb-11e2-bdac-52540035b04c
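The failure line comes from a free-space check after cleanup: the test records the OSTs' free space before the run and again after the test file is destroyed, and fails when free space has not been restored to within a small log-overhead allowance. A minimal sketch of that comparison, with hypothetical names (the real logic lives in replay-ost-single.sh and may differ in detail):

```shell
#!/bin/sh
# Hedged sketch of the space-reclamation check behind this FAIL message.
# "before" and "after" are assumed to be free KB on the OSTs before the
# test ran and after the test file was deleted post-failover; "logsize"
# is a small allowance (KB) for llog/journal overhead.
check_space() {
    local before=$1 after=$2 logsize=$3
    # Fail when free space after cleanup is short of the starting value
    # by more than the allowed log overhead.
    if [ "$before" -gt $((after + logsize)) ]; then
        echo "FAIL: $before > $after + logsize $logsize"
    else
        echo "PASS"
    fi
}

# Reproduces the numbers from this report: 13697488 + 50 < 13732304.
check_space 13732304 13697488 50
```

With the reported values, roughly 34 MB of space was never returned after the destroys completed, which is why the test errors out rather than passing within the logsize slack.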



 Comments   
Comment by Andreas Dilger [ 10/Apr/13 ]

Nathaniel, it looks like this is similar to LU-3053. If so, please close as a duplicate.

Note that this test was run on commit 49b06fba39e7fec26a0250ed37f04a620e349b5f, which includes your fix from http://review.whamcloud.com/5875 (commit 768d5eb57d101ab79ff45f5094a91063a91705d8).

Comment by Nathaniel Clark [ 16/Apr/13 ]

This is an ldiskfs failure after recovery, not a ZFS failure after dd as in LU-3053.

Another ldiskfs failure:
https://maloo.whamcloud.com/test_sets/6072cc8c-a621-11e2-8db3-52540035b04c

Comment by Nathaniel Clark [ 24/Apr/13 ]

All of the failures of this test come from the failover test group: the two linked above, plus:
https://maloo.whamcloud.com/test_sets/ce300da4-aa1b-11e2-8184-52540035b04c
https://maloo.whamcloud.com/test_sets/66c9ccf4-aa0c-11e2-bd49-52540035b04c

Comment by Jian Yu [ 13/May/13 ]

Lustre Branch: master
Distro/Arch: RHEL6.4/x86_64
Test Group: failover

Lustre Build: http://build.whamcloud.com/job/lustre-master/1486
https://maloo.whamcloud.com/test_sets/4769c224-bb19-11e2-8824-52540035b04c

Lustre Build: http://build.whamcloud.com/job/lustre-master/1481
https://maloo.whamcloud.com/test_sets/9a4ce358-b8f9-11e2-891d-52540035b04c

Comment by Jian Yu [ 20/May/13 ]

Lustre Tag: v2_4_0_RC1
Lustre Build: http://build.whamcloud.com/job/lustre-master/1501/
Distro/Arch: RHEL6.4/x86_64
Test Group: failover

The issue occurred again:
https://maloo.whamcloud.com/test_sets/d437e18c-c00e-11e2-8398-52540035b04c

Comment by Jian Yu [ 28/May/13 ]

Lustre Tag: v2_4_0_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/12/
Distro/Arch: RHEL6.4/x86_64
Test Group: failover

The issue occurred again:
https://maloo.whamcloud.com/test_sets/a56bd9a8-c6c6-11e2-ae4e-52540035b04c

Comment by Jian Yu [ 12/Aug/13 ]

More instances on Lustre b2_4 branch:
https://maloo.whamcloud.com/test_sets/752aa308-02dc-11e3-a4b4-52540035b04c
https://maloo.whamcloud.com/test_sets/32e7db40-fd15-11e2-b90c-52540035b04c

Comment by Sarah Liu [ 09/Sep/13 ]

Hit the same issue in the failover test on lustre-master build #1652:

https://maloo.whamcloud.com/test_sets/a6ff70cc-178e-11e3-a71f-52540035b04c

Comment by Jian Yu [ 09/Sep/13 ]

Lustre Tag: v2_4_1_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/45/
Distro/Arch: RHEL6.4/x86_64
Test Group: failover

The failure occurred again:
https://maloo.whamcloud.com/test_sets/86fe6484-194d-11e3-bb73-52540035b04c

Comment by Sarah Liu [ 20/Sep/13 ]

Also hit the error on the master branch, in the failover test with a SLES11 SP2 client:
https://maloo.whamcloud.com/test_sets/ce2321ee-1e0f-11e3-b3c9-52540035b04c

Comment by Jian Yu [ 26/Jan/14 ]

More instances on Lustre b2_5 branch:
https://maloo.whamcloud.com/test_sets/24901c48-8644-11e3-9f3f-52540035b04c
https://maloo.whamcloud.com/test_sets/4cef0ef4-8f33-11e3-b8e1-52540035b04c

Comment by Sarah Liu [ 12/Feb/14 ]

Also seen in the master failover test:

https://maloo.whamcloud.com/test_sets/cdce9c50-9088-11e3-91ee-52540035b04c

Server and client: lustre-master build #1877, RHEL6, ldiskfs

Generated at Sat Feb 10 01:31:21 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.