LFSCK phase II technical debts (LU-4701)

[LU-4763] sanity-scrub test 11: FAIL: (8) Expect 0 objects skipped on mds1, but got 14
Created: 13/Mar/14  Updated: 02/Sep/14  Resolved: 30/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.1, Lustre 2.4.3
Fix Version/s: Lustre 2.6.0

Type: Technical task
Priority: Blocker
Reporter: Jian Yu
Assignee: nasf (Inactive)
Resolution: Fixed
Votes: 0
Labels: mn4
Environment:

Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/41/ (2.5.1 RC3)
Distro/Arch: RHEL6.5/x86_64(server), SLES11SP3/x86_64(client)


Issue Links:
Related
is related to LU-4556 speed up sanity-lfsck and sanity-scru... Resolved
Rank (Obsolete): 13100

 Description   

sanity-scrub test 11 failed as follows:

CMD: client-12vm3 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 3 -r
Started LFSCK on the device lustre-MDT0000.
CMD: client-12vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: client-12vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: client-12vm3 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -r
Started LFSCK on the device lustre-MDT0000.
CMD: client-12vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: client-12vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
 sanity-scrub test_11: @@@@@@ FAIL: (8) Expect 0 objects skipped on mds1, but got 14 

Maloo report: https://maloo.whamcloud.com/test_sets/764fa658-aa3d-11e3-b4b1-52540035b04c



 Comments   
Comment by Jian Yu [ 13/Mar/14 ]

Here is a patch that tries to reproduce the failure on Lustre b2_5 build #41: http://review.whamcloud.com/9645

Comment by nasf (Inactive) [ 14/Mar/14 ]

I suspect that some new files were created while test_14 repeated the scrub. The test scripts should therefore be improved to check the total number of checked objects (files), so that the "noscrub" counts compared between the two OI scrub cycles are based on the same set of objects.
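
The check suggested above might look roughly like the following sketch. It is not the landed patch. It assumes the oi_scrub parameter prints one "name: value" counter per line and that the counters are called "checked" and "noscrub"; the helper names scrub_field and compare_scrub_cycles are hypothetical and introduced only for illustration.

```shell
# Hypothetical helper: read oi_scrub output on stdin and print the value
# of one "name: value" line. The line format is an assumption.
scrub_field() {
    awk -v key="$1" -F': *' '$1 == key { print $2; exit }'
}

# Hypothetical check: $1 and $2 hold the oi_scrub output captured after the
# first and second scrub cycle. The counter names "checked" and "noscrub"
# are assumptions.
# Returns 0 if the second cycle skipped nothing,
#         1 if objects were skipped although the object count is unchanged,
#         2 if the object count changed, so "noscrub" is not comparable.
compare_scrub_cycles() {
    csc_checked1=$(printf '%s\n' "$1" | scrub_field checked)
    csc_checked2=$(printf '%s\n' "$2" | scrub_field checked)
    csc_noscrub2=$(printf '%s\n' "$2" | scrub_field noscrub)

    if [ "$csc_checked1" != "$csc_checked2" ]; then
        echo "checked changed: $csc_checked1 -> $csc_checked2; noscrub not comparable"
        return 2
    fi
    if [ "$csc_noscrub2" != "0" ]; then
        echo "expect 0 objects skipped, but got $csc_noscrub2"
        return 1
    fi
    return 0
}
```

In a test script, the two arguments would presumably be captured from the command already shown in the log, lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub, once after each scrub cycle. Separating return code 2 from 1 lets a test tell "new objects appeared between cycles" apart from a genuine scrub problem.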

Comment by nasf (Inactive) [ 16/Mar/14 ]

I will enhance the test scripts as part of the task to optimise sanity-lfsck/sanity-scrub.

Comment by nasf (Inactive) [ 26/Mar/14 ]

The fix is contained in the patch:
http://review.whamcloud.com/#/c/9704/

Comment by nasf (Inactive) [ 30/Apr/14 ]

The patch has landed on master.

Comment by Andreas Dilger [ 18/Jun/14 ]

Fix was landed as part of LU-4556.

Comment by Jian Yu [ 02/Sep/14 ]

Lustre client build: https://build.hpdd.intel.com/job/lustre-b2_4/73 (2.4.3)
Lustre server build: https://build.hpdd.intel.com/job/lustre-b2_5/86 (2.5.3 RC1)

A similar failure occurred:

CMD: shadow-13vm12 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -r
Started LFSCK on the device lustre-MDT0000.
CMD: shadow-13vm12 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: shadow-13vm12 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: shadow-13vm12 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -r
Started LFSCK on the device lustre-MDT0000.
CMD: shadow-13vm12 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: shadow-13vm12 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
 sanity-scrub test_11: @@@@@@ FAIL: (8) Expect 0 objects skipped, but got 14 

Maloo report: https://testing.hpdd.intel.com/test_sets/1a2c41ce-31dd-11e4-bd90-5254006e85c2
