[LU-6562] sanity-scrub looks for wrong parameters in osd-ldiskfs.*MDT*.oi_scrub Created: 04/May/15  Updated: 17/Dec/15  Resolved: 17/Dec/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: James Nunez (Inactive)
Resolution: Duplicate Votes: 0
Labels: patch, tests

Issue Links:
Blocker
is blocked by LU-6861 sanity-scrub test 4a and 4b fail: Aut... Resolved
Related
is related to LU-6861 sanity-scrub test 4a and 4b fail: Aut... Resolved
Severity: 3

 Description   

In sanity-scrub, tests 4a, 4b, and 4c look for ‘sf_items_updated_prior’ in osd-ldiskfs.*.oi_scrub on each MDT, but that output contains no ‘sf_items_updated_prior’. These tests should read the value of ‘prior_updated’ instead.

I will upload a patch for this.
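
The lookup described above could be sketched roughly as follows. This is an illustration only, not the code in sanity-scrub.sh: the helper name get_scrub_field and the sample text are assumptions made up for this example. On a real MDS the input would presumably come from something like `lctl get_param -n osd-ldiskfs.*.oi_scrub` rather than a hard-coded string.

```shell
# Hypothetical helper (not from sanity-scrub.sh): read "key: value" lines on
# stdin and print the value for the key given as $1.
get_scrub_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Illustrative sample only; not captured from a real MDT.
sample_output="updated: 0
prior_updated: 3
success_count: 1"

# The old name matches nothing, so the result is empty.
old=$(printf '%s\n' "$sample_output" | get_scrub_field sf_items_updated_prior)
# The name proposed in this ticket matches one line.
new=$(printf '%s\n' "$sample_output" | get_scrub_field prior_updated)

echo "sf_items_updated_prior=[$old] prior_updated=[$new]"
```

An empty result for the old name is the kind of silent mismatch this ticket describes: the lookup does not error out, it just returns nothing for the test to compare.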



 Comments   
Comment by Gerrit Updater [ 04/May/15 ]

James Nunez (james.a.nunez@intel.com) uploaded a new patch: http://review.whamcloud.com/14660
Subject: LU-6562 tests: sanity-scrub parse 'prior_updated'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6ffdef88bb57e2e26847e317504dc4f310d744dc

Comment by James Nunez (Inactive) [ 12/May/15 ]

Now that I've changed tests 4b and 4c to look for "prior_updated" in the osd-ldiskfs.*.oi_scrub parameters, these tests reliably fail in a DNE setup. I can reproduce this on my cluster with 2 MDSs with 2 MDTs each and in our autotest VMs:
https://testing.hpdd.intel.com/test_sets/ae20b066-f2d2-11e4-aad2-5254006e85c2

The failure is:

FAIL: (12) Auto trigger full scrub unexpectedly

I've printed the values of prior_updated before (updated0) and after (updated1) the scrub_check_* routines and, for 4b, I see:

updated0[1] = 3
updated0[2] = 0
updated0[3] = 0
updated0[4] = 0
updated1[1] = 4
updated1[2] = 0
sanity-scrub test_4b: @@@@@@ FAIL: (12) Auto trigger full scrub unexpectedly 

The test that fails is the check that updated0[i] is strictly less than updated1[i].
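
A minimal sketch of that strict comparison, using the values printed above. The function name first_failing_mdt and the space-separated list layout are assumptions for illustration and are not how sanity-scrub.sh stores these values; only the two updated1 entries that were actually printed are included.

```shell
# Values copied from the debug output above, in MDT index order (1, 2, ...).
updated0="3 0 0 0"
updated1="4 0"

# Hypothetical helper: print the first index where the "before" value is not
# strictly less than the "after" value, and return non-zero in that case.
first_failing_mdt() {
    i=0
    for after in $2; do
        i=$((i + 1))
        before=$(echo "$1" | cut -d' ' -f"$i")
        if [ "$before" -ge "$after" ]; then
            echo "$i"
            return 1
        fi
    done
    return 0
}

failed=$(first_failing_mdt "$updated0" "$updated1") ||
    echo "index $failed is not strictly increasing"
```

With these numbers the first index passes (3 < 4) and the second does not (0 is not less than 0), which matches the point where the printed output stops.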

Comment by James Nunez (Inactive) [ 27/May/15 ]

Di or Fan Yong,
Should this test be changed to look for full scrub on only the first MDT or should the check be changed from less than to less than or equal to? Or is there an actual error and the test should remain as it is?

Thank you.

Comment by nasf (Inactive) [ 29/May/15 ]

updated0[1] = 3
updated0[2] = 0
updated0[3] = 0
updated0[4] = 0
updated1[1] = 4
updated1[2] = 0
sanity-scrub test_4b: @@@@@@ FAIL: (12) Auto trigger full scrub unexpectedly

That is unexpected. For test_4b, OI scrub should update entries on every MDT, so updated0[x] should not be zero. Your script change replacing "sf_items_updated_prior" with "prior_updated" is right. The check failure needs more investigation.

Comment by James Nunez (Inactive) [ 29/May/15 ]

Logs for a sanity-scrub test 4b and 4c failure are at: https://testing.hpdd.intel.com/test_sets/ae20b066-f2d2-11e4-aad2-5254006e85c2

Comment by James Nunez (Inactive) [ 16/Jul/15 ]

Opened LU-6861 to track the bug that this patch uncovered.

Comment by James Nunez (Inactive) [ 27/Oct/15 ]

Patch http://review.whamcloud.com/#/c/14660/ was abandoned because the modification was incorporated into patch http://review.whamcloud.com/#/c/16951/ for ticket LU-6861. This ticket can be closed when patch 16951 lands.

Comment by Andreas Dilger [ 17/Dec/15 ]

Closing this ticket since it will be fixed by the LU-6861 patch http://review.whamcloud.com/16951, and there isn't much value in keeping this ticket open anymore.
