[LU-13321] sanity: 160f failed "mds3: user cl6 index expected 0 + 2, but is 0" Created: 04/Mar/20  Updated: 12/Mar/21  Resolved: 03/Apr/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Maloo Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12506 Client unable to mount filesystem wit... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Lai Siyao <lai.siyao@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/ddd8bcd4-cb29-447c-8d67-c749d3fc38f1



 Comments   
Comment by Gerrit Updater [ 03/Apr/20 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38129
Subject: LU-13321 tests: force even DNE file distribution
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9b8d21cde5f674588eb9f92e14f980064a63841e

Comment by Andreas Dilger [ 03/Apr/20 ]

It looks like this was introduced by patch https://review.whamcloud.com/36775 "LU-11025 dne: introduce new directory hash type: 'crush'", which makes the directory distribution uneven and causes intermittent failures in sanity test_160f. About 15% (1/8) of the test runs are failing.

This was hit repeatedly during the development of that patch, but with enough retesting it was made to pass. It is still a cause of test failures for the other patches for the LU-11025 ticket, but should be fixed by the patch here.

Comment by Gerrit Updater [ 03/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38129/
Subject: LU-13321 tests: force even DNE file distribution
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 489afbe69d5b7a867e44f6c2364513e660cf862b

Comment by Andreas Dilger [ 17/Jul/20 ]

This may need to be replaced again with the "all_char" hash instead of "fnv_1a_64", since there are still errors when running on a large number of MDTs (e.g. 20+). The hash function is not perfectly uniform, only statistically so, which results in MDT imbalances when the stripe count is higher, e.g. from my 12-MDT test run for v6 of patch https://review.whamcloud.com/38058 "LU-12506 tests: clean up MDT name generation":
https://testing.whamcloud.com/test_sets/1e911540-e60d-4086-bba6-fa975354cb9d

mds4: user cl6 index expected 0 + 2, but is 0
mds6: user cl8 index expected 6 + 2, but is 6
mds2: user cl10 index expected 6 + 2, but is 6

The "all_char" hash is perfectly uniform (essentially round-robin) as long as the filenames are generated in a uniform pattern (e.g. a counter suffix).
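
To illustrate the difference, here is a rough Python sketch, not the Lustre code itself. It assumes that "all_char" sums the byte values of the filename modulo the MDT count and that "fnv_1a_64" is the textbook 64-bit FNV-1a hash reduced modulo the MDT count. The function names and the "f<counter>" filename pattern are hypothetical, chosen only for this example.

```python
from collections import Counter

# Standard 64-bit FNV-1a constants (textbook values, assumed to match Lustre).
FNV_OFFSET_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def all_char_index(name: str, mdt_count: int) -> int:
    """Assumed all_char behaviour: sum of name bytes modulo MDT count."""
    return sum(name.encode()) % mdt_count


def fnv_1a_64(data: bytes) -> int:
    """Textbook 64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def fnv_index(name: str, mdt_count: int) -> int:
    """Assumed fnv_1a_64 placement: hash of the name modulo MDT count."""
    return fnv_1a_64(name.encode()) % mdt_count


def distribution(index_fn, names, mdt_count):
    """Return the number of names that land on each MDT index."""
    counts = Counter(index_fn(n, mdt_count) for n in names)
    return [counts.get(i, 0) for i in range(mdt_count)]


if __name__ == "__main__":
    # Hypothetical counter-style names: f0, f1, ..., f7 across 4 MDTs.
    names = [f"f{i}" for i in range(8)]
    print("all_char :", distribution(all_char_index, names, 4))  # [2, 2, 2, 2]
    print("fnv_1a_64:", distribution(fnv_index, names, 4))
```

With counter-style names each successive name adds one to the byte sum, so the assumed all_char placement steps through the MDTs in order. The FNV-based placement is only balanced on average, so any single small batch of files can leave some MDTs with fewer entries than the test expects.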
