[LU-13481] sanity test_33h: MDT index mismatch 5 times Created: 23/Apr/20  Updated: 30/Sep/22  Resolved: 19/Oct/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.14.0

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Lai Siyao
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-15692 performance regressions for files in ... Resolved
is related to LU-15720 imbalanced file creation in 'crush' s... Resolved
is related to LU-16198 sanity test_33hh: MDT index match 49/... Resolved
is related to LU-11025 DNE3: directory restripe Resolved
is related to LU-12867 DNE3: new DNE2 hash function to handl... Resolved
Severity: 3

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f4917f7e-6cba-48f7-8103-9fffc6e24f2b

test_33h failed with the following error:

striped dir -i1 -c4 -H crush /mnt/lustre/d33h.sanity
/mnt/lustre/d33h.sanity/.f33h.sanity.oggiyf MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.2Y94S3 MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.q540W9 MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.8559LC MDT index mismatch 2 != 3
/mnt/lustre/d33h.sanity/.f33h.sanity.ayofxy MDT index mismatch 2 != 0
 sanity test_33h: @@@@@@ FAIL: MDT index mismatch 5 times

This is an intermittent test failure (4 failures in 4 weeks). It may just be a statistical issue based on the distribution of the temp file names.
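To triage runs like this one, the mismatch lines can be tallied by MDT index. The following is a minimal sketch, not part of the Lustre test suite: the function name `parse_mismatches` and the regular expression are assumptions based only on the log format shown above. The log does not say which of the two numbers is the expected index, so the sketch calls them `left` and `right`.

```python
import re
from collections import Counter

# Assumed line format, taken from the failure output quoted in this ticket:
#   <path> MDT index mismatch <left> != <right>
MISMATCH_RE = re.compile(
    r"^(?P<path>\S+) MDT index mismatch (?P<left>\d+) != (?P<right>\d+)$"
)


def parse_mismatches(log_text):
    """Return a list of (path, left, right) for each per-file mismatch line."""
    results = []
    for line in log_text.splitlines():
        m = MISMATCH_RE.match(line.strip())
        if m:
            results.append((m["path"], int(m["left"]), int(m["right"])))
    return results


SAMPLE_LOG = """\
striped dir -i1 -c4 -H crush /mnt/lustre/d33h.sanity
/mnt/lustre/d33h.sanity/.f33h.sanity.oggiyf MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.2Y94S3 MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.q540W9 MDT index mismatch 2 != 1
/mnt/lustre/d33h.sanity/.f33h.sanity.8559LC MDT index mismatch 2 != 3
/mnt/lustre/d33h.sanity/.f33h.sanity.ayofxy MDT index mismatch 2 != 0
 sanity test_33h: @@@@@@ FAIL: MDT index mismatch 5 times
"""

mismatches = parse_mismatches(SAMPLE_LOG)
# Count how often each right-hand MDT index appears.
by_right_index = Counter(right for _, _, right in mismatches)
print(len(mismatches), dict(by_right_index))
```

The summary FAIL line is deliberately not matched, since it reports a count rather than a pair of indices.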

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_33h - MDT index mismatch 5 times



 Comments   
Comment by Lai Siyao [ 25/Apr/20 ]

There are 4 failures in 329 runs, a failure rate of about 1.2%.
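As a rough check of that figure, the sketch below computes the rate from the two counts above and adds a Wilson score interval to show how uncertain an estimate from only 4 failures is. The interval is an addition for illustration, not something computed in this ticket, and the helper names are hypothetical.

```python
from math import sqrt


def failure_rate(failures, runs):
    """Observed failure fraction."""
    return failures / runs


def wilson_interval(failures, runs, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = failures / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return center - half, center + half


rate = failure_rate(4, 329)
low, high = wilson_interval(4, 329)
print(f"rate={rate:.1%}  interval=({low:.2%}, {high:.2%})")
```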

Comment by Gerrit Updater [ 08/May/20 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38539
Subject: LU-13481 dne: improve temp file name check
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 26f5b4c262e53432544b3ccb7394e6d185cb9ac3

Comment by Gerrit Updater [ 20/May/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38539/
Subject: LU-13481 dne: improve temp file name check
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 51e57496838381b5d7fdecf228e042e9660c21b6

Comment by Peter Jones [ 20/May/20 ]

Landed for 2.14

Comment by Chris Horn [ 29/Jun/20 ]

Looks like I hit this on more recent master: https://testing.whamcloud.com/test_sets/6bac8393-5101-446d-b8eb-27b0c8b8bbac

Comment by Lai Siyao [ 30/Jun/20 ]

There were 3 failures in the last 4 weeks, but strangely they all happened on Jun 24. I'll see whether testing with more files lowers the failure rate.
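One way to reason about whether more files would help, under assumptions this ticket does not state: suppose each temp file independently lands on a mismatched MDT with some small probability `p`, and the test fails once the mismatch count reaches a threshold that scales with the number of files. The value of `p`, the file counts, and the thresholds below are all hypothetical and are not taken from the actual test_33h logic. Under that model the failure probability is a binomial tail, which shrinks as the file count grows while the threshold ratio stays fixed.

```python
from math import comb


def tail_prob(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k mismatches in n files."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers for illustration only.
p = 0.05  # assumed per-file mismatch probability
for n_files, threshold in [(10, 5), (100, 50), (1000, 500)]:
    print(n_files, threshold, tail_prob(n_files, p, threshold))
```

If the mismatches are not independent (for example, if they depend on a systematic property of the generated names), this model does not apply and more files alone may not reduce the failure rate.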

Comment by Gerrit Updater [ 30/Jun/20 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39219
Subject: LU-13481 test: run sanity 33h with more files
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f408355d6179a3cbef9013433b38ffc4b7feb0f9

Comment by James Nunez (Inactive) [ 21/Aug/20 ]

Reopening because we're still seeing this issue on master: https://testing.whamcloud.com/test_sets/ce1a5f02-1d88-4f87-a537-3402157ec709

Comment by Gerrit Updater [ 01/Sep/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39219/
Subject: LU-13481 test: run sanity 33h with more files
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f4d243ae8bf1715538d1186f5978412f68dd5af1

Comment by Peter Jones [ 19/Oct/20 ]

Seems to be fixed
