[LU-12730] sanity test_807: f807.sanity expected blocks: 4103, got: 2051 - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.14.0
Affects Version/s: None
Labels:
None

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run:
https://testing.whamcloud.com/test_sets/b6d78184-c95c-11e9-97d5-52540065bddc

test_807 failed with the following error:

/mnt/lustre/f807.sanity expected blocks: 4103, got: 2051

This only started failing on 2019-08-27 (10 of 661 runs as of today), so it is likely due to one of the 14 patches landed on that day, but it is not obvious which one might have caused the problem. There may be a race condition introduced by one of them, if the write/close from the client does not complete before the SOM update is run. It seems possible that the client has not flushed the data before the file is closed, or possibly the files are not being processed by llsom_sync at all because they are not yet REC_MIN_AGE=600s old and not enough FIDs are cached. Adding llsom_sync -a5 to the test might avoid this?
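To illustrate the "client has not flushed the data" hypothesis, here is a minimal sketch in generic POSIX terms. It is not the Lustre test and not the patch that later landed; the helper name `write_and_stat` and the file name `f807.example` are made up for illustration. It only shows the pattern of forcing data out with fsync before reading the block count from stat, since st_blocks read before a flush may not yet reflect all written data on filesystems that allocate lazily.

```python
import os
import tempfile


def write_and_stat(path, payload, do_sync):
    """Write payload to path, optionally fsync it, then return (size, blocks).

    Hypothetical helper for illustration only; st_blocks is in 512-byte
    units and is not available on Windows.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if do_sync:
            # Force dirty data to stable storage before the file is closed,
            # so a later stat sees the final allocation.
            os.fsync(fd)
    finally:
        os.close(fd)
    st = os.stat(path)
    return st.st_size, st.st_blocks


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "f807.example")
    size, blocks = write_and_stat(path, b"x" * (1 << 20), do_sync=True)
    print(size, blocks)
```

The size is deterministic, but the block count depends on the underlying filesystem, so the sketch prints it rather than asserting a specific value.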

Note that there are also a large number of tests failing with "/mnt/lustre/f807.sanity expected size: , got: 2097152", but those are only for patch https://review.whamcloud.com/35977 "LU-10467 lustre: use wait_event_idle_timeout() as appropriate" and child patches, so that is caused by a regression in that patch.

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_807 - /mnt/lustre/f807.sanity expected blocks: 4103, got: 2051


Activity

Peter Jones added a comment - 01/May/20 4:59 AM

Landed for 2.14

Gerrit Updater added a comment - 01/May/20 4:26 AM

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37146/
Subject: LU-12730 tests: sync file before checking LSOM
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 391f5ea858aebbee2ae1beacfb89a1b2a761e9d6

Chris Horn added a comment - 24/Apr/20 4:26 PM

+1 on master https://testing.whamcloud.com/test_sessions/38d6579b-57e2-43f6-826d-fa87039e565d

Gerrit Updater added a comment - 06/Jan/20 4:29 PM

Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/37146
Subject: LU-12730 tests: sync file before checking LSOM
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4463d610a5c56ac3d2b7828370a3da5184e2c37f

Sebastien Buisson added a comment - 20/Dec/19 4:01 PM

Patch https://review.whamcloud.com/36990 hits this failure a lot, but I cannot really see the relationship between the patch and the purpose of the test.

In fact, looking closely at the test, I do not really get why expected blocks is the value read from 'stat -c %b file'. The test writes bs from each client, so the expected value is simply bs times the number of clients.
If the number of blocks returned by stat is something else, maybe this is a different issue than what the test is supposed to exercise?
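As a rough check on the two numbers in the failure message, the arithmetic can be sketched as below. This assumes 512-byte units for the block count (as `stat -c %b` normally reports on Linux) and a 1 MiB write per client; the write size is an inference from the 2097152-byte file size quoted in the description, not something confirmed in this ticket. The function name `describe` is made up for the example.

```python
SECTOR = 512  # assumed unit of the block count, in bytes


def describe(blocks, write_size=1 << 20):
    """Split a block count into whole writes of write_size plus leftover blocks.

    write_size defaults to 1 MiB, which is an assumption for this example.
    """
    data_bytes = blocks * SECTOR
    full_writes, extra_bytes = divmod(data_bytes, write_size)
    return full_writes, extra_bytes // SECTOR


for label, blocks in (("expected", 4103), ("got", 2051)):
    writes, extra = describe(blocks)
    print(f"{label}: {blocks} blocks = {writes} x 1 MiB + {extra} blocks")

# expected: 4103 blocks = 2 x 1 MiB + 7 blocks
# got: 2051 blocks = 1 x 1 MiB + 3 blocks
```

Under those assumptions, the value that was "got" accounts for only one of the two writes, which would be consistent with a count taken before all of the data was accounted for. The few extra blocks beyond the data are not explained in this ticket.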
