[LU-15740] runtests test_1: 'Space not all freed Created: 13/Apr/22  Updated: 06/Dec/23  Resolved: 19/Oct/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0, Lustre 2.15.4

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-17339 Interop runtests test_1: FAIL: Space ... Open
Related
is related to LU-12807 runtests test 1 fails with ''Space no... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Cliff White <cwhite@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/6363612b-d4a1-4780-9248-a54e1bf4bfea

test_1 failed with the following error:

'Space not all freed: now 8932kB, was 8788kB'

May be a repeat of LU-10106, LU-12807, LU-12579

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
runtests test_1 - 'Space not all freed: now 8932kB, was 8788kB'



 Comments   
Comment by Andreas Dilger [ 13/Apr/22 ]

It seems unlikely that this is a straight duplicate of LU-12807, since that ticket dealt with a difference of only 64KB, while the difference here is 144KB or more.

There have been 15 similar failures in the past 4 weeks, but the failure goes back a long time, and it could just be a slow bleed from 64KB to 144KB of space usage.
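
For reference, the 144KB figure follows from the two values in the failure message (8932kB now, 8788kB before). A minimal Python sketch of that arithmetic is below; the helper name `space_delta_kb` is hypothetical and is not part of the Lustre test framework.

```python
import re


def space_delta_kb(message: str) -> int:
    """Return (now - was) in kB from a 'Space not all freed' message.

    Hypothetical helper for illustration only; it parses the message
    format quoted in this ticket.
    """
    match = re.search(r"now (\d+)kB, was (\d+)kB", message)
    if match is None:
        raise ValueError("unrecognized message format")
    now_kb, was_kb = (int(group) for group in match.groups())
    return now_kb - was_kb


# Message text taken verbatim from the ticket description.
delta = space_delta_kb("Space not all freed: now 8932kB, was 8788kB")
print(delta)  # 144
```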

Comment by Gerrit Updater [ 13/Apr/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47065
Subject: LU-15740 tests: add more stats to runtests
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a9c290bc420028de274ccfc2c52363bde6208f49

Comment by Gerrit Updater [ 05/May/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47065/
Subject: LU-15740 tests: add more stats to runtests
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 156073a2914145e2b029a658aeec04a54c524e5b

Comment by Andreas Dilger [ 26/Mar/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/50419
Subject: LU-15740 tests: scale fs_log_size by OSTCOUNT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 56edc654cc035af4e5f60c821892b93655ed5c5e

Comment by Andreas Dilger [ 26/Mar/23 ]

In all of the cases where fs_log_size() is checked (runtests included), the space actually compared is based on OST space usage. However, it looks like runtests was failing because fs_log_size() scaled its returned value by the MDT count rather than the OST count. The main "leakage" of space on the OSTs appears to be due only to blocks being allocated to the object directories O/<seq>/dN. This was very obvious when runtests was run in isolation, where it failed repeatedly, whereas it passed if it was run after other tests.

Change fs_log_size() to scale the "slop" by OSTCOUNT rather than MDTCOUNT.
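
The effect of the change can be illustrated with a small Python sketch. This is not the actual fs_log_size() implementation (which lives in the Lustre test scripts and is not reproduced in this ticket); the function names, the per-target allowance of 32kB, and the target counts of 1 MDT and 8 OSTs are all assumptions chosen for illustration. Only the 144kB delta comes from the failure above, and the exact comparison used by runtests may differ.

```python
def log_slop_kb(per_target_kb: int, target_count: int) -> int:
    """Allowed leftover space: a per-target allowance times the target count."""
    return per_target_kb * target_count


def space_all_freed(before_kb: int, after_kb: int, slop_kb: int) -> bool:
    """True if the space still in use after cleanup is within the allowed slop."""
    return after_kb - before_kb <= slop_kb


# Hypothetical configuration, NOT taken from the ticket.
PER_TARGET_KB = 32
MDTCOUNT = 1
OSTCOUNT = 8

# Values from the failure message in this ticket (delta of 144kB).
before_kb, after_kb = 8788, 8932

# Scaling by MDT count gives a small allowance, so the check fails.
old_ok = space_all_freed(before_kb, after_kb, log_slop_kb(PER_TARGET_KB, MDTCOUNT))

# Scaling by OST count matches the OST-based usage being compared.
new_ok = space_all_freed(before_kb, after_kb, log_slop_kb(PER_TARGET_KB, OSTCOUNT))

print(old_ok, new_ok)
```

Under these assumed numbers the MDT-scaled allowance is smaller than the observed delta while the OST-scaled allowance is larger, which is the kind of mismatch the comment describes.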

Comment by Gerrit Updater [ 04/Apr/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50419/
Subject: LU-15740 tests: scale fs_log_size by OSTCOUNT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: fabec6f2cb39950a2f208567dac716e21880fa9f

Comment by Gerrit Updater [ 09/Jul/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51606
Subject: LU-15740 tests: scale fs_log_size by OSTCOUNT
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 3d2bcc99c9b3deb296f2659f65daedea710dec10

Comment by Gerrit Updater [ 19/Oct/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51606/
Subject: LU-15740 tests: scale fs_log_size by OSTCOUNT
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 34e1409cad412086d509349f64dc9d77911d2fb8

Generated at Sat Feb 10 03:20:53 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.