[LU-17046] sanity-quota test_1g: user write success, but expect EDQUOT
Created: 21/Aug/23  Updated: 30/Nov/23  Resolved: 30/Nov/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: James A Simmons
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-13917 sanity-quota test_1c: user write succ... Open
is related to LU-15257 sanity-quota: test_1g wait_delete_com... Open
is related to LU-13810 Check OST pool quota hard limit at fi... Resolved
is related to LU-14279 sanity-quota test_3b: write success, ... Resolved
is related to LU-17191 sanity-quota test_1b, 1d, 1f, 1i: FAI... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Feng Lei <flei@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/199b4874-fb26-4062-b506-70ca71268b02

test_1g failed with the following error:

user write success, but expect EDQUOT

Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/97057 - 4.18.0-477.15.1.el8_8.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/97057 - 4.18.0-477.15.1.el8_lustre.x86_64

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-quota test_1g - user write success, but expect EDQUOT



 Comments   
Comment by James A Simmons [ 27/Oct/23 ]

Note that this ticket was opened Aug 23 and the Xarray patch landed Sept 6, so the problem existed before the Xarray patch landed.

Comment by Gerrit Updater [ 05/Nov/23 ]

"James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52995
Subject: LU-17046 tests: increase sleep for sanity-quota 1g
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f3e49e5c530451426a0a2505b9cd16ba6662e4f4

Comment by James A Simmons [ 06/Nov/23 ]

Tracked down the commit that is causing this failure.

commit 5176c0494338de34be9b2bc35e55f91daaab67a6
("LU-13810 tests: increase limit for 1g")

Further testing suggests this bug existed before this patch, but the patch makes it much easier to trigger.

Comment by Sergey Cheremencev [ 08/Nov/23 ]

From https://testing.whamcloud.com/test_sets/199b4874-fb26-4062-b506-70ca71268b02

 [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d1g.sanity-quota/f1g.sanity-quota-0] [count=10] [seek=10]
dd: error writing '/mnt/lustre/d1g.sanity-quota/f1g.sanity-quota-0': Disk quota exceeded
8+0 records in
7+0 records out

Because of pre-acquire requests, the write gets EDQUOT early and only 7MB is written instead of 10MB. There is nothing in the logs about edquot changing from 1 to 0 (I believe that part of the logs is missing), but since only 17MB was written rather than 20MB, I think the OSTs could have sent release requests that caused the edquot flag to change from 1 to 0. As a result, when the test writes the remaining 2MB (there are 2 OSTs), the write doesn't fail because the EDQUOT flag is no longer set, and the test ends up writing just 19MB (17+2) instead of 22MB (20+2).
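
The sizes in the analysis above can be checked with a small Python sketch. This is only toy arithmetic, not Lustre code: the constant and function names are made up for illustration, the initial 10MB write is an assumption inferred from 17 = 10 + 7, and the edquot flag is reduced to a single boolean.

```python
# Toy arithmetic for the sizes quoted in the comment above (all values in MB).
# Not Lustre code; every name here is illustrative only.

FIRST_WRITE_MB = 10             # assumption: inferred from 17 = 10 + 7
SECOND_WRITE_REQUESTED_MB = 10  # dd bs=1M count=10 seek=10
SECOND_WRITE_ACTUAL_MB = 7      # "7+0 records out"
FINAL_WRITE_MB = 2              # "the rest 2MB" across the 2 OSTs


def total_mb(second_write_mb, final_write_mb=FINAL_WRITE_MB):
    """Total data written by the first, second, and final writes."""
    return FIRST_WRITE_MB + second_write_mb + final_write_mb


def final_write_result(edquot_flag):
    """Toy model: the final write fails only while the edquot flag is set."""
    return "EDQUOT" if edquot_flag else "success"


observed_before_final = FIRST_WRITE_MB + SECOND_WRITE_ACTUAL_MB
planned_before_final = FIRST_WRITE_MB + SECOND_WRITE_REQUESTED_MB

print("observed:", observed_before_final, "->", total_mb(SECOND_WRITE_ACTUAL_MB))
print("planned: ", planned_before_final, "->", total_mb(SECOND_WRITE_REQUESTED_MB))

# If release requests cleared the flag, the final write goes through.
print("final write with flag cleared:", final_write_result(edquot_flag=False))
```

With the flag cleared, the final write succeeds in this model, which matches the "user write success, but expect EDQUOT" failure in the description.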

Comment by Gerrit Updater [ 08/Nov/23 ]

"Sergey Cheremencev <scherementsev@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53038
Subject: LU-17046 tests: fix write success in 1g
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 072e51154c13e7abf035a75018e52dc94c04806c

Comment by Sergey Cheremencev [ 08/Nov/23 ]

There have been a lot of sanity-quota_1g failures lately. However, most of them have different causes. https://review.whamcloud.com/c/fs/lustre-release/+/53038 only fixes the issue from the description. Other failures could be fixed by https://review.whamcloud.com/c/fs/lustre-release/+/52713 (LU-17191). If sanity-quota_1g is still failing after both of these patches land, the new failures should be worked out in a separate ticket.

Comment by Gerrit Updater [ 30/Nov/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/53038/
Subject: LU-17046 tests: fix write success in 1g
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: de352465eb6a02aeb20357208c54e903585e12e3

Comment by James A Simmons [ 30/Nov/23 ]

Thank you Oleg.

Generated at Sat Feb 10 03:32:10 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.