[LU-12829] sanity-quota test 61 fails with 'write succeed, expect EDQUOT' Created: 01/Oct/19  Updated: 18/Dec/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0, Lustre 2.12.3, Lustre 2.14.0, Lustre 2.12.4
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-14299 sanity-quota test 61 fails with 'writ... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

sanity-quota test_61 fails with 'write succeed, expect EDQUOT'. The test appears to fail once or twice a month on master and/or b2_12, and the failures started in March of this year or earlier.

sanity-quota test 61 fails in the following call to the test_default_quota() routine:

3274         test_default_quota "-g" "data"

The error is raised by the following code in the test_default_quota() routine:

3192         log "Test out of quota"
3193         # flush cache, ensure noquota flag is set on client
3194         cancel_lru_locks osc
3195         cancel_lru_locks mdc
3196         sync; sync_all_data || true
3197         if [ $qpool == "data" ]; then
3198                 $RUNAS $DD of=$TESTFILE count=$((LIMIT*2 >> 10)) oflag=sync &&
3199                         quota_error $qtype $qid "write succeed, expect EDQUOT"
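
For reference, a minimal sketch of the arithmetic behind the dd count in the out-of-quota write. It assumes LIMIT is expressed in KB and equals the 20480 blimit reported in the test log; that value is inferred from the log, not taken from the script. The variable written_kb is introduced here for illustration only.

```shell
# Assumed value: LIMIT is in KB and matches the blimit (20480) in the test log.
LIMIT=20480

# Same expression the test passes to dd as count= (the log shows dd using bs=1M).
# Multiplication binds tighter than the shift, so this is (LIMIT * 2) >> 10.
count=$((LIMIT * 2 >> 10))

# Total size of the write in KB: count blocks of 1 MB each.
written_kb=$((count * 1024))

echo "count=${count} written=${written_kb}KB limit=${LIMIT}KB"
```

Under that assumption the write is 40 blocks of 1 MB, twice the block limit, which matches the count=40 seen in the failing run and is why the test expects dd to return EDQUOT.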

The client test_log for the failure at https://testing.whamcloud.com/test_sets/1e9bb45a-d2ce-11e9-a25b-52540065bddc shows:

set to use default quota
set default quota
get default quota
Disk default grp quota:
     Filesystem   bquota  blimit  bgrace   iquota  ilimit  igrace
    /mnt/lustre  20480   20480       0      0       0  604800
Test not out of quota
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d61.sanity-quota/f61.sanity-quota-0] [count=10] [oflag=sync]
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 3.08127 s, 3.4 MB/s
Test out of quota
CMD: trevis-12vm4,trevis-12vm5 lctl set_param -n os[cd]*.*MDT*.force_sync=1
CMD: trevis-12vm3 lctl set_param -n osd*.*OS*.force_sync=1
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d61.sanity-quota/f61.sanity-quota-0] [count=40] [oflag=sync]
40+0 records in
40+0 records out
41943040 bytes (42 MB) copied, 10.5095 s, 4.0 MB/s
 sanity-quota test_61: @@@@@@ FAIL: write succeed, expect EDQUOT 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:6115:error()
  = /usr/lib64/lustre/tests/sanity-quota.sh:152:quota_error()
  = /usr/lib64/lustre/tests/sanity-quota.sh:3199:test_default_quota()
  = /usr/lib64/lustre/tests/sanity-quota.sh:3274:test_61()

Looking at the console logs, there is nothing obviously wrong and no error messages.

Logs for past failures are at
https://testing.whamcloud.com/test_sets/c4e0eb7a-e019-11e9-9874-52540065bddc
https://testing.whamcloud.com/test_sets/f302b6ae-d548-11e9-90ad-52540065bddc
https://testing.whamcloud.com/test_sets/32fd1f46-e2d3-11e9-a197-52540065bddc



 Comments   
Comment by Jian Yu [ 10/May/20 ]

+1 on master branch:
https://testing.whamcloud.com/test_sets/bb3e6a2c-aff3-4813-b0c9-2b019afb6233

Comment by Andreas Dilger [ 27/Nov/20 ]

I was looking at stats for this test because of patch https://review.whamcloud.com/39873 "LU-13952 quota: default OST Pool Quotas". It has failed 24/209 = 11% of runs in the full session over the past 4 weeks. That count excludes the many review runs, which skip the test because it is listed in EXCEPT_SLOW.
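
As a quick check of the failure rate quoted above, here is a small sketch using shell integer arithmetic. The variable names are illustrative only.

```shell
# Counts quoted in the comment above: failed runs and total runs in 4 weeks.
failed=24
total=209

# Integer percentage rounded to the nearest whole number:
# adding total/2 before dividing rounds instead of truncating.
rate=$(( (failed * 100 + total / 2) / total ))

echo "${rate}%"
```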

Comment by Gerrit Updater [ 27/Nov/20 ]

Sergey Cheremencev (sergey.cheremencev@hpe.com) uploaded a new patch: https://review.whamcloud.com/40784
Subject: LU-12829 tests: check dflt quota with dirty_mb 1M
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4cfde292838c0eb14754bbdd89f7d4671d0cd9fb
