[LU-12868] sanity-quota test 65 100% failure Created: 16/Oct/19  Updated: 18/Oct/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.13.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Oleg Drokin
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12694 make "lfs quota" display correct grou... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

In my testing, sanity-quota test 65 has been failing 100% of the time lately.

== sanity-quota test 65: Check lfs quota result ====================================================== 22:37:17 (1571193437)
Waiting for local destroys to complete
Creating test directory
fail_val=0
fail_loc=0
debug=+quota
debug=+quota
Write...
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d65.sanity-quota/f65.sanity-quota-0] [count=10]
dd: error writing '/mnt/lustre/d65.sanity-quota/f65.sanity-quota-0': Disk quota exceeded
8+0 records in
7+0 records out
7340032 bytes (7.3 MB) copied, 0.0647576 s, 113 MB/s
 sanity-quota test_65: @@@@@@ FAIL: failed to write 

sample link:
http://testing.linuxhacker.ru:3333/lustre-reports/3717/testresults/sanity-quota-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/

I suspect we forget to clear the quota somewhere?

I ran some queries, and this test has actually never passed in my testing (I apparently missed that, even though there was a clear warning from the testing environment).



 Comments   
Comment by Gerrit Updater [ 16/Oct/19 ]

Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36462
Subject: LU-12868 test a theory
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 023e4ee4512d3c159898bf2e8cc02263ae320c7b

Comment by Oleg Drokin [ 16/Oct/19 ]

Hm, OK, so the test patch shows the quota is cleared, which makes the whole failure somewhat mysterious.

Hongchao, can you please take a look?

http://testing.linuxhacker.ru:3333/lustre-reports/3721/testresults/sanity-quota-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/ is probably a good one to look at.

Comment by Hongchao Zhang [ 17/Oct/19 ]

Sorry, I can't open the URL; it shows "Unable to round-trip http request to upstream: dial tcp 73.108.203.87:3333: i/o timeout".

The quota is cleared at the end of each test by "cleanup_quota_test", so no quota limits should remain on any ID.
A "Disk quota exceeded" error means a quota limit is set for the user "60000". Could you please run your patch "36462"
in your test environment to print the quota settings? Thanks!
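
One way to print the quota settings for the test ID is a small wrapper around "lfs quota". This is only a sketch under assumptions: the function name print_quota_for_id is hypothetical, the IDs and mount point are taken from the log in the description, and where to call it inside the test is left open.

```shell
#!/bin/bash
# Hypothetical helper: dump quota limits and usage for the test user and group.
# LFS can be overridden to point at a specific lfs binary (assumption: plain
# "lfs" from PATH is used otherwise).
LFS=${LFS:-lfs}

print_quota_for_id() {
	local uid=$1 gid=$2 mnt=$3

	# -v also lists the per-target values, not just the filesystem totals
	$LFS quota -v -u "$uid" "$mnt"
	$LFS quota -v -g "$gid" "$mnt"
}

# Example call with the IDs and mount point from the failing run:
# print_quota_for_id 60000 60000 /mnt/lustre
```

If any block or inode limit shows up as non-zero right before the dd, that would point at a limit left behind rather than at the write path.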

By the way, I searched Maloo and found no such error there.

Comment by Oleg Drokin [ 18/Oct/19 ]

Ah, yes, I heard that my node is somehow not accessible from China. I wonder how best to get you debug logs from a test run? The issue only happens on my test systems, not on Maloo, possibly because I have an exclusion list: I always exclude these sanity-quota tests: 2, 4a and 63.
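
To check whether the exclusion list matters, the two runs could be compared side by side. The sketch below only prints the command lines (a dry run). It assumes the test-framework EXCEPT variable is how the tests get excluded and that sanity-quota.sh sits at the usual checkout-relative path; the helper name repro_commands is hypothetical, and the actual setup may differ.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the two command lines to compare.
# SCRIPT is an assumed checkout-relative path; adjust for the real environment.
SCRIPT=${SCRIPT:-lustre/tests/sanity-quota.sh}

repro_commands() {
	local excluded=$1

	# Run A: same exclusions as the failing environment
	echo "EXCEPT=\"$excluded\" bash $SCRIPT"
	# Run B: no extra exclusions
	echo "bash $SCRIPT"
}

repro_commands "2 4a 63"
```

If test 65 passes in run B but fails in run A, one of the skipped tests is presumably leaving behind state that test 65 relies on.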

Generated at Sat Feb 10 02:56:21 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.