[LU-2098] still haven't managed to acquire quota space from the quota master after 10 retries (err=0, rc=0) Created: 05/Oct/12  Updated: 24/Feb/13  Resolved: 24/Feb/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.2
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: adam contois (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS 5.7


Severity: 3
Rank (Obsolete): 4382

 Description   

***This is for WhamCloud Level 3 support purchased through Penguin Computing.

We have a Lustre 2.1.2 file system that has quotas. Previously, these quotas were not set to start by default and required us to run "lfs quotaon" to activate them after every downtime. To change this, I ran the command "tunefs.lustre --param ost.quota_type=u" on all of our OSTs, and "tunefs.lustre --param mdd.quota_type=u" on our MDT during our most recent downtime. The next time we mounted the file system, Lustre quotas were properly started and our previously defined quotas were still available.

However, since the downtime we have been seeing these messages in the /var/log/messages of the OSSes:
Oct 3 10:28:44 oss01 kernel: Lustre: 18537:0:(quota_interface.c:532:quota_chk_acq_common()) still haven't managed to acquire quota space from the quota master after 10 retries (err=0, rc=0)
Oct 3 10:29:15 oss01 kernel: Lustre: 24295:0:(quota_interface.c:532:quota_chk_acq_common()) still haven't managed to acquire quota space from the quota master after 10 retries (err=0, rc=0)
Oct 3 10:29:49 oss01 kernel: Lustre: 18554:0:(quota_interface.c:532:quota_chk_acq_common()) still haven't managed to acquire quota space from the quota master after 10 retries (err=0, rc=0)
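
To gauge how often each OSS logs this warning, the messages can be tallied per host. The following is a minimal sketch, assuming the syslog layout of the excerpts above; the function name and the regular expression are illustrative and not part of Lustre.

```python
import re
from collections import Counter

# Pattern assumed from the syslog excerpts in this ticket:
# "<date> <host> kernel: Lustre: <pid>:0:(quota_interface.c:<line>:quota_chk_acq_common()) still haven't ..."
QUOTA_WARNING = re.compile(
    r"(?P<host>\S+) kernel: Lustre: (?P<pid>\d+):\d+:"
    r"\(quota_interface\.c:\d+:quota_chk_acq_common\(\)\) "
    r"still haven't managed to acquire quota space"
)


def count_quota_warnings(log_text):
    """Return a Counter mapping host name to the number of quota warnings."""
    # Collapse all whitespace so entries wrapped across several lines still match.
    flat = " ".join(log_text.split())
    return Counter(m.group("host") for m in QUOTA_WARNING.finditer(flat))


if __name__ == "__main__":
    sample = (
        "Oct 3 10:28:44 oss01 kernel: Lustre:\n"
        "18537:0:(quota_interface.c:532:quota_chk_acq_common()) still haven't\n"
        "managed to acquire quota space from the quota master after 10 retries\n"
        "(err=0, rc=0)\n"
    )
    for host, count in count_quota_warnings(sample).items():
        print(host, count)
```

In practice the input would be the contents of /var/log/messages from each OSS; a steadily growing count on one host would suggest the retries are ongoing rather than a one-off after the remount.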

These messages don't appear to affect the system's ability to work with Lustre quotas, but they did not appear in the logs until after this change to the parameters of the OSTs. Additionally, nothing appears on the MDS's syslog related to anything wrong with the "quota master" or anything involving user quotas.

Any suggestions that you could give to correct this problem would be appreciated and please let me know if you need any additional information.



 Comments   
Comment by Peter Jones [ 05/Oct/12 ]

Niu

Could you please assist with this ticket?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 07/Oct/12 ]

We saw this problem in LU-1438 and eventually found that the root cause is LU-1720. A patch is available for b2_1: http://review.whamcloud.com/#change,3600. It has not landed yet because it needs to be refreshed to pass the Maloo tests.

I suspect this one is also caused by LU-1720. You can verify that by setting a quota limit over 4TB (for each single OST); if that fails (as in LU-1720), then it is the same problem.
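
As a rough aid for that check, the sketch below builds an `lfs setquota` command line with a block hard limit just above 4TB. It assumes that `-B` takes the limit in kilobytes and that 4TB is meant as 4 TiB; the user name, mount point and helper name are hypothetical. It only prints the command and does not run it, and it does not address how to make a single OST's share exceed 4TB, which is left to the operator.

```python
KIB_PER_TIB = 1024 ** 3  # number of KiB in one TiB


def setquota_command(user, limit_tib, mount_point):
    """Build an `lfs setquota` argument list with a block hard limit in kilobytes."""
    # Assumption: -B expects the block hard limit in kilobytes.
    limit_kb = int(limit_tib * KIB_PER_TIB)
    return ["lfs", "setquota", "-u", user, "-B", str(limit_kb), mount_point]


if __name__ == "__main__":
    # Hypothetical user and mount point; 5 TiB is just an example value above 4 TiB.
    print(" ".join(setquota_command("testuser", 5, "/mnt/lustre")))
```

The printed command can be reviewed and then run by hand on a client; the resulting limits can be checked afterwards with `lfs quota -u testuser /mnt/lustre`.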

Comment by Peter Jones [ 24/Feb/13 ]

This is suspected to be a duplicate of LU-1720. The mentioned fix was included in the 2.1.4 release. If this issue still occurs when running that release, please speak up and we will reopen this ticket.
