[LU-5247] Strange quota limits on OSTs Created: 24/Jun/14  Updated: 07/Aug/14  Resolved: 07/Aug/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.2
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Li Xi (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None

Attachments: Text File lustre4_lfs.quota_UID3018_20140626.log     Text File lustre4_lfs.quota_UID654.657_20140626.log     File lustre4_quota.debug.UID3018_20140626.tar.gz     Text File lustre5_lfs.quota_UID652_20140626.log     File lustre5_qmt_glb-usr.grp.tar.gz     File lustre5_quota.debug.UID652_20140626.tar.gz     File lustre5_quota.debug.UID652_20140628_1.tar.gz     File lustre5_quota.debug.UID652_OSS_20140626_2.tar.gz     Text File lustre5_quota.debug.UID652_other_20140626_2.tar.gz     File lustre5_quota_UID2165_20140707.tar.gz     File lustre5_quota_slave.info.tar.gz     File lustre5_quota_slave.limit.tar.gz     Zip Archive oss_quota_slave.zip     Text File qmt_dt-0x0_glb-grp.log     Text File qmt_dt-0x0_glb-usr.log     Text File qmt_dt-0x0_glb-usr.log    
Severity: 3
Rank (Obsolete): 14638

 Description   

We are seeing some strange quota limits on OSTs, as follows:

[root@nmds04 home]# lfs quota -u lect02 -v /root/lustre
Disk quotas for user lect02 (uid 3018):
Filesystem kbytes quota limit grace files quota limit grace
/root/lustre 16 0 1000000000 - 4 0 0 -
lustre4-MDT0000_UUID
4 - 0 - 4 - 0 -
lustre4-OST0000_UUID
0 - 165167900 - - - - -
lustre4-OST0001_UUID
0 - 63031212 - - - - -
lustre4-OST0002_UUID
0 - 53832788 - - - - -
...
lustre4-OST0020_UUID
4* - 4 - - - - -
lustre4-OST0021_UUID
0 - 91758376 - - - - -
...
[root@nmds04 home]# lfs quota -u lect02 -v /root/lustre/ | grep -A1 OST | grep -v OST | awk '{ SUM += $3 } END { print SUM }'
5572927984

First, some OSTs have much larger granted limits than their usage. Second, the sum of the limits on the OSTs exceeds the total limit. What is more, some users that do have quota limits cannot use more space unless quota is turned off manually.



 Comments   
Comment by Li Xi (Inactive) [ 24/Jun/14 ]

We are currently in a maintenance window and trying to fix or work around this problem immediately. Otherwise, disabling the quota feature would be the last resort.

We tried rebooting the OSTs and MDTs, running 'tune2fs -O ^quota/quota', and doing 'lctl conf_param $FSNAME.quota.ost=none; lctl conf_param $FSNAME.quota.ost=ug;'. None of those attempts succeeded.

I am wondering whether the files under the quota_slave directories are broken. Is it safe to remove all of those directories offline and then restart Lustre? I tried that in a test environment, and nothing bad happened. But we'd like to make sure it won't cause any data loss or break the system. Is there any other good idea about how to fix or work around this problem?

Thanks in advance!

Comment by Peter Jones [ 24/Jun/14 ]

Niu

Can you please advise?

Thanks

Peter

Comment by Johann Lombardi (Inactive) [ 24/Jun/14 ]

Could you please dump the limits on all the OSTs by running the following command?

# lctl get_param osd*.*.quota_slave.limit*

Please also dump the limits on the QMT so that we can compare them.

You could also try to force reintegration by running on each OSS:

# lctl set_param osd*.*.quota_slave.force_reint=1

As for removing the slave copies of the indexes on the OST, it was designed to work, but it does not seem to be tested in sanity-quota. Niu, could you please run additional tests to make sure it works well?
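For reference, one way to collect both sides in one pass (a sketch; it assumes pdsh is available and that $OSS_LIST/$MDS are set to the server hostnames):

# pdsh -w $OSS_LIST "lctl get_param osd*.*.quota_slave.limit*" > /tmp/ost_quota_limits.txt
# pdsh -w $MDS "lctl get_param qmt.*.*.glb*" > /tmp/qmt_quota_limits.txt
# pdsh -w $OSS_LIST "lctl set_param osd*.*.quota_slave.force_reint=1"

The first two dumps can then be compared per ID to spot slaves whose granted space exceeds the master's view.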

Comment by Li Xi (Inactive) [ 24/Jun/14 ]

Hi Johann,

Thanks for helping. I've attached the output of 'lctl get_param osd*.*.quota_slave.limit*'.

However, nothing changed after running 'lctl set_param osd*.*.quota_slave.force_reint=1' on the first OSS. The limits are still strange.

[root@nmds06 ~]# lfs quota -u toyasuda -v /root/lustre/
Disk quotas for user toyasuda (uid 652):
Filesystem kbytes quota limit grace files quota limit grace
/root/lustre/ 46120740 0 0 - 5076 0 0 -
lustre5-MDT0000_UUID
2692 - 0 - 5076 - 0 -
lustre5-OST0000_UUID
5164 - 4194304 - - - - -
lustre5-OST0001_UUID
118936* - 4 - - - - -
lustre5-OST0002_UUID
1093400 - 3670132 - - - - -
lustre5-OST0003_UUID
26120* - 4 - - - - -
lustre5-OST0004_UUID
3204* - 4 - - - - -
lustre5-OST0005_UUID
11041916* - 4194304 - - - - -
...

Comment by Zhenyu Xu [ 25/Jun/14 ]

The strange thing to me is why the uid 3018 quota info is not in qmt_dt-0x0_glb-usr.log.

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

Hi Zhenyu,
The reason is that these outputs were captured from different file systems. There are 3 file systems and all are showing the same problem.
lect02 (uid 3018) is on the lustre4 file system and qmt_dt-0x0_glb-usr.log was captured from the lustre5 file system. toyasuda (uid 652) is on lustre5.

Comment by Zhenyu Xu [ 25/Jun/14 ]

However, the uid 652 quota info does not appear in qmt_dt-0x0_glb-usr.log either.

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

I checked qmt_dt-0x0_glb-usr.log and the current quota_slave.limit on the QMT for lustre5. It looks like qmt_dt-0x0_glb-usr.log is not from lustre5, sorry.
I captured quota_slave.limit again from the QMT and the quota slaves for the lustre5 file system (lustre5_quota_slave.limit.tar.gz).
uid 652 does not appear in either log. uid 652 does not have any quota limit, which might be the reason why it does not appear, I think.

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

quota_slave.limit from QMT, quota_slave for lustre5 file system

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

Sorry, QMT log

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

Sorry for the repeated uploads; this one is the latest QMT log for lustre5 (captured at the same time as lustre5_quota_slave.limit.tar.gz). I uploaded a different file mistakenly...

Comment by Mitsuhiro Nishizawa [ 25/Jun/14 ]

Hello, can we expect an action plan (or confirmation that we can safely remove the slave copies) within a few hours? The system is currently in the maintenance window and we can do offline work, but it will be over at the end of today. The customer needs to decide whether to continue waiting for our response or to put the system back into service without quota. This is a difficult decision for them. Please share the latest information and expectations with us. Regards,

Comment by Zhenyu Xu [ 25/Jun/14 ]

Some IDs' granted values exceed their hard limits:

# grep id lustre5_qmt_dt-0x0_glb-usr.log | awk '{print $3}' > ids
# for i in `cat ids` ; do echo $i; grep "id:.*$i" -A1 *.limit | grep -v "id" | awk 'BEGIN {FS=",[ \t]*|[ \t]+"}  {SUM += $9} END { if ($5 < SUM) {print $5, SUM, ":granted over hard"}}'; done

0
1000000000 601956847608 :granted over hard
3348
1000000000 9773350268 :granted over hard
3094
5000000000 24406854848 :granted over hard
3352
4000000000 62904909164 :granted over hard
3118
51000000000 1156091716280 :granted over hard
3121
1000000000 47506492972 :granted over hard
3152
100000000000 1373056117076 :granted over hard
3159
11000000000 112035373580 :granted over hard
3161
15000000000 344219288276 :granted over hard
3163
50000000000 54010129436 :granted over hard
3426
1000000000 6235934912 :granted over hard
3173
1000000000 13841161496 :granted over hard
3433
1000000000 17108669108 :granted over hard
3442
1000000000 1271819600 :granted over hard
3190
10000000000 124199124628 :granted over hard
3448
1000000000 15784903460 :granted over hard
3488
1000000000 1091150888 :granted over hard
3256
1000000000 10888793852 :granted over hard
3016
1000000000 2800668840 :granted over hard
3272
1000000000 36381814148 :granted over hard
3039
1000000000 20626585832 :granted over hard
3043
21000000000 428560476316 :granted over hard
3054
1000000000 1088043300 :granted over hard

Comment by Zhenyu Xu [ 25/Jun/14 ]

I've tried backing up quota_slave and removing it; it works under the master branch, but not with the b2_4 code.

# umount /dev/sdd
# mount -t ldiskfs /dev/sdd /mnt/ost2
# mv /mnt/ost2/quota_slave/ /mnt/ost2/quota_slave.bak
# umount /dev/sdd
# mount -t lustre /dev/sdd /mnt/ost2
mount.lustre: mount /dev/sdd at /mnt/ost2 failed: File exists
#  git describe
v2_4_3_0-1-gd00e4d6

The OI knows that a slave index file exists in its OI mapping file, but it cannot find the file.

Li Xi said that he could remove quota_slave and mount successfully; I don't know how.

Comment by Li Xi (Inactive) [ 25/Jun/14 ]

Hi Zhenyu,

I removed the whole directory rather than renaming it. I guess that is the difference? I got a similar problem when renaming the file, so I removed it with 'rm -rf quota_slave'. BTW, I did the test on the master branch of Lustre rather than 2.4.x.

Comment by Zhenyu Xu [ 25/Jun/14 ]

I think 2.4.x does not work like master; the OI code on master is more mature.

Comment by Zhenyu Xu [ 25/Jun/14 ]

Would you mind setting the hard limit to 1MB for a UID, say 3348, then checking whether its granted value falls, and setting the limit back to check that the space got released? If you use a soft limit as well, decrease and restore it too.
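For example, something along these lines (a sketch; uid 3348 and its current 1000000000 KB hard limit are taken from the listing above, 1024 KB stands for the 1MB test limit, and the mount point is just an example):

# lfs setquota -u 3348 -B 1024 /root/lustre
# lfs quota -u 3348 -v /root/lustre         <- per-OST 'limit' values should shrink
# lfs setquota -u 3348 -B 1000000000 /root/lustre
# lfs quota -u 3348 -v /root/lustre         <- space should be granted back as needed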

Comment by Mitsuhiro Nishizawa [ 26/Jun/14 ]

First, I disabled quota on the systems since the behavior is not stable and the customer needs to bring the service back. However, the quota feature is required for them and I need to resolve this issue ASAP.

I tried to change the hard limit and found the following (I used admin users for the file systems).
For file system lustre4, the situation improved in the case of UID 657, but not for UID 3018.
UID 657 did not have a quota limit, but "limit" was shown in 'lfs quota'. When I set '-B 1000000000', 'limit' was set to the same value as 'kbytes' on all the OSTs. Then when I set it back to '-B 0', 'limit' was set to '0' (lustre4_lfs.quota_UID654.657_20140626.log). However, for UID 3018, 'limit' did not change even when I did setquota (lustre4_lfs.quota_UID3018_20140626.log).
On file system lustre5, I did the same thing for UID 652 and mostly observed the same behavior as in the case of UID 657 above, but the 'limit' value did not change on some of the OSTs (lustre5_lfs.quota_UID652_20140626.log). Those OSTs are lustre5-OST0000-OST0004 and are mounted on one of the OSSs.

I also captured debug logs (with +trace, +quota) when I did setquota for UID 3018 on file system lustre4 and for UID 652 on file system lustre5.
lustre4_quota.debug.UID3018_20140626.tar.gz
lustre5_quota.debug.UID652_20140626.tar.gz

Comment by Li Xi (Inactive) [ 26/Jun/14 ]

Hi Zhenyu,

You are right. When I removed the quota_slave directory and tried to mount the OSD again on Lustre 2.4.2, the following LBUG happened.

LDISKFS-fs (sdb3): mounted filesystem with ordered data mode. quota=on. Opts:
LustreError: 0-0: vm14-OST0000: trigger OI scrub by RPC for [0x200000005:0x1:0x0], rc = 0 [1]
LustreError: 5147:0:(qsd_lib.c:424:qsd_qtype_init()) vm14-OST0000: can't open slave index copy [0x200000006:0x20000:0x0] -115
LustreError: 5147:0:(obd_mount_server.c:1716:server_fill_super()) Unable to start targets: -115
Lustre: Failing over vm14-OST0000
LustreError: 5172:0:(osd_internal.h:752:osd_fid2oi()) ASSERTION( !fid_is_idif(fid) ) failed: [0x100000000:0x1:0x0]
LustreError: 5172:0:(osd_internal.h:752:osd_fid2oi()) LBUG
Pid: 5172, comm: OI_scrub

Call Trace:
[<ffffffffa0358895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0358e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0b660f5>] __osd_oi_lookup+0x3a5/0x3b0 [osd_ldiskfs]
[<ffffffff8119d13d>] ? generic_drop_inode+0x1d/0x80
[<ffffffffa0b66174>] osd_oi_lookup+0x74/0x140 [osd_ldiskfs]
[<ffffffffa0b7afbf>] osd_scrub_exec+0x1af/0xf30 [osd_ldiskfs]
[<ffffffffa0b7c5f2>] ? osd_scrub_next+0x142/0x4b0 [osd_ldiskfs]
[<ffffffffa02e331c>] ? ldiskfs_read_inode_bitmap+0x5c/0x2c0 [ldiskfs]
[<ffffffffa0b76d4f>] osd_inode_iteration+0x1cf/0x570 [osd_ldiskfs]
[<ffffffff81051439>] ? __wake_up_common+0x59/0x90
[<ffffffffa0b7ae10>] ? osd_scrub_exec+0x0/0xf30 [osd_ldiskfs]
[<ffffffffa0b7c4b0>] ? osd_scrub_next+0x0/0x4b0 [osd_ldiskfs]
[<ffffffffa0b7932a>] osd_scrub_main+0x59a/0xd00 [osd_ldiskfs]
[<ffffffff81071a6f>] ? release_task+0x33f/0x4b0
[<ffffffff810097cc>] ? __switch_to+0x1ac/0x320
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20

Kernel panic - not syncing: LBUG
Pid: 5172, comm: OI_scrub Not tainted 2.6.32 #2
Call Trace:
[<ffffffff8150bb0c>] ? panic+0xa7/0x16f
[<ffffffffa0358eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa0b660f5>] ? __osd_oi_lookup+0x3a5/0x3b0 [osd_ldiskfs]
[<ffffffff8119d13d>] ? generic_drop_inode+0x1d/0x80
[<ffffffffa0b66174>] ? osd_oi_lookup+0x74/0x140 [osd_ldiskfs]
[<ffffffffa0b7afbf>] ? osd_scrub_exec+0x1af/0xf30 [osd_ldiskfs]
[<ffffffffa0b7c5f2>] ? osd_scrub_next+0x142/0x4b0 [osd_ldiskfs]
[<ffffffffa02e331c>] ? ldiskfs_read_inode_bitmap+0x5c/0x2c0 [ldiskfs]
[<ffffffffa0b76d4f>] ? osd_inode_iteration+0x1cf/0x570 [osd_ldiskfs]
[<ffffffff81051439>] ? __wake_up_common+0x59/0x90
[<ffffffffa0b7ae10>] ? osd_scrub_exec+0x0/0xf30 [osd_ldiskfs]
[<ffffffffa0b7c4b0>] ? osd_scrub_next+0x0/0x4b0 [osd_ldiskfs]
[<ffffffffa0b7932a>] ? osd_scrub_main+0x59a/0xd00 [osd_ldiskfs]
[<ffffffff81071a6f>] ? release_task+0x33f/0x4b0
[<ffffffff810097cc>] ? __switch_to+0x1ac/0x320
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffff8100c0ca>] ? child_rip+0xa/0x20
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffffa0b78d90>] ? osd_scrub_main+0x0/0xd00 [osd_ldiskfs]
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20

Comment by Johann Lombardi (Inactive) [ 26/Jun/14 ]

For file system lustre4, the situation improved in the case of UID 657, but not for UID 3018.
UID 657 did not have a quota limit, but "limit" was shown in 'lfs quota'. When I set '-B 1000000000', 'limit' was set to the same value as 'kbytes' on all the OSTs. Then when I set it back to '-B 0', 'limit' was set to '0' (lustre4_lfs.quota_UID654.657_20140626.log). However, for UID 3018, 'limit' did not change even when I did setquota (lustre4_lfs.quota_UID3018_20140626.log).

When setting the limit to 0, the slaves should release all quota space unconditionally (see qsd_calc_adjust()). Could you please enable quota debug on one of the OSSs that did not release space, set the quota with -B 0, wait for a couple of seconds and then collect the logs on this OSS?
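In case it helps, the capture could look roughly like this (a sketch; uid 3018, the mount point and the output path are only examples):

On the OSS that kept the stale limit:
# lctl set_param debug=+quota
# lctl clear
From a client or the MDS:
# lfs setquota -u 3018 -B 0 /root/lustre
Wait a couple of seconds, then back on the OSS:
# lctl dk > /tmp/oss_quota_debug.log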

Comment by Mitsuhiro Nishizawa [ 26/Jun/14 ]

I did the same thing (setquota -B 0) and captured the OSS quota debug log.

Comment by Mitsuhiro Nishizawa [ 26/Jun/14 ]

Other logs captured at the same time.

Comment by Johann Lombardi (Inactive) [ 27/Jun/14 ]

I looked at the debug logs for lustre5 and unfortunately, I have traces of neither the setquota request in the MDS logs nor the glimpse in the OSS logs. Could you please collect those logs again and use "lctl mark" to mark the beginning and end of the setquota operation?
Also, could you please make sure that all OSTs have successfully connected to the master and have been resynchronized? This can be done by running the following command on all the OSSs:

# lctl get_param osd*.*.quota_slave.info

Thanks in advance.
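A possible sequence for the marked capture, to be run so that the marks land in both the MDS and OSS logs (a sketch; the marker text, uid and paths are arbitrary):

# lctl mark "SETQUOTA START"
# lfs setquota -u 652 -B 0 /root/lustre
# lctl mark "SETQUOTA END"
# lctl dk > /tmp/quota_debug.log
# lctl get_param osd*.*.quota_slave.info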

Comment by Mitsuhiro Nishizawa [ 28/Jun/14 ]

I checked quota_slave.info again, but the connection to the master was set up on all the OSTs (lustre5_quota_slave.info.tar.gz).
I also captured the log again with the markers '########## START SETQUOTA ##########' and '########## END SETQUOTA ##########'.
I did the following in the interval.

lfs quota -u 652 -v /root/lustre
lfs setquota -u 652 -B 1000000000 /root/lustre
lfs quota -u 652 -v /root/lustre
lfs setquota -u 652 -B 0 /root/lustre
lfs quota -u 652 -v /root/lustre

Comment by Zhenyu Xu [ 01/Jul/14 ]

lustre5_quota_slave.info shows that none of the OSTs have quota enabled.

nmds05_quota_slave.info:5:quota enabled: none
noss37_quota_slave.info:5:quota enabled: none
noss37_quota_slave.info:14:quota enabled: none
noss37_quota_slave.info:23:quota enabled: none
noss37_quota_slave.info:32:quota enabled: none
noss37_quota_slave.info:41:quota enabled: none
noss38_quota_slave.info:5:quota enabled: none
noss38_quota_slave.info:14:quota enabled: none
....

You need to enable quota for the test and collect the logs again.

Comment by Mitsuhiro Nishizawa [ 01/Jul/14 ]

Yes, as I stated before, we disabled quota as an interim remedy. We tried changing the quota settings and running force_reint while quota was enabled, but it did not work. Since we could not determine whether the quota-exceeded state on some OSTs was really effective or a false positive, we ended up disabling quota and putting the system back into service.
Can you please check whether the quota-exceeded mark "*" shown on some OSTs is really effective, or false (i.e. users can write to those OSTs beyond the limit)? If it is safe, we can enable quota again and capture the log.

When I changed the quota settings with 'lfs setquota' while quota was disabled, OSTs other than OST0000-OST0004 changed the 'limit' value in 'lfs quota'. What is the difference between these OSTs and the others where 'limit' did not change?

Can I also ask what the designed (expected) behavior is here in the first place, and what is not? When we change the quota settings while quota is disabled, what value should the 'limit' reported by 'lfs quota' be set to? Does the behavior change when we enable quota? How can we confirm that quota has up-to-date, correct information?

Comment by Johann Lombardi (Inactive) [ 01/Jul/14 ]

Yes, as I stated before, we disabled quota as an interim remedy. We tried changing the quota settings and running force_reint while quota was enabled, but it did not work. Since we could not determine whether the quota-exceeded state on some OSTs was really effective or a false positive, we ended up disabling quota and putting the system back into service.

Understood. That said, the procedure we gave you (i.e. set the quota limit to 0 to force all OSTs to release space) only works if quota has been enabled. Would it be possible to enable quota for a short amount of time and rerun those commands? Once completed, you can then disable quota again if it does not work.

Can you please check whether the quota-exceeded mark "*" shown on some OSTs is really effective, or false (i.e. users can write to those OSTs beyond the limit)?

It is effective.

If it is safe, we can enable quota again and capture the log.

The thing is that setting the hard limit to 0 is expected to fix the problem (provided that quotas are on), so no log capture would be required in this case.

When I changed the quota settings with 'lfs setquota' while quota was disabled, OSTs other than OST0000-OST0004 changed the 'limit' value in 'lfs quota'.

right.

What is the difference between these OSTs and the others where 'limit' did not change?

The OSTs that changed limits still hold a global quota lock, while OST0000-OST0004 do not. This quota lock isn't automatically dropped once quota is disabled and isn't enqueued if quota is disabled at mount time.

Can I also ask what the designed (expected) behavior is here in the first place, and what is not? When we change the quota settings while quota is disabled, what value should the 'limit' reported by 'lfs quota' be set to?

The quota master should report the right limit. As for slaves, it depends on whether they still hold a global quota lock. To sum up, when quota is disabled, you should not pay attention to the hard limit on OSTs.

Does the behavior change when we enable quota?

Yes. When quota is enabled, all quota slaves have to acquire a global quota lock.

How can we confirm that quota has up-to-date, correct information?

In quota_slave.info, the number between [] is 1 when the slave is synchronized with the master and 0 otherwise.
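As a quick way to check that field on every target, something like the following on each OSS should do (a sketch; it assumes the quota_slave.info output of this version prints 'user uptodate: glb[x],slv[x]'-style lines):

# lctl get_param osd*.*.quota_slave.info | grep -E 'quota enabled|uptodate'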

Comment by Johann Lombardi (Inactive) [ 02/Jul/14 ]

Any update?

Comment by Mitsuhiro Nishizawa [ 03/Jul/14 ]

I talked with the customer, but unfortunately just enabling quota while the system is in service is not acceptable for them. To do the test safely, setting '-B 0' for all the users and then enabling quota would be one option for them. Assuming we do the test, what is the expected result?

Currently,
1. The sum of the usage on each OST does not match the whole (reported in the 3rd line of 'lfs quota').
2. 'limit' in 'lfs quota' is set to (compared to the usage):

  • a much greater value (e.g. usage is 0, limit is 38205292)
  • a slightly greater value (e.g. usage is 337530440, limit is 404591056)
  • the same value, zero (e.g. usage is 0, limit is also 0)
  • the same value, but non-zero (e.g. usage is 4, limit is also 4); quota exceeded is set
  • a much smaller value (e.g. usage is 1387420, limit is 4); quota exceeded is set
  • a slightly smaller value (e.g. usage is 405435848, limit is 405418036); quota exceeded is set
3. For some users, 'limit' does not change even if we re-set the hard limit (these OSTs have 1 between [] in quota_slave.info).
4. In quota_slave.info, only OST0000-OST0004 do not have 1 between [].

What is the expected result (what it should be) for these items after setting quota limit to 0 to force all OSTs to release space?

Comment by Niu Yawei (Inactive) [ 03/Jul/14 ]

I talked with the customer, but unfortunately just enabling quota while the system is in service is not acceptable for them. To do the test safely, setting '-B 0' for all the users and then enabling quota would be one option for them. Assuming we do the test, what is the expected result?

The expected result is that all OSTs will release their limits to the master (which means all OSTs will have a 0 limit at the end). Please collect debug logs with D_QUOTA enabled when you enable quota (on both the MDT and the OSTs). Thank you.

Comment by Mitsuhiro Nishizawa [ 03/Jul/14 ]

We need to build a test plan for the action and so let me ask more.
What is expected after we set '-B 0' and then set it back to the current setting? As OSTs other than OST0000-OST0004 are synchronized with the master, can we expect the following on those synchronized OSTs?
1. same value in sum of usage on each OST and the whole (reported in 3rd line of 'lfs quota'), on file systems that do not have un-synchronized OSTs?
2. a bit greater value in 'limit'?
3. 'limit' will be updated with zero, and then to an adjusted value?
4. Will OST0000-OST0004 be synchronized with the master? Or will they not be, even after we set '-B 0' on all OSTs?
regards,

Comment by Johann Lombardi (Inactive) [ 03/Jul/14 ]

We need to build a test plan for the action and so let me ask more.

sure

What is expected after we set '-B 0' and then set it back to the current setting?

On -B 0, all slaves should release reserved quota space.

As OSTs other than OST0000-OST0004 are synchronized with the master,

None of the OSTs are strictly synchronized with the master. If you check quota_slave.info, you will see that all report "glb[0] slv[0]". Some of them just happen to still own a glb quota lock.

can we expect the following on those synchronized OSTs?
1. same value in sum of usage on each OST and the whole (reported in 3rd line of 'lfs quota'), on file systems that do not have un-synchronized OSTs?

Usage should always be consistent, regardless of the quota enforcement status.

2. a bit greater value in 'limit'?

Those ones are expected to release all the quota space.

3. 'limit' will be updated with zero, and then to an adjusted value?

Once quota is enabled, yes.

4. Will OST0000-OST0004 be synchronized with the master? Or will they not be, even after we set '-B 0' on all OSTs?

OST0000-OST0004 will synchronize with the master only once quota is enabled.

At this point, i would advise to proceed as follows:
1. keep quota disabled
2. dump & back up all quota (hard/soft/time) limits on the qmt. To do so, you can run the following command on the MDT and store the results:

# lctl get_param qmt.*.*.glb*

3. set all the limits to 0 with quota disabled
4. enable quota again
5. check that all slaves are properly synchronized (should be "glb[1] slv[1]" everywhere in quota_slave.info)
6. check with lfs quota that all slaves have released quota space
7. set the user/group limit one by one and check that none of the OSTs get crazy limits again

What do you think?
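If it helps, a rough command-level sketch of the steps above ($FSNAME and <uid> are placeholders, and the setquota line has to be repeated for every user/group that currently has a limit):

Step 2, on the MDS:
# lctl get_param qmt.*.*.glb* > /tmp/qmt_limits_backup.txt
Step 3, from a client or the MDS:
# lfs setquota -u <uid> -b 0 -B 0 -i 0 -I 0 /root/lustre
Step 4, on the MGS:
# lctl conf_param $FSNAME.quota.ost=ug
Steps 5 and 6, on each OSS and from a client:
# lctl get_param osd*.*.quota_slave.info
# lfs quota -u <uid> -v /root/lustre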

Comment by Mitsuhiro Nishizawa [ 03/Jul/14 ]

Thanks Johann! I will write up a test plan and try the actions you provided. regards,

Comment by Johann Lombardi (Inactive) [ 03/Jul/14 ]

For the record, i created LU-5293 for the problem related to quota files removal on slaves.

Comment by Mitsuhiro Nishizawa [ 07/Jul/14 ]

The customer agreed with our plan and I am currently doing the work. After step 6, I set a quota (-B 1000000000) on two users. The 'limit' value is now "0" for OSTs that do not have any objects for the user, and 'limit' is set to the same value as the usage on the OSTs that do have objects. For example,

[root@nmds04 ~]# lfs setquota -u lect01 -B 1000000000 /root/lustre/
[root@nmds04 ~]# 
[root@nmds04 ~]# lfs quota -u lect01 -v /root/lustre/
Disk quotas for user lect01 (uid 3017):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
  /root/lustre/      28       0 1000000000       -       7       0       0       -
lustre4-MDT0000_UUID
                      8       -       0       -       7       -       0       -
lustre4-OST0000_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0001_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0002_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0003_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0004_UUID
                      4*      -       4       -       -       -       -       -
lustre4-OST0005_UUID
                      4*      -       4       -       -       -       -       -
lustre4-OST0006_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0007_UUID
                      0       -       0       -       -       -       -       -
lustre4-OST0008_UUID
                      0       -       0       -       -       -       -       -
 [...]

On another user,

[root@nmds04 ~]# lfs setquota -u lect02 -B 1000000000 /root/lustre/
[root@nmds04 ~]# 
[root@nmds04 ~]# lfs quota -u lect02 -v /root/lustre/
Disk quotas for user lect02 (uid 3018):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
  /root/lustre/ 1270432       0 1000000000       -       6       0       0       -
lustre4-MDT0000_UUID
                      8       -       0       -       6       -       0       -
lustre4-OST0000_UUID
                  14344*      -   14344       -       -       -       -       -
lustre4-OST0001_UUID
                  15368*      -   15368       -       -       -       -       -
lustre4-OST0002_UUID
                  15368*      -   15368       -       -       -       -       -
lustre4-OST0003_UUID
                  15372*      -   15372       -       -       -       -       -
lustre4-OST0004_UUID
                  15368*      -   15368       -       -       -       -       -
 [...]

Is this correct behavior?

Comment by Niu Yawei (Inactive) [ 07/Jul/14 ]

Is this correct behavior?

This is correct behavior.

Comment by Mitsuhiro Nishizawa [ 07/Jul/14 ]

I performed the action plan and confirmed that the quota issue has been resolved for most users, but I also found that the quota limit on each OST was not updated even when I did 'setquota -B 0' (while quota was enabled and quota_slave was in sync with the master). On the same file system, for another user, the limit value was updated just after I issued 'setquota -B 0'. Although I could not try all the users, the behavior occurred on at least one user (UID: 2165). I captured a debug log (lustre5_quota_UID2165_20140707.tar.gz). Even though "quota exceeded" is set on some of the OSTs, I could write to those OSTs successfully. When I was doing the write test with that user, the limit value was finally updated just after 'setquota -B 0'. What I did was write files on each OST, then run 'setquota -B 1000000000' and 'setquota -B 0' several times. Is this correct behavior?

Comment by Johann Lombardi (Inactive) [ 07/Jul/14 ]

Could you please clarify when the quota logs were collected? Just after setting the limit to 0? I only have the lfs quota output with a limit set to 10000000:

[root@nmds06 ~]# lfs quota -u w3ganglia -v /root/lustre
Disk quotas for user w3ganglia (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /root/lustre 1151068       0 10000000       -      25       0       0       -
lustre5-MDT0000_UUID
...

Comment by Johann Lombardi (Inactive) [ 07/Jul/14 ]

In the logs, I see the glimpse related to setquota -B 1000000000:

00040000:04000000:1.0:1404731522.062728:0:24988:0:(qsd_lock.c:238:qsd_glb_glimpse_ast()) lustre5-OST0002: glimpse on glb quota locks, id:2165 ver:569 hard:1000000000 soft:0

But I don't have logs related to -B 0, where the slave is supposed to release all the quota space.

When a hard limit is enforced for a given user/group, it is actually expected to have a local slave limit higher than the actual usage as long as the sum of all slave limits (i.e. as reported by lfs quota -v) is equal to the granted field in qmt_dt-0x0_glb-usr on the quota master. When I look at this file in the tarball, I see:

- id:      2165
  limits:  { hard:              1000000, soft:                    0, granted:               863512, time:                    0 }

It is pretty difficult for me to make any sense of this data because I have the "lfs quota -v" output for -B 10000000, debug logs for -B 1000000000, and the qmt_dt-0x0_glb-usr file with -B 1000000. Could you please set the hard limit to 0, collect "lfs quota -v" as well as qmt_dt-0x0_glb-usr, then set the hard limit to 1000000000 and collect "lfs quota -v" and qmt_dt-0x0_glb-usr again?

Thanks in advance.
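As a quick consistency check along those lines, the pipeline from the ticket description can be reused (a sketch; the uid and mount point are examples):

From a client, sum the per-OST limits:
# lfs quota -u w3ganglia -v /root/lustre | grep -A1 OST | grep -v OST | awk '{ SUM += $3 } END { print SUM }'
On the MDS, read the granted value for that ID:
# lctl get_param qmt.*.*.glb-usr | grep -A1 "id:.*2165"
The two numbers are expected to match when master and slaves are consistent.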

Comment by Johann Lombardi (Inactive) [ 07/Jul/14 ]

Ah, I actually found later logs related to -B 0:

1. The notification related to the hard limit change is received by the slave:

00040000:04000000:19.0:1404731522.120841:0:13913:0:(qsd_lock.c:238:qsd_glb_glimpse_ast()) lustre5-OST0002: glimpse on glb quota locks, id:2165 ver:570 hard:0 soft:0

2. The slave updates the lqe as well as the local copy of the global index:

00040000:04000000:0.0:1404731522.120881:0:26196:0:(qsd_entry.c:334:qsd_update_lqe()) $$$ updating global index hardlimit: 0, softlimit: 0 qsd:lustre5-OST0002 qtype:usr id:2165 enforced:0 granted:4 pending:0 waiting:0 req:0 usage:4 qunit:0 qtune:0 edquot:0

3. The slave refreshes usage for this ID:

00040000:04000000:0.0:1404731522.120890:0:26196:0:(qsd_entry.c:219:qsd_refresh_usage()) $$$ disk usage: 4 qsd:lustre5-OST0002 qtype:usr id:2165 enforced:0 granted:4 pending:0 waiting:0 req:0 usage:4 qunit:0 qtune:0 edquot:0

4. The slave decides to release the 4KB of quota space it owns for this user since quota isn't enforced any more for this ID:

00040000:04000000:0.0:1404731522.120898:0:26196:0:(qsd_handler.c:179:qsd_calc_adjust()) $$$ not enforced, releasing all space qsd:lustre5-OST0002 qtype:usr id:2165 enforced:0 granted:4 pending:0 waiting:0 req:0 usage:4 qunit:0 qtune:0 edquot:0
00040000:04000000:16.0:1404731522.121264:0:60574:0:(qsd_handler.c:335:qsd_req_completion()) $$$ DQACQ returned 0, flags:0x4 qsd:lustre5-OST0002 qtype:usr id:2165 enforced:0 granted:4 pending:0 waiting:0 req:1 usage:4 qunit:0 qtune:0 edquot:0
00040000:04000000:16.0:1404731522.121266:0:60574:0:(qsd_handler.c:357:qsd_req_completion()) $$$ DQACQ qb_count:4 qsd:lustre5-OST0002 qtype:usr id:2165 enforced:0 granted:4 pending:0 waiting:0 req:1 usage:4 qunit:0 qtune:0 edquot:0
00040000:04000000:0.0:1404731522.121283:0:26196:0:(qsd_entry.c:278:qsd_update_index()) lustre5-OST0002: update granted to 0 for id 2165

As far as I can see, -B 0 worked as expected and all the quota space owned by the slave has been released. If you could just collect qmt_dt-0x0_glb-usr as well as the "lfs quota -v" output, then we could check that the limits are consistent everywhere.
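To check the same path on the other OSTs without reading the full dumps, grepping the OSS debug logs for the messages quoted above should be enough (a sketch; the log file name is an example):

# grep -E 'glimpse on glb quota locks|releasing all space|no adjustment required|update granted' /tmp/oss_quota_debug.log | grep 2165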

Comment by Mitsuhiro Nishizawa [ 08/Jul/14 ]

When I set -B 0 for UID2165, the quota limit as a whole was updated to 0, but the 'limit' on each OST was not updated. qmt_dt-0x0_glb-usr just after setting -B 0 was:

- id:      2165
  limits:  { hard:                    0, soft:                    0, granted:                    0, time:                    0 }

However, when I ran lfs quota:

Disk quotas for user 2165 (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /root/lustre  512084       0       0       -      17       0       0       -
lustre5-MDT0000_UUID
                      8       -       0       -      17       -       0       -
lustre5-OST0000_UUID
                      0       -  270728       -       -       -       -       -
lustre5-OST0001_UUID
                      0       -   95220       -       -       -       -       -
lustre5-OST0002_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0003_UUID
                      0       -   42292       -       -       -       -       -
lustre5-OST0004_UUID
                      0       -  798452       -       -       -       -       -
lustre5-OST0005_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0006_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0007_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0008_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0009_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000b_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST000c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0010_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0011_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0012_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST0013_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0014_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0015_UUID
                      0       -  105836       -       -       -       -       -
lustre5-OST0016_UUID
                      0       -   59688       -       -       -       -       -
lustre5-OST0017_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0018_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0019_UUID
                      0       - 1983424       -       -       -       -       -
lustre5-OST001a_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001b_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST001c_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001d_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST001e_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST001f_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST0020_UUID
                      0       - 1522108       -       -       -       -       -
lustre5-OST0021_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0022_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0023_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0024_UUID
                      0       -  796168       -       -       -       -       -
lustre5-OST0025_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0026_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0027_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0028_UUID
                      0       -  648440       -       -       -       -       -
lustre5-OST0029_UUID
                      0       -   55360       -       -       -       -       -
lustre5-OST002a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002b_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST002c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002d_UUID
                      0       -   60344       -       -       -       -       -
lustre5-OST002e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002f_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0030_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0031_UUID
                      0       -   18940       -       -       -       -       -
lustre5-OST0032_UUID
                      0       -   63884       -       -       -       -       -
lustre5-OST0033_UUID
                      0       -   19500       -       -       -       -       -
lustre5-OST0034_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0035_UUID
                      0       - 1138508       -       -       -       -       -
lustre5-OST0036_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0037_UUID
                      0       -  445992       -       -       -       -       -
lustre5-OST0038_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0039_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003b_UUID
                      0       - 1083572       -       -       -       -       -
lustre5-OST003c_UUID
                      0       - 1015720       -       -       -       -       -
lustre5-OST003d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003e_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST003f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0040_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0041_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0042_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0043_UUID
                      0       -   80396       -       -       -       -       -
lustre5-OST0044_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0045_UUID
                      0       -   68900       -       -       -       -       -
lustre5-OST0046_UUID
                      0       -   74876       -       -       -       -       -
lustre5-OST0047_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0048_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0049_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST004a_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST004b_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0050_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0051_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0052_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0053_UUID
                      0       - 1604368       -       -       -       -       -

Why were those 'limit' values not updated to 0?

Comment by Johann Lombardi (Inactive) [ 08/Jul/14 ]

Well, I looked at the logs for lustre5-OST0002, which released all the quota space (limit = 0). I am going to have a look at the logs for the other OSTs then.

Comment by Johann Lombardi (Inactive) [ 08/Jul/14 ]

Checking logs for lustre5-OST0004:

00040000:04000000:1.0:1404731522.120827:0:24988:0:(qsd_lock.c:238:qsd_glb_glimpse_ast()) lustre5-OST0004: glimpse on glb quota locks, id:2165 ver:570 hard:0 soft:0
00040000:04000000:27.0:1404731522.120894:0:26375:0:(qsd_entry.c:334:qsd_update_lqe()) $$$ updating global index hardlimit: 0, softlimit: 0 qsd:lustre5-OST0004 qtype:usr id:2165 enforced:0 granted:0 pending:0 waiting:0 req:0 usage:0 qunit:0 qtune:0 edquot:0
00040000:04000000:27.0:1404731522.120936:0:26375:0:(qsd_entry.c:219:qsd_refresh_usage()) $$$ disk usage: 0 qsd:lustre5-OST0004 qtype:usr id:2165 enforced:0 granted:0 pending:0 waiting:0 req:0 usage:0 qunit:0 qtune:0 edquot:0
00040000:04000000:27.0:1404731522.120946:0:26375:0:(qsd_handler.c:931:qsd_adjust()) $$$ no adjustment required qsd:lustre5-OST0004 qtype:usr id:2165 enforced:0 granted:0 pending:0 waiting:0 req:0 usage:0 qunit:0 qtune:0 edquot:0

So it seems that no quota space was owned by lustre5-OST0004 and everything looks fine in the logs. Unfortunately, the logs were not collected at the same time as the lfs quota output above. Could you please:
1. set hard limit to a high value and wait for 5s
2. enable +quota and +trace in the debug mask of both MDS and OSS where lustre5-OST0004 is running.
3. run lctl mark start on MDS & OSS
4. set hard limit to 0 and wait for 5s
5. run lctl mark end on MDS & OSS
6. run lfs quota -v
7. collect lustre logs on both MDS & OSS

Thanks in advance.
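Roughly, in commands (a sketch with example values; the lctl commands go on both the MDS and the OSS serving lustre5-OST0004, the lfs commands on a client or the MDS):

# lctl set_param debug=+quota
# lctl set_param debug=+trace
# lfs setquota -u 2165 -B 1000000000 /root/lustre ; sleep 5
# lctl mark "B0 TEST START"
# lfs setquota -u 2165 -B 0 /root/lustre ; sleep 5
# lctl mark "B0 TEST END"
# lfs quota -u 2165 -v /root/lustre
# lctl dk > /tmp/$(hostname)_quota_debug.log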

Comment by Mitsuhiro Nishizawa [ 08/Jul/14 ]

Johann, I collected the 'lfs quota -v' output just after 'setquota -B 0'. What I did was:
1. enable +quota and +trace in the debug mask of both MDS and OSS where lustre5-OST0004 is running.
2. run lctl mark start on MDS & OSS
3. run lfs quota -v
4. set hard limit to 1000000000
5. run lfs quota -v
6. set hard limit to 0
7. run lfs quota -v
8. wait for 5s
9. run lctl mark end on MDS & OSS
10. collect lustre logs on both MDS & OSS
The output above was captured in step 7. I also ran lfs quota -v a while (a few minutes or so) after that and found the same output. Since this was the only user that showed this symptom, and now it was resolved somehow, I cannot capture additional logs. Is there a chance that this is a reporting issue in the lfs utility?

Comment by Johann Lombardi (Inactive) [ 08/Jul/14 ]

Mitsuhiro, could you please clarify what you mean by "and now it was resolved somehow"?

Comment by Mitsuhiro Nishizawa [ 08/Jul/14 ]

After I captured the Lustre debug log, the customer ran a write test with UID2165 to see how it behaves. At that time, the 'limit' on each OST was not changed by setting the hard limit to 0 or to 1000000000. They found they could write over the 'limit' shown in 'lfs quota'. While doing the test and running 'lfs setquota', they also noticed that the 'limit' on each OST was changed by 'lfs setquota'. Now it can be set to "0" by 'lfs setquota -B 0' and to a value like "69652" by 'lfs setquota -B 10000000' (in this case, the hard limit is set to 10GB). Thanks,

Comment by Mitsuhiro Nishizawa [ 11/Jul/14 ]

What does the behavior on UID2165 mean? Is there anything we should do to have quota behave correctly? Regards,

Comment by Niu Yawei (Inactive) [ 11/Jul/14 ]

the customer ran a write test with UID2165 to see how it behaves. At that time, the 'limit' on each OST was not changed by setting the hard limit to 0 or to 1000000000.

Do you mean that the customer was writing files as UID2165 while you were changing the hard limit? How did you observe that the limit wasn't changed by setting the hard limit to 0 (could you show me the command and the output)?

They found they could write over the 'limit' shown in 'lfs quota'.

Could you explain it in detail? One possible reason: the limit was set to 0, so data could be cached on the client; when the user sets a limit, the cached data will be flushed back anyway regardless of the quota limit.

While doing the test and running 'lfs setquota', they also noticed that the 'limit' on each OST was changed by 'lfs setquota'.

Was the limit changed as we expected?

Now it can be set to "0" by 'lfs setquota -B 0' and to a value like "69652" by 'lfs setquota -B 10000000' (in this case, the hard limit is set to 10GB)

You mean that UID is back to normal now, right?

Comment by Mitsuhiro Nishizawa [ 11/Jul/14 ]

Do you mean that the customer was writing files as UID2165 while you were changing the hard limit? How did you observe that the limit wasn't changed by setting the hard limit to 0 (could you show me the command and the output)?

No. I set the hard limit to 0 and confirmed that 'limit' did not change. After that, the customer did write tests and found that 'limit' started to change. Here is the log:

[root@nmds06 ~]# lfs setquota -u w3ganglia -B 1000000000 /root/lustre
[root@nmds06 ~]# lfs quota -u w3ganglia /root/lustre/
Disk quotas for user w3ganglia (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
  /root/lustre/  512084       0 1000000000       -      17       0       0       -
[root@nmds06 ~]# 
[root@nmds06 ~]# lfs quota -u w3ganglia -v /root/lustre/
Disk quotas for user w3ganglia (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
  /root/lustre/  512084       0 1000000000       -      17       0       0       -
lustre5-MDT0000_UUID
                      8       -       0       -      17       -       0       -
lustre5-OST0000_UUID
                      0       -  270728       -       -       -       -       -
lustre5-OST0001_UUID
                      0       -   95220       -       -       -       -       -
lustre5-OST0002_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0003_UUID
                      0       -   42292       -       -       -       -       -
lustre5-OST0004_UUID
                      0       -  798452       -       -       -       -       -
lustre5-OST0005_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0006_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0007_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0008_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0009_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000b_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST000c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0010_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0011_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0012_UUID
                 102408       - 4194304       -       -       -       -       -
lustre5-OST0013_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0014_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0015_UUID
                      0       -  105836       -       -       -       -       -
lustre5-OST0016_UUID
                      0       -   59688       -       -       -       -       -
lustre5-OST0017_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0018_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0019_UUID
                      0       - 1983424       -       -       -       -       -
lustre5-OST001a_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001b_UUID
                 102408       - 4194304       -       -       -       -       -
lustre5-OST001c_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001d_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST001e_UUID
                 102408*      -  102408       -       -       -       -       -
lustre5-OST001f_UUID
                 102408*      -  102408       -       -       -       -       -
lustre5-OST0020_UUID
                      0       - 1522108       -       -       -       -       -
lustre5-OST0021_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0022_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0023_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0024_UUID
                      0       -  796168       -       -       -       -       -
lustre5-OST0025_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0026_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0027_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0028_UUID
                      0       -  648440       -       -       -       -       -
lustre5-OST0029_UUID
                      0       -   55360       -       -       -       -       -
lustre5-OST002a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002b_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST002c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002d_UUID
                      0       -   60344       -       -       -       -       -
lustre5-OST002e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002f_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0030_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0031_UUID
                      0       -   18940       -       -       -       -       -
lustre5-OST0032_UUID
                      0       -   63884       -       -       -       -       -
lustre5-OST0033_UUID
                      0       -   19500       -       -       -       -       -
lustre5-OST0034_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0035_UUID
                      0       - 1138508       -       -       -       -       -
lustre5-OST0036_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0037_UUID
                      0       -  445992       -       -       -       -       -
lustre5-OST0038_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0039_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003b_UUID
                      0       - 1083572       -       -       -       -       -
lustre5-OST003c_UUID
                      0       - 1015720       -       -       -       -       -
lustre5-OST003d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003e_UUID
                      4       - 4194304       -       -       -       -       -
lustre5-OST003f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0040_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0041_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0042_UUID
                      4*      -       4       -       -       -       -       -
lustre5-OST0043_UUID
                      0       -   80396       -       -       -       -       -
lustre5-OST0044_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0045_UUID
                      0       -   68900       -       -       -       -       -
lustre5-OST0046_UUID
                      0       -   74876       -       -       -       -       -
lustre5-OST0047_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0048_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0049_UUID
                 102408*      -  102408       -       -       -       -       -
lustre5-OST004a_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST004b_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0050_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0051_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0052_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0053_UUID
                      0       - 1604368       -       -       -       -       -
[root@nmds06 ~]# 
[root@nmds06 ~]# 
[root@nmds06 ~]# 
[root@nmds06 ~]# lfs setquota -u w3ganglia -B 0 /root/lustre
[root@nmds06 ~]# 
[root@nmds06 ~]# lfs quota -u w3ganglia -v /root/lustre/
Disk quotas for user w3ganglia (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
  /root/lustre/  512084       0       0       -      17       0       0       -
lustre5-MDT0000_UUID
                      8       -       0       -      17       -       0       -
lustre5-OST0000_UUID
                      0       -  270728       -       -       -       -       -
lustre5-OST0001_UUID
                      0       -   95220       -       -       -       -       -
lustre5-OST0002_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0003_UUID
                      0       -   42292       -       -       -       -       -
lustre5-OST0004_UUID
                      0       -  798452       -       -       -       -       -
lustre5-OST0005_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0006_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0007_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0008_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0009_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000b_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST000c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST000f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0010_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0011_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0012_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST0013_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0014_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0015_UUID
                      0       -  105836       -       -       -       -       -
lustre5-OST0016_UUID
                      0       -   59688       -       -       -       -       -
lustre5-OST0017_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0018_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0019_UUID
                      0       - 1983424       -       -       -       -       -
lustre5-OST001a_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001b_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST001c_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST001d_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST001e_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST001f_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST0020_UUID
                      0       - 1522108       -       -       -       -       -
lustre5-OST0021_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0022_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0023_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0024_UUID
                      0       -  796168       -       -       -       -       -
lustre5-OST0025_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0026_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0027_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0028_UUID
                      0       -  648440       -       -       -       -       -
lustre5-OST0029_UUID
                      0       -   55360       -       -       -       -       -
lustre5-OST002a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002b_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST002c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002d_UUID
                      0       -   60344       -       -       -       -       -
lustre5-OST002e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST002f_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0030_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0031_UUID
                      0       -   18940       -       -       -       -       -
lustre5-OST0032_UUID
                      0       -   63884       -       -       -       -       -
lustre5-OST0033_UUID
                      0       -   19500       -       -       -       -       -
lustre5-OST0034_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0035_UUID
                      0       - 1138508       -       -       -       -       -
lustre5-OST0036_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0037_UUID
                      0       -  445992       -       -       -       -       -
lustre5-OST0038_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0039_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003a_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003b_UUID
                      0       - 1083572       -       -       -       -       -
lustre5-OST003c_UUID
                      0       - 1015720       -       -       -       -       -
lustre5-OST003d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST003e_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST003f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0040_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0041_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0042_UUID
                      4       -       0       -       -       -       -       -
lustre5-OST0043_UUID
                      0       -   80396       -       -       -       -       -
lustre5-OST0044_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0045_UUID
                      0       -   68900       -       -       -       -       -
lustre5-OST0046_UUID
                      0       -   74876       -       -       -       -       -
lustre5-OST0047_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0048_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0049_UUID
                 102408       -       0       -       -       -       -       -
lustre5-OST004a_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST004b_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004c_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004d_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004e_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST004f_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0050_UUID
                      0       -       0       -       -       -       -       -
lustre5-OST0051_UUID
                      0       - 4194304       -       -       -       -       -
lustre5-OST0052_UUID
                      0       -       4       -       -       -       -       -
lustre5-OST0053_UUID
                      0       - 1604368       -       -       -       -       -

and finally it was,

[root@nmds06 ~]# lfs quota -u w3ganglia -v /root/lustre
Disk quotas for user w3ganglia (uid 2165):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
   /root/lustre 1151068       0 10000000       -      25       0       0       -
lustre5-MDT0000_UUID
                     32       -       0       -      25       -       0       -
lustre5-OST0000_UUID
                   4100       -   68616       -       -       -       -       -
lustre5-OST0001_UUID
                   6152       -   70672       -       -       -       -       -
lustre5-OST0002_UUID
                   5128       -   69652       -       -       -       -       -
lustre5-OST0003_UUID
                   5124       -   68616       -       -       -       -       -
lustre5-OST0004_UUID
                   4100       -   68616       -       -       -       -       -
lustre5-OST0005_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0006_UUID
                   7188       -   71720       -       -       -       -       -
lustre5-OST0007_UUID
                   4112       -   68632       -       -       -       -       -
lustre5-OST0008_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST0009_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST000a_UUID
                   6160       -   70688       -       -       -       -       -
lustre5-OST000b_UUID
                   6164       -   70696       -       -       -       -       -
lustre5-OST000c_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST000d_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST000e_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST000f_UUID
                   7188       -   71720       -       -       -       -       -
lustre5-OST0010_UUID
                   4112       -   68632       -       -       -       -       -
lustre5-OST0011_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST0012_UUID
                 106520       -  171032       -       -       -       -       -
lustre5-OST0013_UUID
                   6168       -   70692       -       -       -       -       -
lustre5-OST0014_UUID
                   6164       -   70696       -       -       -       -       -
lustre5-OST0015_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST0016_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST0017_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0018_UUID
                   7188       -   71720       -       -       -       -       -
lustre5-OST0019_UUID
                   4112       -   68632       -       -       -       -       -
lustre5-OST001a_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST001b_UUID
                 106520       -  171032       -       -       -       -       -
lustre5-OST001c_UUID
                   4108       -   68624       -       -       -       -       -
lustre5-OST001d_UUID
                   6164       -   70688       -       -       -       -       -
lustre5-OST001e_UUID
                 107544       -  172072       -       -       -       -       -
lustre5-OST001f_UUID
                 108568       -  172056       -       -       -       -       -
lustre5-OST0020_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0021_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST0022_UUID
                   6164       -   70696       -       -       -       -       -
lustre5-OST0023_UUID
                 106516       -  171388       -       -       -       -       -
lustre5-OST0024_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST0025_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0026_UUID
                   6152       -   70672       -       -       -       -       -
lustre5-OST0027_UUID
                   6156       -   70680       -       -       -       -       -
lustre5-OST0028_UUID
                   5124       -   68616       -       -       -       -       -
lustre5-OST0029_UUID
                   5124       -   69640       -       -       -       -       -
lustre5-OST002a_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST002b_UUID
                   6168       -   70700       -       -       -       -       -
lustre5-OST002c_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST002d_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST002e_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST002f_UUID
                   6164       -   70688       -       -       -       -       -
lustre5-OST0030_UUID
                   6168       -   70700       -       -       -       -       -
lustre5-OST0031_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST0032_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST0033_UUID
                   5132       -   65536       -       -       -       -       -
lustre5-OST0034_UUID
                   6168       -   70700       -       -       -       -       -
lustre5-OST0035_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0036_UUID
                   6164       -   69652       -       -       -       -       -
lustre5-OST0037_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0038_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0039_UUID
                   6164       -   70688       -       -       -       -       -
lustre5-OST003a_UUID
                 106516       -  150608       -       -       -       -       -
lustre5-OST003b_UUID
                   6160       -   69648       -       -       -       -       -
lustre5-OST003c_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST003d_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST003e_UUID
                   6168       -   70692       -       -       -       -       -
lustre5-OST003f_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0040_UUID
                   5140       -   69652       -       -       -       -       -
lustre5-OST0041_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0042_UUID
                   5848       -   70700       -       -       -       -       -
lustre5-OST0043_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST0044_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST0045_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST0046_UUID
                   6164       -   70680       -       -       -       -       -
lustre5-OST0047_UUID
                   6164       -   70696       -       -       -       -       -
lustre5-OST0048_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST0049_UUID
                 107544       -  172056       -       -       -       -       -
lustre5-OST004a_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST004b_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST004c_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST004d_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST004e_UUID
                   4112       -   68624       -       -       -       -       -
lustre5-OST004f_UUID
                   6164       -   70680       -       -       -       -       -
lustre5-OST0050_UUID
                   6164       -   70700       -       -       -       -       -
lustre5-OST0051_UUID
                   5136       -   68624       -       -       -       -       -
lustre5-OST0052_UUID
                   5136       -   69648       -       -       -       -       -
lustre5-OST0053_UUID
                   4112       -   68624       -       -       -       -       -

Could you explain it in detail? One possible reason is: while the limit was set to 0, data could still be cached on the client; when the user set a limit again, the cached data was flushed back regardless of the quota limit.

Unfortunately I don't have a log showing the details, as I was not able to run the test myself... The possibility you describe may well be what happened. Thanks for pointing it out.
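For illustration only, the cached-write scenario hypothesized above could be reproduced roughly along these lines; the UID, mount point, and 10000000 limit come from the outputs above, while the test file name and write size are assumptions for the sketch:

lfs setquota -u w3ganglia -B 0 /root/lustre                 # no hard limit in place
dd if=/dev/zero of=/root/lustre/testfile bs=1M count=512    # writes may remain in the client cache
lfs setquota -u w3ganglia -B 10000000 /root/lustre          # set a hard limit afterwards
sync                                                        # cached data is flushed back regardless of the new limit
lfs quota -u w3ganglia -v /root/lustre                      # per-OST usage and limits now reflect the flushed writes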

Was the limit changed as we expected?

Yes, as shown above, the 'limit' on each OST started to change.

You mean that UID is back to normal now, right?

Yes, right. As far as I have checked this account, I don't see any problematic behavior. However, because the customer observed the behavior above themselves, they are concerned whether quota is really behaving correctly now. I can confirm this account, but I am not sure about all the accounts.

Comment by Niu Yawei (Inactive) [ 11/Jul/14 ]

No. I set the hard limit to 0 and confirmed that 'limit' did not change. After that, the customer ran write tests and found that 'limit' started to change. Here is the log.

I didn't see the global limit being set to 0 in the log (it went from 1000000000 to 10000000).

Comment by Mitsuhiro Nishizawa [ 11/Jul/14 ]

I didn't see the global limit being set to 0 in the log (it went from 1000000000 to 10000000).

The log contains many meta characters, so it is a bit hard to read, but the hard limit was first set to 1000000000, then to 0, and finally to 10000000.

Comment by Niu Yawei (Inactive) [ 11/Jul/14 ]

The log contains many meta characters, so it is a bit hard to read, but the hard limit was first set to 1000000000, then to 0, and finally to 10000000.

Ah, I see it. Did you wait for a while between 'lfs setquota -B 0' and 'lfs quota'?

Comment by Mitsuhiro Nishizawa [ 11/Jul/14 ]

In the log above it was only a few seconds, but the situation lasted for more than 10 minutes, and I tried a couple of times during that period.

Comment by Niu Yawei (Inactive) [ 14/Jul/14 ]

In the log above it was only a few seconds, but the situation lasted for more than 10 minutes, and I tried a couple of times during that period.

Got it, thank you. It's hard to tell why this happened without the related logs; could you try to collect logs (in the way Johann suggested in his earlier comment) when you see the problem again? Thanks.

Comment by Mitsuhiro Nishizawa [ 14/Jul/14 ]

The steps Johann suggested was,

1. set hard limit to a high value and wait for 5s
2. enable +quota and +trace in the debug mask of both MDS and OSS where lustre5-OST0004 is running.
3. run lctl mark start on MDS & OSS
4. set hard limit to 0 and wait for 5s
5. run lctl mark end on MDS & OSS
6. run lfs quota -v
7. collect lustre logs on both MDS & OSS

and the log I collected was,

1. enable +quota and +trace in the debug mask of both MDS and OSS where lustre5-OST0004 is running.
2. run lctl mark start on MDS & OSS
3. run lfs quota -v
4. set hard limit to 1000000000
5. run lfs quota -v
6. set hard limit to 0
7. run lfs quota -v
8. wait for 5s
9. run lctl mark end on MDS & OSS
10. collect lustre logs on both MDS & OSS

The difference is whether we run 'lfs quota -v' immediately after setting the limit or only after waiting 5 seconds (in my case the debug log was captured after the 5-second wait). How does this difference in collecting the log help explain the behavior?
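For completeness, a minimal command-level sketch of the collection procedure above, assuming standard lctl debug facilities on the MDS and on the OSS hosting lustre5-OST0004; the marker strings and log path are illustrative:

# on the MDS and the relevant OSS
lctl set_param debug=+quota                     # enable quota debug messages
lctl set_param debug=+trace                     # enable trace debug messages
lctl mark "quota-test start"                    # drop a start marker into the debug log

# on a client
lfs setquota -u w3ganglia -B 0 /root/lustre     # clear the hard limit
sleep 5                                         # give the slaves time to release granted limits
lfs quota -u w3ganglia -v /root/lustre          # check the per-OST limits

# back on the MDS and OSS
lctl mark "quota-test end"                      # drop an end marker
lctl dk > /tmp/quota-debug.$(hostname).log      # dump the kernel debug buffer for upload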

Comment by Niu Yawei (Inactive) [ 14/Jul/14 ]

The difference is whether we run 'lfs quota -v' immediately after setting the limit or only after waiting 5 seconds (in my case the debug log was captured after the 5-second wait). How does this difference in collecting the log help explain the behavior?

If 'lfs quota -v' is executed immediately after 'lfs setquota -B 0', the slaves might not have released their limits yet, so you should wait for a while after setting the limit and then run 'lfs quota -v' to verify it. We checked the provided log, and it shows that the slaves released their limits as expected.
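In other words, a verification sequence along the following lines is expected to work; the 60-second wait is an arbitrary illustrative value, not a number taken from this ticket:

lfs setquota -u w3ganglia -B 0 /root/lustre       # clear the hard limit on the quota master
sleep 60                                          # allow the quota slaves to release their granted limits
lfs quota -u w3ganglia -v /root/lustre            # the per-OST 'limit' columns should now have dropped
lctl get_param "osd-*.*.quota_slave.limit*"       # (run on each OSS) cross-check the slave copies directly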

Comment by Peter Jones [ 07/Aug/14 ]

As per Ihara, it is OK to close this ticket. It is not understood how things got into an inconsistent state, but quotas have been running fine since they were reset.
