[LU-4505] invalid "Disk quota exceed" error Created: 17/Jan/14 Updated: 15/Jul/14 Resolved: 21/Feb/14 |
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mn4 |
| Severity: | 3 |
| Rank (Obsolete): | 12323 |
| Description |
|
A user is getting "Disk quota exceeded" errors despite having plenty of quota available. Uploaded the following file to the FTP site. The following was captured while writing to a directory striped on OST00000. --- STRACE OUTPUT --- |
| Comments |
| Comment by Peter Jones [ 17/Jan/14 ] |
|
Niu, could you please look into this one? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 20/Jan/14 ] |
|
What kind of operations did you do before this happened? Was the limit for kferschw raised to the current value (1100000000) from a smaller one? Thanks. |
| Comment by Mahmoud Hanafi [ 20/Jan/14 ] |
|
The user reported this issue. The inability to write has been inconsistent. The user was sometimes able to write a 25GB file to the filesystem. When the file was deleted, he couldn't write a much smaller file. After I took the debug logs, I set the user quota to zero and then set it back, and ran into the same quota issue, although I didn't change the limit (1100000000). There are other users who have the same quota issue on this filesystem. This is the only 2.4.x filesystem we have so far. |
| Comment by Niu Yawei (Inactive) [ 21/Jan/14 ] |
|
Hi Mahmoud, does that mean the problem can be reproduced even after you reset the quota limit? Could you collect debug logs on the MDT and the OST with D_TRACE & D_QUOTA enabled while you reset the limit (set the limit to zero, then set it back)? Thanks. |
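For readers following along, the collection procedure requested above could be sketched roughly as follows. This is not taken from the ticket: it is a dry-run Python helper that only builds the command lines and never executes them. The mount point `/mnt/lustre`, the log path, and the helper name `quota_reset_debug_plan` are hypothetical, and treating 1100000000 as the block hard limit (in KB) is an assumption, since the thread does not say which limit it is.

```python
import shlex
from typing import List, Tuple

# (where the command would be run, argv) -- nothing is executed here.
Step = Tuple[str, List[str]]


def quota_reset_debug_plan(user: str, mountpoint: str, block_hard_kb: int,
                           log_path: str = "/tmp/quota-debug.log") -> List[Step]:
    """Build a dry-run plan: enable trace/quota debugging on the servers,
    reset the user's block hard limit to zero and back, then dump the log.

    Assumption: the limit being reset is the block hard limit (-B).
    """
    servers = "MDS and OSS"
    return [
        # Add the D_TRACE and D_QUOTA masks to the kernel debug mask.
        (servers, ["lctl", "set_param", "debug=+trace"]),
        (servers, ["lctl", "set_param", "debug=+quota"]),
        # Start from an empty debug buffer.
        (servers, ["lctl", "clear"]),
        # Set the limit to zero, then set it back.
        ("client", ["lfs", "setquota", "-u", user, "-B", "0", mountpoint]),
        ("client", ["lfs", "setquota", "-u", user, "-B", str(block_hard_kb), mountpoint]),
        # Dump the kernel debug buffer to a file for upload.
        (servers, ["lctl", "dk", log_path]),
    ]


if __name__ == "__main__":
    for where, argv in quota_reset_debug_plan("kferschw", "/mnt/lustre", 1100000000):
        print(f"[{where}] {shlex.join(argv)}")
```

Keeping the plan as data makes it easy to review the exact commands before running them by hand on the right nodes.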
| Comment by Mahmoud Hanafi [ 21/Jan/14 ] |
|
I will see if I can reproduce it. Did you look at the logs I uploaded? |
| Comment by Mahmoud Hanafi [ 21/Jan/14 ] |
|
I have uploaded the following debug trace files to the FTP site. |
| Comment by Niu Yawei (Inactive) [ 22/Jan/14 ] |
|
Thank you, Mahmoud. The log shows that only OST0000 has this problem. I suspect this is because the edquot flag on OST0000 was set mistakenly by some sort of race. |
| Comment by Niu Yawei (Inactive) [ 22/Jan/14 ] |
| Comment by Mahmoud Hanafi [ 22/Jan/14 ] |
|
I had created a directory with a stripe count of 1, pinned to ost00000, to make debugging easier. I think that is why you only saw ost00000. |
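A setup like the one described above could look roughly like the following. This is an illustrative dry-run sketch, not the reporter's actual commands: the directory path and the helper name `single_ost_dir_commands` are hypothetical, and the code only builds the `lfs` command lines without running them.

```python
import shlex
from typing import List


def single_ost_dir_commands(directory: str, ost_index: int = 0) -> List[List[str]]:
    """Return commands that create a directory whose new files get
    one stripe, placed on the given OST index (dry run only)."""
    return [
        ["mkdir", "-p", directory],
        # -c 1: stripe count of one; -i N: start (and therefore only) OST index.
        ["lfs", "setstripe", "-c", "1", "-i", str(ost_index), directory],
        # Show the directory's default layout to confirm the setting.
        ["lfs", "getstripe", "-d", directory],
    ]


if __name__ == "__main__":
    # Hypothetical path; substitute a directory on the affected filesystem.
    for argv in single_ost_dir_commands("/mnt/lustre/quota-debug", ost_index=0):
        print(shlex.join(argv))
```

With every write landing on a single OST, the debug logs only need to be collected from that one server, which is why only OST0000 shows up in them.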
| Comment by Jay Lan (Inactive) [ 22/Jan/14 ] |
|
Hi Niu, was the extra check you removed in 8954 not needed in the first place? Could removing the check open the door to a different race condition? |
| Comment by Niu Yawei (Inactive) [ 23/Jan/14 ] |
|
Hi Jay, I can't think of any other race for now. Let's see what the patch inspectors think. |
| Comment by Bob Glossman (Inactive) [ 19/Feb/14 ] |
|
Backport to b2_5: |
| Comment by Peter Jones [ 21/Feb/14 ] |
|
Landed for 2.5.1 and 2.6. |
| Comment by Niu Yawei (Inactive) [ 15/Jul/14 ] |