[LU-4482] OST grants bugs Created: 14/Jan/14 Updated: 25/Feb/14 Resolved: 25/Feb/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Alexey Lyashkov | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB |
| Severity: | 3 |
| Rank (Obsolete): | 12273 |
| Description |
|
Lustre: DEBUG MARKER: == sanity test 63a: Verify oig_wait interruption does not crash ========= 13:06:48 (1389690408) |
| Comments |
| Comment by Oleg Drokin [ 15/Jan/14 ] |
|
I just checked my logs and I frequently see this when an OST is full. It has probably been there for a while, since I see it all the way back to when my logs started. We need to get to the root of this, as it can potentially lead to unexpected data loss on the client side. |
| Comment by Oleg Drokin [ 15/Jan/14 ] |
|
Also, it seems to have started around May 25, 2013 in my test logs, as I now see |
| Comment by Peter Jones [ 15/Jan/14 ] |
|
Niu, could you please look into this one? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 16/Jan/14 ] |
|
Seems it's introduced by " Xiong, could you take a look at this? I think it's an unintentional change, right? |
| Comment by Jinshan Xiong (Inactive) [ 16/Jan/14 ] |
|
It was changed that way on purpose, because I think there is no need to consume grant when the application can see the errors with a sync write. If that can cause a grant issue, then the grant algorithm has bugs, because pages without the FROM_GRANT flag shouldn't consume reserved space. |
| Comment by Niu Yawei (Inactive) [ 16/Jan/14 ] |
Any kind of write (including sync write or direct I/O) should consume grant if the client has available grant (and the FROM_GRANT flag should be set on these pages); otherwise, the OST could run out of space with the client still holding lots of grant. |
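The rule stated in this comment can be sketched as follows. This is only an illustration, not Lustre code: `Client`, `avail_grant`, and `prepare_write` are hypothetical names, and the return value stands in for the FROM_GRANT flag on the pages.

```python
from dataclasses import dataclass


@dataclass
class Client:
    # Hypothetical field: bytes of grant the client currently holds.
    avail_grant: int


def prepare_write(client: Client, nbytes: int) -> bool:
    """Charge a write to grant when possible.

    Returns True when the write was charged to the client's grant
    (the FROM_GRANT case), False when no grant was available.
    The same rule applies to cached, sync, and direct I/O writes.
    """
    if client.avail_grant >= nbytes:
        client.avail_grant -= nbytes
        return True
    return False


client = Client(avail_grant=8192)
flags = [prepare_write(client, 4096) for _ in range(3)]
print(flags, client.avail_grant)
```

Under this assumption, the first two 4096-byte writes are charged to grant and the third is not, because the client has nothing left to charge it to.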
| Comment by Jinshan Xiong (Inactive) [ 16/Jan/14 ] |
|
Obviously the issue here is not about ENOSPC; the reserved space is less than the granted bytes. Did I miss something? |
| Comment by Niu Yawei (Inactive) [ 16/Jan/14 ] |
|
If a sync write doesn't consume grant, the grant held by the client is not decreased by the sync write, but the free space on the OST is. In the end, the OST space will be used up by sync writes while the client still holds grant, and any further cached data will be lost. The error message shows that the available space is less than the total granted bytes (which means the client has grant, but the OST doesn't have enough space to back it). That is because a sync write doesn't consume grant but does consume space. |
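A toy model of the accounting described in this comment, under heavily simplified assumptions: one client, fixed-size sync writes, and no re-granting by the OST. `simulate`, `ost_free`, and `client_grant` are hypothetical names, not Lustre identifiers.

```python
def simulate(sync_consumes_grant: bool, writes: int = 8, size: int = 4096):
    """Return (ost_free, client_grant) after a series of sync writes."""
    ost_free = 10 * size      # free bytes left on the OST
    client_grant = 4 * size   # grant held by the client
    for _ in range(writes):
        if ost_free < size:
            break             # OST is out of space
        ost_free -= size      # every write consumes OST space
        if sync_consumes_grant and client_grant >= size:
            client_grant -= size  # charge the write to grant
    return ost_free, client_grant


for policy in (False, True):
    free, grant = simulate(policy)
    print(f"sync_consumes_grant={policy}: free={free} granted={grant} "
          f"grant_backed={free >= grant}")
```

In this model, when sync writes skip grant, the client ends up holding more grant than the OST has free space, which matches the condition reported in the error message. When sync writes are charged to grant, the outstanding grant stays within the free space.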
| Comment by Niu Yawei (Inactive) [ 18/Jan/14 ] |
|
Well, there are two problems:
|
| Comment by Niu Yawei (Inactive) [ 18/Jan/14 ] |
| Comment by Niu Yawei (Inactive) [ 25/Feb/14 ] |
|
Patch landed on master for 2.6. |