[LU-11288] tgt_grant_sanity_check()) LBUG Created: 28/Aug/18 Updated: 19/Apr/21 Resolved: 29/Oct/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Hit this in racer on current master-next, but it does not appear to be caused by anything unlanded.

[ 2800.273242] LustreError: 20863:0:(tgt_grant.c:151:tgt_check_export_grants()) lustre-OST0002: cli e57bac33-ee31-2bdc-225e-2658736d80ff/ffff880251df1800 ted_grant(1142554624) + ted_pending(0) > maxsize(250609664)
[ 2800.312893] LustreError: 20863:0:(tgt_grant.c:223:tgt_grant_sanity_check()) LBUG
[ 2800.319055] Pid: 20863, comm: ll_ost_create07 3.10.0-7.5-debug #1 SMP Sun Jun 3 13:35:38 EDT 2018
[ 2800.321344] Call Trace:
[ 2800.322479] [<ffffffffa01cd7dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 2800.327468] [<ffffffffa01cd88c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 2800.328911] [<ffffffffa0bbbf7c>] tgt_grant_sanity_check+0x51c/0x550 [ptlrpc]
[ 2800.333566] [<ffffffffa127dc14>] ofd_statfs+0x104/0x480 [ofd]
[ 2800.334982] [<ffffffffa12712e0>] ofd_statfs_hdl+0x70/0x280 [ofd]
[ 2800.335938] LustreError: 3246:0:(tgt_grant.c:151:tgt_check_export_grants()) lustre-OST0002: cli e57bac33-ee31-2bdc-225e-2658736d80ff/ffff880251df1800 ted_grant(1142554624) + ted_pending(0) > maxsize(250609664)
[ 2800.335961] LustreError: 3246:0:(tgt_grant.c:223:tgt_grant_sanity_check()) LBUG
[ 2800.344690] [<ffffffffa0ba0705>] tgt_request_handle+0xaf5/0x1590 [ptlrpc]
[ 2800.348268] [<ffffffffa0b44e26>] ptlrpc_server_handle_request+0x256/0xad0 [ptlrpc]
[ 2800.350975] [<ffffffffa0b48c1e>] ptlrpc_main+0xabe/0x1f80 [ptlrpc]
[ 2800.352543] [<ffffffff810ae864>] kthread+0xe4/0xf0
[ 2800.354434] [<ffffffff81783777>] ret_from_fork_nospec_end+0x0/0x39
[ 2800.356861] [<ffffffffffffffff>] 0xffffffffffffffff
[ 2800.360929] Kernel panic - not syncing: LBUG
[ 2800.360945] Pid: 3246, comm: ll_ost_create00 3.10.0-7.5-debug #1 SMP Sun Jun 3 13:35:38 EDT 2018
[ 2800.360945] Call Trace:
[ 2800.360972] [<ffffffffa01cd7dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 2800.360977] [<ffffffffa01cd88c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 2800.361073] [<ffffffffa0bbbf7c>] tgt_grant_sanity_check+0x51c/0x550 [ptlrpc]
[ 2800.361090] [<ffffffffa127dc14>] ofd_statfs+0x104/0x480 [ofd]
[ 2800.361093] [<ffffffffa12712e0>] ofd_statfs_hdl+0x70/0x280 [ofd]
[ 2800.361125] [<ffffffffa0ba0705>] tgt_request_handle+0xaf5/0x1590 [ptlrpc]
[ 2800.361152] [<ffffffffa0b44e26>] ptlrpc_server_handle_request+0x256/0xad0 [ptlrpc]
[ 2800.361190] [<ffffffffa0b48c1e>] ptlrpc_main+0xabe/0x1f80 [ptlrpc]
[ 2800.361196] [<ffffffff810ae864>] kthread+0xe4/0xf0
[ 2800.361200] [<ffffffff81783777>] ret_from_fork_nospec_end+0x0/0x39
[ 2800.361203] [<ffffffffffffffff>] 0xffffffffffffffff |
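The condition reported by tgt_check_export_grants() in the log above is ted_grant + ted_pending > maxsize. When triaging similar reports, it can help to pull the three values out of the message and see by how much the export's grant exceeds the limit. The following is only a minimal sketch of that arithmetic. It is not the Lustre implementation, and the helper names `parse_grant_overrun` and `overrun_bytes` are hypothetical. The regular expression assumes the message format shown in this ticket.

```python
import re

# Matches the values printed by the tgt_check_export_grants() message
# as it appears in this ticket (format assumed from the log above).
GRANT_RE = re.compile(
    r"ted_grant\((\d+)\) \+ ted_pending\((\d+)\) > maxsize\((\d+)\)"
)


def parse_grant_overrun(log_line):
    """Return (ted_grant, ted_pending, maxsize) as ints, or None if absent."""
    match = GRANT_RE.search(log_line)
    if match is None:
        return None
    return tuple(int(value) for value in match.groups())


def overrun_bytes(ted_grant, ted_pending, maxsize):
    """Bytes by which ted_grant + ted_pending exceeds maxsize (<= 0 if within limit)."""
    return ted_grant + ted_pending - maxsize


# Message text copied from the description of this ticket.
line = (
    "LustreError: 20863:0:(tgt_grant.c:151:tgt_check_export_grants()) "
    "lustre-OST0002: cli e57bac33-ee31-2bdc-225e-2658736d80ff/ffff880251df1800 "
    "ted_grant(1142554624) + ted_pending(0) > maxsize(250609664)"
)

values = parse_grant_overrun(line)
if values is not None:
    grant, pending, maxsize = values
    excess = overrun_bytes(grant, pending, maxsize)
    print(f"grant={grant} pending={pending} maxsize={maxsize} excess={excess}")
```

For the line in this report, ted_pending is 0, so the entire excess comes from ted_grant, which is several times larger than maxsize.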
| Comments |
| Comment by Oleg Drokin [ 28/Aug/18 ] |
|
First recorded failure of this is on July 27th. |
| Comment by Peter Jones [ 30/Aug/18 ] |
|
Bobijam Is this related to your first Peter |
| Comment by Zhenyu Xu [ 31/Aug/18 ] |
|
|
| Comment by Oleg Drokin [ 31/Aug/18 ] |
|
ok, thanks. I'll try to revert that patch locally and see if it makes any difference. |
| Comment by Peter Jones [ 04/Sep/18 ] |
|
Oleg has confirmed that reverting the patch to enable grant shrink stops this failure from appearing in his testing. However, based on Bobijam's analysis, that means the bug is still there and is simply no longer being exposed. |
| Comment by Peter Jones [ 15/Sep/18 ] |
|
Bobijam, have you been able to make any progress on identifying the problem with the grant shrink algorithm? Peter |
| Comment by Zhenyu Xu [ 17/Sep/18 ] |
|
Not yet. This code path hasn't been used for a long time, and I'm not familiar with the grant code either. |
| Comment by Peter Jones [ 17/Sep/18 ] |
|
Ok then let's have Alex handle this - thanks for your analysis so far! |
| Comment by Alex Zhuravlev [ 18/Sep/18 ] |
|
any details on how to reproduce that?
|
| Comment by Gerrit Updater [ 24/Sep/18 ] |
|
Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33226 |
| Comment by Gerrit Updater [ 29/Oct/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33226/ |
| Comment by Peter Jones [ 29/Oct/18 ] |
|
Landed for 2.12 |