[LU-12322] negative grant and tgt_grant.c:561:tgt_grant_incoming() LBUG Created: 21/May/19 Updated: 27/May/19 Resolved: 21/May/19 |
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Stephane Thiell | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS 7.6 |
| Attachments: |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
New LBUG tonight with 2.10.7, likely a duplicate of

[219417.266382] LustreError: 281433:0:(tgt_grant.c:559:tgt_grant_incoming()) oak-OST0053: cli 0eb5afeb-9924-327f-d61d-428dac6cb441/ffff883c7ef04800 dirty 0 pend 0 grant -29360128
[219417.283969] LustreError: 281433:0:(tgt_grant.c:561:tgt_grant_incoming()) LBUG
[219417.292035] Pid: 281433, comm: ll_ost00_045 3.10.0-693.2.2.el7_lustre.pl3.x86_64 #1 SMP Thu Mar 15 13:06:45 PDT 2018
[219417.303881] Call Trace:
[219417.306716] [<ffffffff8103a212>] save_stack_trace_tsk+0x22/0x40
[219417.313560] [<ffffffffc08087cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[219417.320982] [<ffffffffc080887c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[219417.328013] [<ffffffffc0bf81b0>] tgt_grant_prepare_read+0x0/0x3b0 [ptlrpc]
[219417.335979] [<ffffffffc0bf82bb>] tgt_grant_prepare_read+0x10b/0x3b0 [ptlrpc]
[219417.344121] [<ffffffffc119df6d>] ofd_set_info_hdl+0x23d/0x4a0 [ofd]
[219417.351343] [<ffffffffc0bda115>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[219417.359202] [<ffffffffc0b82dd6>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[219417.367930] [<ffffffffc0b86512>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[219417.375101] [<ffffffff810b098f>] kthread+0xcf/0xe0
[219417.380671] [<ffffffff816b4f58>] ret_from_fork+0x58/0x90
[219417.386823] [<ffffffffffffffff>] 0xffffffffffffffff
[219417.392529] Kernel panic - not syncing: LBUG

Prior to this, there were a lot of evictions and network errors:

[218920.410034] LustreError: 257942:0:(events.c:449:server_bulk_callback()) event type 5, status -125, desc ffff881d0b310a00
[218923.029069] LustreError: 257941:0:(events.c:449:server_bulk_callback()) event type 5, status -125, desc ffff882566547000
[218928.063611] LustreError: 257941:0:(events.c:449:server_bulk_callback()) event type 5, status -125, desc ffff880050289600
[218957.709743] Lustre: oak-OST004d: haven't heard from client ed27e8aa-82e0-d7bd-37eb-95d83ba476b8 (at 10.8.7.32@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c83a64000, cur 1558412848 expire 1558412698 last 1558412621

Attaching vmcore-dmesg, the output of basic crash commands, and foreach bt. Please note that (just in case) I didn't have the patch for
| Comments |
| Comment by Peter Jones [ 21/May/19 ] |
Hi Stephane,

We reviewed the ticket and agree that this looks to be a duplicate of

Peter
| Comment by Stephane Thiell [ 25/May/19 ] |
I got the same kind of crash with 2.12 today. Exact same scenario as with 2.10: a bunch of bulk I/O errors (event type 5, status -125), a negative grant, then the crash. More than 5 OSS crashed today on Fir. We had to remount the OSTs with -o abort_recov; then everything went back to normal.
| Comment by Stephane Thiell [ 25/May/19 ] |
Peter,

[Sat May 25 07:46:41 2019]fir-io2-s1 login: [ 3622.566306] Lustre: fir-OST000e: haven't heard from client df23de56-dd2e-8a63-a8d5-338686a662ec (at 10.8.17.18@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9530d128d000, cur 1558795971 expire 1558795821 last 1558795744
[Sat May 25 08:04:13 2019][ 4304.680584] Lustre: fir-OST000c: Connection restored to d7e6489c-4ac3-8ad5-951f-ff4462d87a95 (at 10.8.15.2@o2ib6)
[Sat May 25 08:04:13 2019][ 4304.690868] Lustre: Skipped 17 previous similar messages
[Sat May 25 08:05:29 2019][ 4380.571168] Lustre: fir-OST0014: haven't heard from client 25c68a06-73fb-c817-93ec-98f34b969f96 (at 10.8.23.14@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff952a0dde9c00, cur 1558796729 expire 1558796579 last 1558796502
[Sat May 25 08:05:29 2019][ 4380.592942] Lustre: Skipped 53 previous similar messages
[Sat May 25 08:20:16 2019][ 5267.591216] Lustre: fir-OST000c: Connection restored to 25c68a06-73fb-c817-93ec-98f34b969f96 (at 10.8.23.14@o2ib6)
[Sat May 25 08:20:16 2019][ 5267.601571] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:36:11 2019][ 6223.053938] Lustre: fir-OST000c: Connection restored to f0ba9527-4000-78d5-07ea-44ae00898a99 (at 10.8.15.5@o2ib6)
[Sat May 25 08:36:11 2019][ 6223.064213] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:36:36 2019][ 6247.582854] Lustre: fir-OST000c: Connection restored to e267eed0-45c3-44be-99a6-206acecf07ea (at 10.8.15.6@o2ib6)
[Sat May 25 08:36:36 2019][ 6247.593114] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:40:43 2019][ 6495.331104] Lustre: fir-OST000c: Connection restored to 85720600-89f3-ddb2-047a-1b853ce6c3da (at 10.8.21.24@o2ib6)
[Sat May 25 08:40:43 2019][ 6495.341461] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:41:10 2019][ 6521.769152] Lustre: fir-OST000c: Connection restored to b8bfc82f-1f8e-8596-4178-06354c0fe0f1 (at 10.8.17.15@o2ib6)
[Sat May 25 08:41:10 2019][ 6521.779522] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:41:19 2019][ 6531.255514] Lustre: fir-OST000c: Connection restored to e6c11c37-f182-6c12-4d0c-27449ff35ca8 (at 10.8.17.16@o2ib6)
[Sat May 25 08:41:19 2019][ 6531.265922] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:41:37 2019][ 6548.836936] Lustre: fir-OST000c: Connection restored to 834edbb7-1073-058c-c5af-7281bda3b502 (at 10.8.17.17@o2ib6)
[Sat May 25 08:41:37 2019][ 6548.847289] Lustre: Skipped 9 previous similar messages
[Sat May 25 08:41:56 2019][ 6568.361589] Lustre: fir-OST000c: Connection restored to 29c83ccc-4cd0-0a75-8213-b42166238c97 (at 10.8.17.19@o2ib6)
[Sat May 25 08:41:56 2019][ 6568.371960] Lustre: Skipped 17 previous similar messages
[Sat May 25 08:42:49 2019][ 6620.986599] Lustre: fir-OST000c: Connection restored to f70dbc8f-f692-ceed-bbff-b6639edcae72 (at 10.9.113.8@o2ib4)
[Sat May 25 08:42:49 2019][ 6620.996948] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:44:04 2019][ 6696.251816] Lustre: fir-OST000c: Connection restored to dc489b81-8258-354a-1cb8-0896baf24eea (at 10.8.12.24@o2ib6)
[Sat May 25 08:44:04 2019][ 6696.262177] Lustre: Skipped 17 previous similar messages
[Sat May 25 08:47:59 2019][ 6931.536750] Lustre: fir-OST000c: Connection restored to dca21337-fdba-5128-347e-592b37646902 (at 10.9.108.60@o2ib4)
[Sat May 25 08:47:59 2019][ 6931.547220] Lustre: Skipped 5 previous similar messages
[Sat May 25 08:55:44 2019][ 7396.501593] LustreError: 79137:0:(tgt_grant.c:563:tgt_grant_incoming()) fir-OST0012: cli ebef6758-802b-3d88-0fb7-39f9e3a97c72/ffff952bb4d08000 dirty 0 pend 0 grant -29360128
[Sat May 25 08:55:44 2019][ 7396.517073] LustreError: 79137:0:(tgt_grant.c:565:tgt_grant_incoming()) LBUG
[Sat May 25 08:55:44 2019][ 7396.524130] Pid: 79137, comm: ll_ost_io02_069 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Fri Dec 7 14:50:35 PST 2018
[Sat May 25 08:55:44 2019][ 7396.534427] Call Trace:
[Sat May 25 08:55:44 2019][ 7396.536895] [<ffffffffc0c217cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[Sat May 25 08:55:44 2019][ 7396.543496] [<ffffffffc0c2187c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[Sat May 25 08:55:44 2019][ 7396.549728] [<ffffffffc1475d20>] tgt_grant_prepare_read+0x0/0x3b0 [ptlrpc]
[Sat May 25 08:55:44 2019][ 7396.556804] [<ffffffffc1475e2b>] tgt_grant_prepare_read+0x10b/0x3b0 [ptlrpc]
[Sat May 25 08:55:44 2019][ 7396.564043] [<ffffffffc1861c00>] ofd_preprw+0x450/0x1160 [ofd]
[Sat May 25 08:55:44 2019][ 7396.569998] [<ffffffffc1459bab>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[Sat May 25 08:55:44 2019][ 7396.576479] [<ffffffffc145873a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[Sat May 25 08:55:45 2019][ 7396.583429] [<ffffffffc13fcd0b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[Sat May 25 08:55:45 2019][ 7396.591167] [<ffffffffc140063c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
[Sat May 25 08:55:45 2019][ 7396.597506] [<ffffffffabcc1c31>] kthread+0xd1/0xe0
[Sat May 25 08:55:45 2019][ 7396.602421] [<ffffffffac374c24>] ret_from_fork_nospec_begin+0xe/0x21
[Sat May 25 08:55:45 2019][ 7396.608915] [<ffffffffffffffff>] 0xffffffffffffffff
[Sat May 25 08:55:45 2019][ 7396.613949] Kernel panic - not syncing: LBUG

A patch to avoid the OSS LBUG would be a good first step, I guess.
| Comment by Peter Jones [ 25/May/19 ] |
Sure. We'll get someone who is not out for the Memorial Day long weekend to look at it.
| Comment by Quentin Bouget [ 27/May/19 ] |
Hello,

I've looked at the code a little, and here is what I found (I hope it helps).

I suppose this is now obvious to anyone who has read Stephane's description: the LBUG is triggered by a negative value of ted_grant. I first thought it happened in tgt_grant_incoming() itself, because of this line:
ted->ted_grant -= dropped;
But this assignment is actually protected:
dropped = oa->o_dropped;
...
if (ted->ted_grant < dropped) {
        CDEBUG(D_CACHE, ...);
        dropped = 0;
}
...
ted->ted_grant -= dropped;
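For reference, the consistency check that actually fires sits at the top of tgt_grant_incoming() (tgt_grant.c:559-561 in 2.10.7). I am reconstructing it here from the console output above, so treat the exact CERROR wording as approximate:

        /* reconstructed from the console messages above; the exact
         * format string is an approximation */
        if (ted->ted_grant < 0 || ted->ted_pending < 0 || ted->ted_dirty < 0) {
                CERROR("%s: cli %s/%p dirty %ld pend %ld grant %ld\n",
                       exp->exp_obd->obd_name, exp->exp_client_uuid.uuid, exp,
                       ted->ted_dirty, ted->ted_pending, ted->ted_grant);
                spin_unlock(&tgd->tgd_grant_lock);
                LBUG();
        }

Any of the per-export grant counters going negative takes this path, which matches the "dirty 0 pend 0 grant -29360128" messages in both crashes.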
Guided by LU-6877, I looked for a potential error during an arithmetic conversion:
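To make explicit the kind of slip I was looking for, here is a hypothetical stand-alone snippet (none of these names are from the Lustre tree) where an implicit signed-to-unsigned conversion lets a negative grant slip past a guard:

#include <stdio.h>

int main(void)
{
        long grant = -29360128;         /* already-corrupted grant counter */
        unsigned long long dropped = 4096;

        /* The usual arithmetic conversions promote 'grant' to unsigned
         * long long, turning it into a huge positive value, so this
         * underflow guard never fires. */
        if (grant < dropped)
                printf("guard fires, dropped is clamped\n");
        else
                printf("guard silently skipped, grant stays negative\n");
        return 0;
}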
I could not find any (though it looked promising at first, since long is only guaranteed to be at least 32 bits while long long is at least 64). From there I concluded that ted->ted_grant must be set to a negative value elsewhere/earlier. Repeating a similar analysis, I could only find one place where this can happen [*]:

static void tgt_grant_shrink(struct obd_export *exp, struct obdo *oa,
                             u64 left_space)
{
        struct tg_export_data *ted = &exp->exp_target_data;
        struct obd_device *obd = exp->exp_obd;
        struct tg_grants_data *tgd = &obd->u.obt.obt_lut->lut_tgd;
        long grant_shrink;

        assert_spin_locked(&tgd->tgd_grant_lock);
        LASSERT(exp);
        if (left_space >= tgd->tgd_tot_granted_clients *
                          TGT_GRANT_SHRINK_LIMIT(exp))
                return;

        grant_shrink = oa->o_grant;

        ted->ted_grant -= grant_shrink;
        tgd->tgd_tot_granted -= grant_shrink;
        CDEBUG(D_CACHE, "%s: cli %s/%p shrink %ld ted_grant %ld total %llu\n",
               obd->obd_name, exp->exp_client_uuid.uuid, exp, grant_shrink,
               ted->ted_grant, tgd->tgd_tot_granted);

        /* client has just released some grant, don't grant any space back */
        oa->o_grant = 0;
}

All the other places where ted_grant is set seem to be protected by appropriate checks. Now, tgt_grant_shrink() is only called from two places: tgt_grant_prepare_read() and tgt_grant_prepare_write(). The odd part is that in tgt_grant_prepare_read() it is called after tgt_grant_incoming() (the function that raises the LBUG).

My understanding is that there is a workqueue on the client side that periodically updates a client's grant size (cf. ost_grant_work_handler()). The update RPC is sent asynchronously and a callback is set to process the server's response:

static int osc_shrink_grant_interpret(const struct lu_env *env,
                                      struct ptlrpc_request *req,
                                      void *args, int rc)
{
        struct osc_grant_args *aa = args;
        struct client_obd *cli = &req->rq_import->imp_obd->u.cli;
        struct ost_body *body;

        if (rc != 0) {
                __osc_update_grant(cli, aa->aa_oa->o_grant);
                GOTO(out, rc);
        }
        body = req_capsule_server_get(&req->rq_pill, &RMF_OST_BODY);
        LASSERT(body);
        osc_update_grant(cli, body);
out:
        OBD_SLAB_FREE_PTR(aa->aa_oa, osc_obdo_kmem);
        return rc;
}

...

int osc_set_info_async(...)
{
        ...
        if (KEY_IS(KEY_GRANT_SHRINK)) {
                ...
                req->rq_interpret_reply = osc_shrink_grant_interpret;
        }
        ...
}

So at some point the client probably receives a bogus value from the server, and the next RPC triggers the LBUG.

Where to go from there? I could not figure out where exp->exp_target_data comes from in tgt_grant_shrink(). If someone can, we should be able to find the root cause of all this.

[*] I only looked at places in the code where ted_grant is explicitly set (grep -R 'ted->ted_grant\s+[-+]?=' lustre). However, it is possible that somewhere the whole ted structure is set at once and I missed it (for example during RPC packing/unpacking).
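As a first step toward the patch Stephane suggested, tgt_grant_shrink() could clamp the shrink amount instead of letting ted_grant go negative. A minimal sketch, untested and with message wording of my own; it only papers over the bogus client value rather than fixing its origin:

        grant_shrink = oa->o_grant;
        if (grant_shrink > ted->ted_grant) {
                /* hypothetical clamp: never let ted_grant go negative,
                 * so the sanity check in tgt_grant_incoming() cannot
                 * trip the LBUG on the next RPC */
                CERROR("%s: cli %s/%p asks to shrink %ld but only %ld granted\n",
                       obd->obd_name, exp->exp_client_uuid.uuid, exp,
                       grant_shrink, ted->ted_grant);
                grant_shrink = ted->ted_grant;
        }
        ted->ted_grant -= grant_shrink;
        tgd->tgd_tot_granted -= grant_shrink;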