[LU-15550] WBC: retry the batched RPC when the reply buffer is overflowed Created: 11/Feb/22 Updated: 16/Jun/23 Resolved: 01/May/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Qian Yingjin | Assignee: | Qian Yingjin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Before sending the batched RPC, the client has no way to know the actual reply buffer size that will be needed. When the needed reply buffer size is larger than BUT_MAXREPSIZE (1000 * 1024), the server returns the -EOVERFLOW error code. At that point the server has executed only some of the sub requests in the batched RPC; the sub requests that overflowed the reply buffer are not executed. Thus the client needs a retry mechanism: on finding that the reply buffer overflowed, the client rebuilds the batched RPC from the sub requests that were not executed on the server and resends it so that they are executed; see the sketch below. |
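A minimal sketch of such a retry loop, assuming hypothetical names (struct batch_request, batch_send(), batch_rebuild_unexecuted()) in place of the real Lustre batch machinery; the actual implementation in the patch may differ:

        /* Sketch only: resend a batched RPC until every sub request has
         * been executed on the server.  batch_send() and
         * batch_rebuild_unexecuted() are hypothetical helpers, not the
         * actual Lustre API. */
        static int batch_send_with_retry(struct batch_request *breq)
        {
                int rc;

                while ((rc = batch_send(breq)) == -EOVERFLOW) {
                        /* The server filled the reply buffer before
                         * processing all sub requests; rebuild the batch
                         * from the unexecuted remainder and resend it. */
                        rc = batch_rebuild_unexecuted(breq);
                        if (rc != 0)
                                break;
                }
                return rc;
        }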
| Comments |
| Comment by Gerrit Updater [ 17/Feb/22 ] |
|
"Yingjin Qian <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46540 |
| Comment by Gerrit Updater [ 01/May/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/46540/ |
| Comment by Peter Jones [ 01/May/23 ] |
|
Landed for 2.16 |
| Comment by Alex Zhuravlev [ 09/May/23 ] |
|
please check:
[ 4604.470775] Lustre: DEBUG MARKER: == sanity test 123f: Retry mechanism with large wide striping files ========================================================== 15:59:19 (1683388759)
[ 4668.109917] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x280000bd0 to 0x2800013a0
[ 4668.697696] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x2c0000bd0 to 0x2c00013a0
[ 4791.809264] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x2c00013a0 to 0x2c00013a1
[ 4791.859391] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x2800013a0 to 0x2800013a1
[ 4809.801406] ------------[ cut here ]------------
[ 4809.801558] Max IOV exceeded: 257 should be < 256
[ 4809.801678] WARNING: CPU: 0 PID: 116874 at /home/lustre/master-mine/lnet/lnet/lib-md.c:257 lnet_md_build.part.1+0x4b3/0x770 [lnet]
[ 4809.801847] Modules linked in: lustre(O) ofd(O) osp(O) lod(O) ost(O) mdt(O) mdd(O) mgs(O) osd_zfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O) zfs(O) zunicode(O) zzstd(O) zlua(O) zcommon(O) znvpair(O) zavl(O) icp(O) spl(O) [last unloaded: libcfs]
[ 4809.802263] CPU: 0 PID: 116874 Comm: mdt_out00_000 Tainted: G W O --------- - - 4.18.0 #2
[ 4809.802389] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 4809.802492] RIP: 0010:lnet_md_build.part.1+0x4b3/0x770 [lnet]
[ 4809.802589] Code: 00 00 48 c7 05 da 55 07 00 00 00 00 00 e8 f5 88 fb ff e9 05 fc ff ff ba 00 01 00 00 89 ee 48 c7 c7 38 49 7d c0 e8 c3 d7 91 d6 <0f> 0b e9 f6 fc ff ff f6 05 4b 38 fd ff 10 49 c7 c4 f4 ff ff ff 0f
[ 4809.802838] RSP: 0018:ffff972524367c10 EFLAGS: 00010282
[ 4809.802918] RAX: 0000000000000025 RBX: 0000000000000168 RCX: 0000000000000007
[ 4809.803028] RDX: 0000000000000007 RSI: 0000000000000022 RDI: ffff97256cbe5450
[ 4809.803147] RBP: 0000000000000101 R08: 000004cd62abfed5 R09: 0000000000000000
[ 4809.803257] R10: 0000000000000001 R11: 00000000ffffffff R12: ffff97247bb2a000
[ 4809.803367] R13: 0000000000000000 R14: ffff972242f00168 R15: ffff972524367c90
[ 4809.803477] FS: 0000000000000000(0000) GS:ffff97256ca00000(0000) knlGS:0000000000000000
[ 4809.803587] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4809.803679] CR2: 000055aeee75bf44 CR3: 00000003104c2000 CR4: 00000000000006b0
[ 4809.803792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4809.803910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4809.804020] Call Trace:
[ 4809.804074] LNetMDBind+0x3e/0x380 [lnet]
[ 4809.804193] ptl_send_buf+0x117/0x570 [ptlrpc]
[ 4809.804307] ? at_measured+0x1eb/0x300 [ptlrpc]
[ 4809.804419] ? reply_out_callback+0x390/0x390 [ptlrpc]
[ 4809.804529] ptlrpc_send_reply+0x2b4/0x8f0 [ptlrpc]
[ 4809.804647] target_send_reply+0x343/0x770 [ptlrpc]
[ 4809.804768] tgt_request_handle+0x9ca/0x1a40 [ptlrpc]
[ 4809.804885] ? lustre_msg_get_transno+0x6f/0xd0 [ptlrpc]
[ 4809.804998] ptlrpc_main+0x1784/0x32a0 [ptlrpc]
[ 4809.805083] ? __kthread_parkme+0x33/0x90
[ 4809.805172] ? ptlrpc_wait_event+0x4b0/0x4b0 [ptlrpc]
[ 4809.805254] kthread+0x129/0x140
[ 4809.805314] ? kthread_flush_work_fn+0x10/0x10
[ 4809.805391] ret_from_fork+0x1f/0x30
[ 4809.805450] ---[ end trace 02ea061606b4429e ]--- |
| Comment by Qian Yingjin [ 09/May/23 ] |
|
Is there any stable reproducer for this panic? It seems there is something wrong in the calculation of the message buffer size... |
| Comment by Alex Zhuravlev [ 09/May/23 ] |
no, just a single hit so far, local testing. |
| Comment by Qian Yingjin [ 10/May/23 ] |
|
The panic was hit in Maloo testing as well... |
| Comment by Qian Yingjin [ 10/May/23 ] |
|
Just adding a single line (marked below) in the calculation of the reply buffer size:

        max = BUT_MAXREPSIZE - req->rq_replen;
        if (used_len + msg_len > len)
                len = used_len + msg_len;
        if (len > max)
                len += max;
        else
                len += len;
        len += 10 * PAGE_SIZE;  /* <--- the single added line */
        rc = req_capsule_server_grow(&req->rq_pill,
                                     &RMF_BUT_REPLY, len);
It caused the panic:
[ 6059.646103] Max IOV exceeded: 262 should be < 256
[ 6059.646267] RIP: 0010:lnet_md_build.part.10+0x525/0x790 [lnet]
[ 6059.646324] LNetMDBind+0x48/0x380 [lnet]
[ 6059.646408] ptl_send_buf+0x144/0x5a0 [ptlrpc]
[ 6059.646436] ? ptlrpc_ni_fini+0x60/0x60 [ptlrpc]
[ 6059.646461] ptlrpc_send_reply+0x2ad/0x8d0 [ptlrpc]
[ 6059.646493] target_send_reply+0x324/0x7d0 [ptlrpc]
[ 6059.646526] tgt_request_handle+0xe81/0x1920 [ptlrpc]
[ 6059.646554] ptlrpc_server_handle_request+0x31d/0xbc0 [ptlrpc]
[ 6059.646580] ? lprocfs_counter_add+0x12a/0x1a0 [obdclass]
[ 6059.646607] ptlrpc_main+0xc4e/0x1510 [ptlrpc]
[ 6059.646610] ? __schedule+0x2cc/0x700
[ 6059.646637] ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
[ 6059.646639] kthread+0x116/0x130
[ 6059.646640] ? kthread_flush_work_fn+0x10/0x10
[ 6059.646641] ret_from_fork+0x1f/0x40
[ 6059.646642] ---[ end trace d2884932fb9123c4 ]---
The calculation of the maximum allowed reply buffer size is incorrect. |
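For reference, a sketch (an assumption, not the landed fix) of a growth calculation that cannot exceed the limit, using the same variable names as the snippet above:

        max = BUT_MAXREPSIZE - req->rq_replen;
        if (used_len + msg_len > len)
                len = used_len + msg_len;
        /* Double the buffer to amortize repeated regrowth, but clamp it
         * so req->rq_replen + len can never exceed BUT_MAXREPSIZE
         * (1000 * 1024 bytes, i.e. about 250 pages, safely below the
         * LNET_MAX_IOV = 256 pages of a single reply MD).  Sketch only;
         * the merged patch may solve this differently. */
        len = min_t(int, 2 * len, max);
        rc = req_capsule_server_grow(&req->rq_pill,
                                     &RMF_BUT_REPLY, len);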