[LU-4755] ASSERTION( req->rq_reqbuf_len >= msgsize ) failed when using 4MB RPC Created: 12/Mar/14 Updated: 17/Sep/15 Resolved: 16/Jul/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.7.0, Lustre 2.5.3 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Li Xi (Inactive) | Assignee: | Cliff White (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 13084 | ||||||||
| Description |
|
After running "lctl set_param osc.*.max_pages_per_rpc=4M", we ran a 4k random write test with IOR and easily hit this problem. After some testing, we found that the largest value of msgsize is 16384 for bulk writes, while req->rq_reqbuf_len is only 8192. |
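A rough sketch of how numbers of this magnitude could arise, assuming each non-contiguous 4k page in a bulk write needs its own 16-byte descriptor in the request. The struct and function names below are hypothetical stand-ins, not Lustre code, and the fixed request headers are deliberately left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one remote I/O buffer descriptor. */
struct sketch_niobuf {
	uint64_t offset; /* file offset of the fragment */
	uint32_t len;    /* fragment length in bytes */
	uint32_t flags;  /* per-fragment flags */
};

/* The sketch relies on a 16-byte descriptor; fail the build otherwise. */
_Static_assert(sizeof(struct sketch_niobuf) == 16,
	       "sketch assumes a 16-byte descriptor");

/* Bytes needed for the descriptor array alone (headers not counted). */
size_t sketch_niobuf_bytes(size_t fragments)
{
	return fragments * sizeof(struct sketch_niobuf);
}

/* Returns 1 when the descriptor array fits in the request buffer. */
int sketch_fits(size_t fragments, size_t reqbuf_len)
{
	return sketch_niobuf_bytes(fragments) <= reqbuf_len;
}
```

Under these assumptions, a 4 MiB RPC made of 1024 separate 4k pages needs 1024 * 16 = 16384 bytes of descriptors, which does not fit in an 8192-byte buffer, while 512 fragments would fit exactly.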
| Comments |
| Comment by Li Xi (Inactive) [ 12/Mar/14 ] |
|
With the following patch, this problem can no longer be reproduced. http://review.whamcloud.com/9599 However, I am wondering whether there is an automatic way to calculate the proper value of OST_MAXREQSIZE at compile time rather than relying on guesswork or manual calculation. |
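One way the compile-time question could be approached is to derive the limit from `sizeof` and the maximum fragment count, and let the compiler reject inconsistent values. This is only a sketch: every name and constant below (including the 512-byte header allowance) is an assumption made for illustration, and it does not reproduce the real definition of OST_MAXREQSIZE or the patch under review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte per-fragment descriptor. */
struct sketch_rnb {
	uint64_t offset;
	uint32_t len;
	uint32_t flags;
};

#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_MAX_RPC_BYTES (4u * 1024u * 1024u) /* 4 MiB RPC */
#define SKETCH_MAX_FRAGMENTS (SKETCH_MAX_RPC_BYTES / SKETCH_PAGE_SIZE)
#define SKETCH_HEADER_BYTES  512u /* assumed allowance for fixed headers */

/* Worst case: every page in the RPC is its own fragment. */
#define SKETCH_MAXREQSIZE \
	(SKETCH_HEADER_BYTES + \
	 SKETCH_MAX_FRAGMENTS * sizeof(struct sketch_rnb))

/* The build fails if the derived size cannot hold the worst-case array. */
_Static_assert(SKETCH_MAXREQSIZE >=
	       SKETCH_MAX_FRAGMENTS * sizeof(struct sketch_rnb),
	       "request buffer too small for the worst-case fragment array");

/* Exposes the derived constant so it can be checked at run time too. */
size_t sketch_maxreqsize(void)
{
	return SKETCH_MAXREQSIZE;
}
```

With this layout, changing the page size or the maximum RPC size updates the buffer limit automatically instead of requiring a hand-computed constant.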
| Comment by Peter Jones [ 14/Mar/14 ] |
|
Cliff, could you please take care of this patch? Thanks, Peter |
| Comment by Andreas Dilger [ 21/Mar/14 ] |
|
Actually, before I dismiss the patch to increase the request size, it is worthwhile to ask if there is any performance improvement from sending 4096 random pages in one RPC compared to 16 x 256 random pages in separate RPCs? It might even be faster to send parallel RPCs due to checksums running on separate cores and being handled in parallel on the OST. If there is no improvement from many random pages in one RPC, it is better to just limit the number of niobufs that the client sends in one RPC. |
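The alternative suggested here, capping the number of niobufs per RPC, comes down to splitting a set of fragments into several smaller requests. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not the client's actual RPC-building logic:

```c
#include <assert.h>
#include <stddef.h>

/* Number of RPCs needed when each RPC carries at most max_per_rpc fragments. */
size_t sketch_rpcs_needed(size_t fragments, size_t max_per_rpc)
{
	assert(max_per_rpc > 0);
	return (fragments + max_per_rpc - 1) / max_per_rpc;
}

/* Fragments carried by the RPC at 0-based index rpc_idx; 0 if past the end. */
size_t sketch_fragments_in_rpc(size_t fragments, size_t max_per_rpc,
			       size_t rpc_idx)
{
	size_t start, left;

	assert(max_per_rpc > 0);
	start = rpc_idx * max_per_rpc;
	if (start >= fragments)
		return 0;
	left = fragments - start;
	return left < max_per_rpc ? left : max_per_rpc;
}
```

With a cap of 256, 4096 random pages split into 16 RPCs, matching the 16 x 256 case in the comment. Whether those RPCs outperform one large request is the open performance question raised above.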
| Comment by Cliff White (Inactive) [ 23/Jun/14 ] |
|
The patch is currently failing the sanity test. Is it possible to submit a new patch? |
| Comment by Jodi Levi (Inactive) [ 16/Jul/14 ] |
|
Patch landed to Master. |