[LU-9290] max_pages_per_rpc can't be smaller than ZFS recordsize Created: 04/Apr/17 Updated: 20/Feb/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Erich Focht | Assignee: | Nathaniel Clark |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre 2.9.0 |
||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
At a customer we've hit
lustre/obdclass/lprocfs_status.c: osc_obd_max_pages_per_rpc_seq_write()
chunk_mask = ~((1 << (cli->cl_chunkbits - PAGE_CACHE_SHIFT)) - 1);
/* max_pages_per_rpc must be chunk aligned */
val = (val + ~chunk_mask) & chunk_mask;
if (val == 0 || (ocd->ocd_brw_size != 0 &&
    val > ocd->ocd_brw_size >> PAGE_CACHE_SHIFT)) {
        LPROCFS_CLIMP_EXIT(dev);
        return -ERANGE;
}
chunkbits is 20. It is set in lustre/osc/osc_request.c:osc_init_grant() to
cli->cl_chunkbits = max_t(int, PAGE_SHIFT, ocd->ocd_grant_blkbits);
and ocd_grant_blkbits is set to 20. Its comment line says:
/* log2 of the backend filesystem blocksize */
Once I've reduced the ZFS recordsize from 1MB to 512kB and remounted the OST, I
I believe that the value set in ocd_grant_blkbits is wrong and should actually be the ZFS ashift value (i.e. the block size), not the recordsize. |
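The effect of the alignment check quoted above can be reproduced outside the kernel. The following is a minimal sketch, not Lustre code: `align_pages_per_rpc` and `EX_PAGE_SHIFT` are hypothetical names, and a 4 KiB page size (shift of 12) is assumed. It only mirrors the rounding and range test from the snippet.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: 4 KiB pages, so the page shift is 12 (typical on x86_64). */
#define EX_PAGE_SHIFT 12

/*
 * Hypothetical helper mirroring the quoted check.
 * val       requested max_pages_per_rpc, in pages
 * chunkbits log2 of the chunk size in bytes (cl_chunkbits in the ticket)
 * brw_size  maximum bulk RPC size in bytes, 0 if unknown
 * Returns the chunk-aligned page count, or -ERANGE if it is rejected.
 */
static long align_pages_per_rpc(long val, int chunkbits, long brw_size)
{
	long chunk_mask;

	assert(chunkbits >= EX_PAGE_SHIFT);
	chunk_mask = ~((1L << (chunkbits - EX_PAGE_SHIFT)) - 1);

	/* Round up to the next multiple of the chunk size in pages. */
	val = (val + ~chunk_mask) & chunk_mask;

	if (val == 0 || (brw_size != 0 && val > (brw_size >> EX_PAGE_SHIFT)))
		return -ERANGE;
	return val;
}
```

Under these assumptions, a chunkbits of 20 gives a chunk of 256 pages, so a request for 64 pages is rounded up to 256, while a chunkbits of 19 (512kB) would round the same request up to 128.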
| Comments |
| Comment by Erich Focht [ 04/Apr/17 ] |
|
The value seems to come from lustre/osd-zfs/osd_object.c:osd_mkreg():
rc = -dmu_object_set_blocksize(osd->od_os, db->db_object,
                               osd->od_max_blksz, 0, oh->ot_tx);
|
| Comment by Peter Jones [ 04/Apr/17 ] |
|
Nathaniel, could you please advise on this one? Thanks, Peter |
| Comment by Erich Focht [ 04/Apr/17 ] |
|
Ignore my previous comment. I have no clue, yet, where the value is being set. |
| Comment by Jinshan Xiong (Inactive) [ 06/Apr/17 ] |
|
Hi Erich, the chunk size comes from ofd_brw_size, which is derived from ofd_block_bits and, in turn, from the ZFS od_max_blksz. The restriction exists because ZFS incurs a huge penalty for partial-record writes. May I ask why the customer would like to set max_pages_per_rpc to less than the record size? Essentially, this would make every single write to ZFS smaller than a record. |
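To make the relationship described in this comment concrete, here is a rough sketch of how a ZFS recordsize would translate into the smallest accepted max_pages_per_rpc, assuming the grant block bits equal log2 of the recordsize (as the comment suggests) and 4 KiB pages. `ex_log2`, `min_pages_per_rpc`, and `EX_PAGE_SHIFT` are hypothetical names for illustration, not Lustre or ZFS symbols.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, so the page shift is 12. */
#define EX_PAGE_SHIFT 12

/* Hypothetical helper: floor(log2(x)) for x > 0. */
static int ex_log2(unsigned long x)
{
	int bits = 0;

	assert(x > 0);
	while (x >>= 1)
		bits++;
	return bits;
}

/*
 * Smallest chunk-aligned max_pages_per_rpc for a given recordsize in
 * bytes, assuming chunkbits = max(page shift, log2(recordsize)).
 */
static long min_pages_per_rpc(unsigned long recordsize)
{
	int blkbits = ex_log2(recordsize);
	int chunkbits = blkbits > EX_PAGE_SHIFT ? blkbits : EX_PAGE_SHIFT;

	return 1L << (chunkbits - EX_PAGE_SHIFT);
}
```

With these assumptions, a 1MB recordsize yields a minimum of 256 pages per RPC and a 512kB recordsize yields 128, which is why lowering the recordsize also lowers the smallest value the client accepts.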
| Comment by Erich Focht [ 06/Apr/17 ] |
|
Hi Jinshan, the issue we have is NEC-37. We cannot apply the workaround described in
Best regards
|