Description
LU-12158 introduced logic to limit stride and stripe_width values during filesystem creation, but also introduced two issues:
Issue 1: Incorrect units used for IO size checks
LU-12158 compares device-reported IO sizes (in bytes) directly against the hardcoded thresholds OPTIMIZED_STRIPE_WIDTH and OPTIMIZED_STRIDE. These thresholds are defined as raw integers that were meant as file system block counts and are never converted to bytes. As a result, the parameters are never set unless explicitly specified during mkfs. (Andreas also mentioned this unit mismatch in LU-18514.)
Details
During ext4/ldiskfs filesystem creation, misc/mke2fs.c retrieves:
- /sys/block/<device>/queue/minimum_io_size
- /sys/block/<device>/queue/optimal_io_size
These values are reported in bytes. For example, a typical RAID 6 (8+2) configuration might return:
minimum_io_size: 131072    # 128 KB
optimal_io_size: 1048576   # 1 MB (8 * 128 KB)
However, the stride and stripe_width parameters for ldiskfs are specified in file system blocks (typically 4 KB). Thus, the effective sizes tested in LU-12158 were:
512 blocks = 2 MB
1024 blocks = 4 MB
2048 blocks = 8 MB
4096 blocks = 16 MB
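The block-to-byte arithmetic above can be checked with a small C sketch. The helper name blocks_to_bytes is illustrative only and is not part of e2fsprogs; a 4 KB block size is assumed, as in the text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an e2fsprogs function): convert a count of
 * file system blocks into bytes for a given block size.
 */
uint64_t blocks_to_bytes(uint64_t blocks, uint32_t block_size)
{
    assert(block_size > 0);
    return blocks * (uint64_t)block_size;
}
```

With a 4096-byte block size, blocks_to_bytes(512, 4096) yields 2097152 bytes (2 MB), which is the limit LU-12158 intended.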
LU-12158 intended to limit stripe-related values to 2 MB, but mistakenly used raw values:
#define OPTIMIZED_STRIPE_WIDTH 512
#define OPTIMIZED_STRIDE 512
These values are compared directly to dev_param->{min,opt_io} sizes reported in bytes:
dev_param->min_io = blkid_topology_get_minimum_io_size(tp);
if (dev_param->min_io > OPTIMIZED_STRIDE) {
    fprintf(stdout,
            "detected raid stride %lu too large, use optimum %u\n",
            dev_param->min_io, OPTIMIZED_STRIDE);
    dev_param->min_io = OPTIMIZED_STRIDE;
}
dev_param->opt_io = blkid_topology_get_optimal_io_size(tp);
if (dev_param->opt_io > OPTIMIZED_STRIPE_WIDTH) {
    fprintf(stdout,
            "detected raid stripe width %lu too large, use optimum %u\n",
            dev_param->opt_io, OPTIMIZED_STRIPE_WIDTH);
    dev_param->opt_io = OPTIMIZED_STRIPE_WIDTH;
}
Since even modest IO sizes (e.g., 128 KB = 131072 bytes) exceed 512, the values are always clamped to 512 bytes. As a result, fs_param.s_raid_stride and fs_param.s_raid_stripe_width are never set, because of this code in misc/mke2fs.c:
/* setting stripe/stride to blocksize is pointless */
if (dev_param.min_io > (unsigned) blocksize)
    fs_param.s_raid_stride = dev_param.min_io / blocksize;
if (dev_param.opt_io > (unsigned) blocksize) {
    fs_param.s_raid_stripe_width = dev_param.opt_io / blocksize;
}
Because 512 < 4096, the condition fails, and no RAID parameters are applied unless explicitly specified via mkfs.
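The failure path can be reproduced in isolation with the sketch below. It is a simplified model and not the actual mke2fs code: clamp_io_size and raid_param_blocks are hypothetical helpers that mirror the two quoted snippets, and a return value of 0 stands for "parameter left unset".

```c
#include <assert.h>

/* Thresholds as defined by LU-12158 (compared against byte values). */
#define OPTIMIZED_STRIPE_WIDTH 512
#define OPTIMIZED_STRIDE 512

/* Hypothetical helper: models the clamp applied to the detected IO size. */
unsigned long clamp_io_size(unsigned long io_bytes, unsigned long limit)
{
    return io_bytes > limit ? limit : io_bytes;
}

/*
 * Hypothetical helper: models the mke2fs check. Returns the value in
 * file system blocks, or 0 when the parameter would not be set.
 */
unsigned long raid_param_blocks(unsigned long io_bytes, unsigned int blocksize)
{
    assert(blocksize > 0);
    if (io_bytes > blocksize)
        return io_bytes / blocksize;
    return 0;
}
```

For the RAID 6 (8+2) example, raid_param_blocks(clamp_io_size(131072, OPTIMIZED_STRIDE), 4096) returns 0, whereas without the clamp the stride would be 131072 / 4096 = 32 blocks.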
Issue 2: Misleading warnings even when custom values are used
The patch also introduces warnings:
detected raid stride %lu too large, use optimum %lu
detected raid stripe width %lu too large, use optimum %lu
These warnings appear unconditionally, even when custom values for stride and stripe_width are explicitly provided via mkfs. This is misleading, because in this case, the user-supplied values are used, not the "optimum" ones.
Resolution Plan
This LU will fix both issues:
- Issue 1: Correct the unit comparison by converting OPTIMIZED_STRIDE and OPTIMIZED_STRIPE_WIDTH to bytes (e.g., 512 * 4096 = 2 MB) to match the units of min_io and opt_io.
- Issue 2: Display the warning messages only when the parameters are being auto-detected, not when they are explicitly specified by the user. This will fix LU-18514.
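One possible shape for the combined fix is sketched below. This is not the actual patch: limit_io_size, its parameters, and the user_specified flag are assumptions made for illustration, and the sketch assumes the block size is already known when the limit is applied.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): clamp a device-reported IO
 * size, in bytes, to a limit given in file system blocks. The warning is
 * printed only when the value is auto-detected, i.e. when the user did
 * not specify stride/stripe_width explicitly.
 */
unsigned long limit_io_size(unsigned long io_bytes, unsigned long limit_blocks,
                            unsigned int blocksize, int user_specified,
                            const char *what)
{
    unsigned long limit_bytes;

    assert(blocksize > 0);
    /* Convert the limit to bytes so both sides use the same unit. */
    limit_bytes = limit_blocks * blocksize;

    if (io_bytes <= limit_bytes)
        return io_bytes;

    if (!user_specified)
        fprintf(stdout, "detected raid %s %lu too large, use optimum %lu\n",
                what, io_bytes, limit_bytes);
    return limit_bytes;
}
```

Under these assumptions, a detected minimum_io_size of 131072 bytes passes through unchanged with a 512-block limit and a 4096-byte block size, while a 4 MB value is clamped to 2097152 bytes, and the message is printed only when user_specified is 0.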