Details
Bug
Resolution: Fixed
Major
Lustre 2.12.1
None
RHEL 7.6
3
Description
See below for the stripe details of the file "mirror10". If OST idx 1 is unmounted and made unavailable, read performance drops to 1/10th of the expected performance. The client has to time out on OST idx 1 before it tries to read from OST idx 7. Because the block size in use is 1MB, this happens for every 1MB block, resulting in very poor performance.
$ lfs getstripe mirror10
mirror10
  lcm_layout_gen:    5
  lcm_mirror_count:  2
  lcm_entry_count:   2
    lcme_id:             65537
    lcme_mirror_id:      1
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  1
      lmm_stripe_size:   1048576
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 1
      lmm_pool:          01
      lmm_objects:
      - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x280a8:0x0] }

    lcme_id:             131074
    lcme_mirror_id:      2
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  1
      lmm_stripe_size:   1048576
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 7
      lmm_pool:          02
      lmm_objects:
      - 0: { l_ost_idx: 7, l_fid: [0x100070000:0x28066:0x0] }
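To make the cost of the per-block timeout concrete, here is a rough toy model in Python. It is not Lustre client code: the function name `read_time`, its parameters, and the timeout and block-read durations used in the usage note are all hypothetical, chosen only for illustration. The `remember_unhealthy` flag stands in for the general idea of not retrying a mirror already seen to be unavailable; it does not claim to reproduce what the actual patch does.

```python
def read_time(num_blocks, block_time, timeout, remember_unhealthy):
    """Total seconds to read num_blocks when the preferred mirror is down.

    All arguments are hypothetical model inputs, not Lustre tunables:
      block_time         -- seconds to read one block from the healthy mirror
      timeout            -- seconds spent waiting on the unavailable OST
      remember_unhealthy -- if True, the failed mirror is skipped after the
                            first timeout instead of being retried per block
    """
    total = 0.0
    preferred_known_bad = False
    for _ in range(num_blocks):
        if not preferred_known_bad:
            # The read is first sent to the unavailable OST and must time out.
            total += timeout
            if remember_unhealthy:
                preferred_known_bad = True
        # The block is then read from the healthy mirror.
        total += block_time
    return total
```

With a hypothetical 1 second per block and a 9 second timeout, `read_time(10, 1.0, 9.0, False)` gives 100.0 seconds while `read_time(10, 1.0, 9.0, True)` gives 19.0 seconds: paying the timeout on every block dominates the total, whereas paying it once makes it a fixed startup cost.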
The patch was reverted because it was causing frequent crashes in testing (LU-12525). The original patch https://review.whamcloud.com/34952 "LU-12328 flr: avoid reading unhealthy mirror" should fix the original problem, but it needs to be refreshed again.