Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Description
We are testing an FLR setup and found that lfs setstripe can get stuck.
After creating the FLR layout and running lfs mirror resync:
[root@bss022 test_file_replica]# lfs getstripe -v test-file1
test-file1
  composite_header:
    lcm_magic: 0x0BD60BD0
    lcm_size: 272
    lcm_flags: ro
    lcm_layout_gen: 11
    lcm_mirror_count: 2
    lcm_entry_count: 2
  components:
  - lcme_id: 65537
    lcme_mirror_id: 1
    lcme_flags: init
    lcme_extent.e_start: 0
    lcme_extent.e_end: EOF
    lcme_offset: 128
    lcme_size: 72
    sub_layout:
      lmm_magic: 0x0BD30BD0
      lmm_seq: 0xa80000d30
      lmm_object_id: 0x6
      lmm_fid: [0xa80000d30:0x6:0x0]
      lmm_stripe_count: 1
      lmm_stripe_size: 1048576
      lmm_pattern: raid0
      lmm_layout_gen: 0
      lmm_stripe_offset: 9
      lmm_pool: primary
      lmm_objects:
      - 0: { l_ost_idx: 9, l_fid: [0x440000419:0x17e2:0x0] }
  - lcme_id: 131074
    lcme_mirror_id: 2
    lcme_flags: init
    lcme_extent.e_start: 0
    lcme_extent.e_end: EOF
    lcme_offset: 200
    lcme_size: 72
    sub_layout:
      lmm_magic: 0x0BD30BD0
      lmm_seq: 0xa80000d30
      lmm_object_id: 0x6
      lmm_fid: [0xa80000d30:0x6:0x0]
      lmm_stripe_count: 1
      lmm_stripe_size: 1048576
      lmm_pattern: raid0
      lmm_layout_gen: 0
      lmm_stripe_offset: 13
      lmm_pool: secondary
      lmm_objects:
      - 0: { l_ost_idx: 13, l_fid: [0x2800000408:0x1802:0x0] }
After taking ost9 offline we are still able to read the file.
However, to be able to write to the file we need to set the prefer flag on the component of the other mirror:
lfs setstripe --comp-set -I 131074 --comp-flags=prefer test-file1
This command gets stuck, because lfs tries to flush the client cache to ost9, which is offline. The stack of the hung process:
[<ffffffffc1433f94>] osc_io_data_version_end+0x34/0x190 [osc]
[<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffc0e0a0bb>] lov_io_end_wrapper+0xdb/0xe0 [lov]
[<ffffffffc0e0ad38>] lov_io_data_version_end+0x78/0x1d0 [lov]
[<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffc0fc779a>] cl_io_loop+0xda/0x1c0 [obdclass]
[<ffffffffc1513bcb>] ll_ioc_data_version+0x20b/0x340 [lustre]
[<ffffffffc15283e0>] ll_file_ioctl+0x19d0/0x49f0 [lustre]
[<ffffffffb665d9e0>] do_vfs_ioctl+0x3a0/0x5a0
[<ffffffffb665dc81>] SyS_ioctl+0xa1/0xc0
[<ffffffffb6b8cede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
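For reference, the component ID passed to -I in the command above can be read off the lfs getstripe -v listing by looking for a component whose objects do not live on the offline OST. A minimal Python sketch of that lookup follows. It only scans the text for the lcme_id and l_ost_idx fields shown in the listing; the function names and the trimmed SAMPLE string are illustrative assumptions, not part of any Lustre tool.

```python
import re

# Trimmed excerpt of the `lfs getstripe -v` output from this report
# (only the fields the sketch needs are kept).
SAMPLE = (
    "components: - lcme_id: 65537 lcme_mirror_id: 1 lcme_flags: init "
    "lmm_objects: - 0: { l_ost_idx: 9, l_fid: [0x440000419:0x17e2:0x0] } "
    "- lcme_id: 131074 lcme_mirror_id: 2 lcme_flags: init "
    "lmm_objects: - 0: { l_ost_idx: 13, l_fid: [0x2800000408:0x1802:0x0] }"
)


def components_by_ost(getstripe_text):
    """Map each component ID to the set of OST indices holding its objects."""
    comps = {}
    current = None
    pattern = r"\b(lcme_id|l_ost_idx):\s*(\d+)"
    for key, value in re.findall(pattern, getstripe_text):
        if key == "lcme_id":
            # A new component starts; later l_ost_idx entries belong to it.
            current = int(value)
            comps[current] = set()
        elif current is not None:
            comps[current].add(int(value))
    return comps


def usable_components(getstripe_text, offline_osts):
    """Return IDs of components that have objects, none on an offline OST."""
    offline = set(offline_osts)
    return [
        cid
        for cid, osts in components_by_ost(getstripe_text).items()
        if osts and not (osts & offline)
    ]


if __name__ == "__main__":
    print(usable_components(SAMPLE, offline_osts={9}))
```

With ost9 offline, this yields component 131074, which matches the ID used in the lfs setstripe command above.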
I don't think the user should have to set the preferred mirror when writing to an FLR file with a failed OST. Definitely the MDS should automatically pick a mirror that is not missing objects to avoid this problem.
In some cases, there may be a race condition where an OST goes offline right after the MDS selected it for a mirror, but I don't think that applies here. If a user has noticed the problem and has had time to run "lfs setstripe", then the MDS has had plenty of time to detect the problem itself and skip the mirror with that OST.
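The behaviour requested here can be illustrated with a small Python sketch: when choosing a mirror for writing, skip any mirror that is stale or has an object on an OST known to be offline, and only then fall back to the prefer flag as a tie-breaker. This is not the actual MDS code; the Mirror class, its fields, and pick_write_mirror are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    """Hypothetical, simplified view of one FLR mirror."""
    mirror_id: int
    ost_indices: set = field(default_factory=set)
    stale: bool = False
    prefer: bool = False


def pick_write_mirror(mirrors, offline_osts):
    """Pick a mirror for writing, skipping mirrors on offline OSTs.

    Returns None when no mirror is fully reachable.
    """
    offline = set(offline_osts)
    candidates = [
        m for m in mirrors
        if not m.stale and not (m.ost_indices & offline)
    ]
    if not candidates:
        return None
    # Preferred mirrors sort first (False < True), then lowest mirror ID.
    return min(candidates, key=lambda m: (not m.prefer, m.mirror_id))


if __name__ == "__main__":
    # The two mirrors from this report: mirror 1 on ost9, mirror 2 on ost13.
    mirrors = [Mirror(1, {9}), Mirror(2, {13})]
    chosen = pick_write_mirror(mirrors, offline_osts={9})
    print(chosen.mirror_id if chosen else None)
```

In this sketch the offline check takes priority over the prefer flag, so a user would not need to change flags by hand when the preferred mirror's OST is down.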