[LU-12328] FLR mirroring on 2.12.1-1 not usable if OST is down

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.4
    • Affects Version/s: Lustre 2.12.1
    • Labels: None
    • Environment: RHEL 7.6
    • Severity: 3

    Description

      See below for the stripe details of the file "mirror10". If OST idx 1 is unmounted and made unavailable, read performance drops to 1/10th of the expected level. The client has to time out on OST idx 1 before it tries to read from OST idx 7. Because the block size in use is 1MB, this happens for every 1MB block, resulting in very poor performance.
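
To illustrate why a timeout on every block is so costly, here is a toy cost model. It is only a sketch and not Lustre client code: the function name, the parameters, and the numbers (100 blocks, a 5000 ms timeout, 10 ms per block read) are all hypothetical and chosen just to make the arithmetic easy to follow.

```python
# Toy cost model only; this is not Lustre client code.
def total_read_ms(num_blocks, timeout_ms, block_read_ms, remember_unhealthy):
    """Total time to read num_blocks when the first-choice mirror's OST is down."""
    total = 0
    known_bad = False  # has the client already seen the first mirror fail?
    for _ in range(num_blocks):
        if not (remember_unhealthy and known_bad):
            total += timeout_ms  # wait for the unavailable OST to time out
            known_bad = True
        total += block_read_ms  # then read the block from the healthy mirror
    return total


# Hypothetical numbers: 100 blocks, 5000 ms timeout, 10 ms per block read.
print(total_read_ms(100, 5000, 10, remember_unhealthy=False))  # 501000
print(total_read_ms(100, 5000, 10, remember_unhealthy=True))   # 6000
```

In this model, paying the timeout once instead of once per block removes almost all of the overhead, which is the behaviour one would hope for when one mirror is unavailable.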

      $ lfs getstripe mirror10
      mirror10
       lcm_layout_gen: 5
       lcm_mirror_count: 2
       lcm_entry_count: 2
       lcme_id: 65537
       lcme_mirror_id: 1
       lcme_flags: init
       lcme_extent.e_start: 0
       lcme_extent.e_end: EOF
       lmm_stripe_count: 1
       lmm_stripe_size: 1048576
       lmm_pattern: raid0
       lmm_layout_gen: 0
       lmm_stripe_offset: 1
       lmm_pool: 01
       lmm_objects:
       - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x280a8:0x0] }
      
      lcme_id: 131074
       lcme_mirror_id: 2
       lcme_flags: init
       lcme_extent.e_start: 0
       lcme_extent.e_end: EOF
       lmm_stripe_count: 1
       lmm_stripe_size: 1048576
       lmm_pattern: raid0
       lmm_layout_gen: 0
       lmm_stripe_offset: 7
       lmm_pool: 02
       lmm_objects:
       - 0: { l_ost_idx: 7, l_fid: [0x100070000:0x28066:0x0] }
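
For reference, the mirror-to-OST mapping can be pulled out of output like the above with a few lines of Python. This is a minimal sketch that assumes the plain-text `lfs getstripe` format shown here; the helper name `mirror_osts` is made up for illustration, and other Lustre versions or output options may print the fields differently.

```python
import re


def mirror_osts(getstripe_text):
    """Map each lcme_mirror_id to the OST indices listed under it."""
    mirrors = {}
    current = None  # mirror id of the component being read
    for line in getstripe_text.splitlines():
        m = re.search(r"lcme_mirror_id:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            mirrors.setdefault(current, [])
            continue
        m = re.search(r"l_ost_idx:\s*(\d+)", line)
        if m and current is not None:
            mirrors[current].append(int(m.group(1)))
    return mirrors


# Abbreviated copy of the output above.
sample = """\
lcme_mirror_id: 1
lmm_objects:
- 0: { l_ost_idx: 1, l_fid: [0x100010000:0x280a8:0x0] }
lcme_mirror_id: 2
lmm_objects:
- 0: { l_ost_idx: 7, l_fid: [0x100070000:0x28066:0x0] }
"""
print(mirror_osts(sample))  # {1: [1], 2: [7]}
```

For "mirror10" this confirms that mirror 1 lives on OST idx 1 and mirror 2 on OST idx 7, so only mirror 2 is readable while OST idx 1 is down.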
      

          Activity

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36550/
            Subject: LU-12328 flr: avoid reading unhealthy mirror
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 02affb11d4162f23eadef7e0ed15982e11005a41

            gerrit Gerrit Updater added a comment -

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36550
            Subject: LU-12328 flr: avoid reading unhealthy mirror
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 95b0b3d9aa1d0de120e788eef96d4a1f42ff9d6c

            pjones Peter Jones added a comment -

            Landed for 2.13

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34952/
            Subject: LU-12328 flr: avoid reading unhealthy mirror
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 39da3c06275e04e2a6e7f055cb27ee9dff1ea576

            raot Joe Frith added a comment -

            I did a quick test and the patch seems to work: read performance is as expected with an unhealthy mirror. I did not, however, run comprehensive tests.

            adilger Andreas Dilger added a comment -

            Tejas, I don't think this patch will make it into 2.12.3, because it hasn't yet landed to master.

            However, I think the current version of the patch is in good enough shape for you to test. It would be useful if you could give the latest patch a try and let us know if this is working for you.

            raot Joe Frith added a comment -

            Any chance this will get included in 2.12.3? We are stuck and cannot migrate due to this issue.

            adilger Andreas Dilger added a comment -

            The patch was reverted because it was causing frequent crashes in testing (LU-12525).

            The original patch https://review.whamcloud.com/34952 "LU-12328 flr: avoid reading unhealthy mirror" should fix the original problem, but it needs to be refreshed again.

            raot Joe Frith added a comment -

            Did the patch get reverted after being included in master?

            We are still hoping that this issue gets resolved so we can go ahead with the maintenance.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35450/
            Subject: Revert "LU-12328 flr: preserve last read mirror"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0a8750628d9a87f686b917c88e42093a52a78ae3

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: raot Joe Frith
              Votes: 0
              Watchers: 8
