
replay-dual test_20: FAIL: recovery time is growing 215 > 107

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Lustre 2.7.0, Lustre 2.5.4
    • None
    • Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/100/
      Distro/Arch: RHEL6.5/x86_64
      MDSCOUNT=2
    • 3
    • 16488

    Description

      replay-dual test_20 failed as follows:

      Starting client: onyx-31vm6.onyx.hpdd.intel.com: -o user_xattr,flock onyx-31vm3@tcp:/lustre /mnt/lustre2
      CMD: onyx-31vm6.onyx.hpdd.intel.com mkdir -p /mnt/lustre2
      CMD: onyx-31vm6.onyx.hpdd.intel.com mount -t lustre -o user_xattr,flock onyx-31vm3@tcp:/lustre /mnt/lustre2
       replay-dual test_20: @@@@@@ FAIL: recovery time is growing 215 > 107 
      

      Maloo report: https://testing.hpdd.intel.com/test_sets/c448fd34-68a4-11e4-a63a-5254006e85c2

          Activity

            pjones Peter Jones added a comment -

            As per Yu Jian, this can be closed as a duplicate of LU-5079.

            yujian Jian Yu added a comment -

            Hi Tappro,

            I saw that replay-dual test 20 was added by you:

            commit e94350fb29ff57e72de8b03aebcabafb56b2b722
            Author: tappro <tappro>
            Date:   Mon Nov 3 13:34:52 2008 +0000
            
                - fix recovery time growing
                  b:16389
                  i:rread,nathan
            

            The comparison code in the test is as follows:

                [ $TIER2 -ge $((TIER1 * 2)) ] && \
                    error "recovery time is growing $TIER2 > $TIER1"
            

            The current error messages are:

            recovery time is growing 208 > 102
            recovery time is growing 216 > 106
            recovery time is growing 181 > 90
            recovery time is growing 207 > 103
            recovery time is growing 247 > 122
            recovery time is growing 217 > 106
            recovery time is growing 208 > 102
            recovery time is growing 177 > 88
            recovery time is growing 218 > 109
            recovery time is growing 214 > 103
            recovery time is growing 215 > 107
            recovery time is growing 217 > 108
            recovery time is growing 180 > 90
            

            Was there a specific rule behind choosing $((TIER1 * 2)) as the limit? Can we increase this value?
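
To make the question concrete, here is a minimal sketch that reruns the same comparison on the TIER2/TIER1 pairs quoted in the error messages and prints how far each TIER2 lands above the 2 * TIER1 limit. The helper name margin_over_limit is hypothetical and is not part of the Lustre test framework; the numbers are copied from the messages above.

```sh
#!/bin/sh
# Illustrative only: margin_over_limit is a hypothetical helper,
# not a function from the Lustre test framework.

# Print TIER2 - (2 * TIER1). A result >= 0 means the check
#   [ $TIER2 -ge $((TIER1 * 2)) ]
# in test_20 is true and error() would be called.
margin_over_limit() {
    echo $(( $1 - $2 * 2 ))
}

# "TIER2 TIER1" pairs copied from the reported error messages.
while read -r tier2 tier1; do
    echo "TIER1=$tier1 limit=$((tier1 * 2)) TIER2=$tier2 margin=$(margin_over_limit "$tier2" "$tier1")"
done <<EOF
208 102
216 106
181 90
207 103
247 122
217 106
208 102
177 88
218 109
214 103
215 107
217 108
180 90
EOF
```

For these 13 pairs the margin is between 0 and 8, so every reported TIER2 sits at or only slightly above twice TIER1 rather than far beyond the limit.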

            yujian Jian Yu added a comment -

            It was the patches http://review.whamcloud.com/11213 (master) and http://review.whamcloud.com/12365 (b2_5) for LU-5079 that caused the regressions.

            yujian Jian Yu added a comment -

            The same regression failure also occurred on the master branch:
            https://testing.hpdd.intel.com/test_sets/70704a84-6721-11e4-987b-5254006e85c2

            yujian Jian Yu added a comment -

            This is a regression failure introduced by Lustre b2_5 build #100.

            Here is a for-test-only patch trying to reproduce the failure on Lustre b2_5 build #100: http://review.whamcloud.com/12669


            People

              wc-triage WC Triage
              yujian Jian Yu