
lustre-rsync-test test_2b: old lustre_rsync does not work with new llog_changelog_ext_rec remove changelog

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.4.0, Lustre 2.6.0
    • Lustre 2.3.0, Lustre 2.1.2, Lustre 2.4.1, Lustre 2.5.0, Lustre 2.5.1
    • 3
    • 4107

    Description

      This issue was created by maloo for yujian <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/eb3d7ed4-ab13-11e1-8e7f-52540035b04c.

      The sub-test test_2b failed with the following error:

      Only in /mnt/lustre/d0.lustre-rsync-test/d2/clients/client1/~dmtmp/PM: PMD394.TMP
      lustre-rsync-test test_2b: @@@@@@ FAIL: Failure in replication; differences found.
      test failed to respond and timed out

      Info required for matching: lustre-rsync-test 2b

          Activity

            [LU-1458] lustre-rsync-test test_2b: old lustre_rsync does not work with new llog_changelog_ext_rec remove changelog
            bfaccini Bruno Faccini (Inactive) added a comment - +1 on the b2_5 branch: https://maloo.whamcloud.com/test_sessions/26ab637c-9b91-11e3-95f0-52540035b04c
            yujian Jian Yu added a comment (edited) - More instances on the Lustre b2_5 branch: https://maloo.whamcloud.com/test_sets/2d4b5c08-89b9-11e3-ae0e-52540035b04c https://maloo.whamcloud.com/test_sets/eec3c6d4-96c2-11e3-b941-52540035b04c
            bogl Bob Glossman (Inactive) added a comment - An instance in master: https://maloo.whamcloud.com/test_sets/736be8cc-7dc2-11e3-bfda-52540035b04c
            yujian Jian Yu added a comment - An instance on the Lustre b2_5 branch: https://maloo.whamcloud.com/test_sets/40cbb89e-7696-11e3-8c14-52540035b04c
            yujian Jian Yu added a comment - Another instance on the Lustre b2_4 branch: https://maloo.whamcloud.com/test_sets/ed719c42-6356-11e3-8c76-52540035b04c
            yujian Jian Yu added a comment - Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/59/; Distro/Arch: RHEL6.4/x86_64. The same failure occurred: https://maloo.whamcloud.com/test_sets/2375f856-5817-11e3-b8c3-52540035b04c

            adilger Andreas Dilger added a comment - It seems this bug has been subverted from its original purpose of tracking a 2.1/2.4 interop problem into something unrelated that also causes test_2b to fail (dbench not starting quickly enough). It would be better to fix that problem in a separate bug, so that when the patch lands that bug can be closed, and this one is not closed.

            utopiabound Nathaniel Clark added a comment - The spate of ZFS failures seems to be related to dbench not having started within the 20s allowed at the beginning of test 2b. Here is a patch to wait longer if necessary: http://review.whamcloud.com/7914
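
The idea of waiting longer for dbench to start can be illustrated with a small polling loop. This is only a sketch under stated assumptions, not the code from the patch above: `wait_until` is a hypothetical helper name, and the 60-second limit in the usage note is an arbitrary example value.

```shell
# Hypothetical helper, not the code from the patch: run a check command once
# per second until it succeeds or max_wait seconds have elapsed.
wait_until() {
    local max_wait=$1
    shift
    local waited=0

    # "$@" is the check command; its output is discarded, only its exit
    # status matters.
    until "$@" > /dev/null 2>&1; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "gave up after ${max_wait}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A test could then call something like `wait_until 60 pgrep -x dbench` and fail only if the process never shows up, rather than relying on a fixed 20s window. Polling returns as soon as the condition holds, so a longer limit costs nothing on fast back ends.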
            bogl Bob Glossman (Inactive) added a comment - another https://maloo.whamcloud.com/test_sets/7c10d8f4-1e02-11e3-b42b-52540035b04c
            yujian Jian Yu added a comment - Lustre client: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1); Lustre server: http://build.whamcloud.com/job/lustre-b2_3/41/ (2.3.0). lustre-rsync-test test 2b failed: https://maloo.whamcloud.com/test_sets/59afba4a-1502-11e3-ba63-52540035b04c
            bfaccini Bruno Faccini (Inactive) added a comment (edited) - +1 at https://maloo.whamcloud.com/test_sets/35af817e-0f54-11e3-9bce-52540035b04c. It shows the same one-entry difference between the number of Changelog entries reported during the test and the number in the content gathered for the test. The missing entry is the CREATE action for the file that is finally reported as missing during the check. I wonder whether this is a problem with the Changelog sync itself, or a timing issue where the background dbench has not really stopped before lustre_rsync starts.
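
The timing question raised above (whether the background dbench has really stopped before lustre_rsync starts) comes down to reaping the background job before moving on. The following is a minimal sketch under assumptions, not taken from the test script: `stop_and_wait` is a hypothetical helper, and it assumes the workload was started with `&` from the same shell, so that `wait` can block on it.

```shell
# Hypothetical helper: ask a background child of this shell to stop, then
# block until it has really exited before the caller continues.
stop_and_wait() {
    local pid=$1

    # Request termination; ignore the error if the process is already gone.
    kill -TERM "$pid" 2> /dev/null || true
    # wait blocks until the child has exited; its exit status reflects the
    # signal, so it is deliberately ignored here.
    wait "$pid" 2> /dev/null || true

    # kill -0 sends no signal; it only checks whether the PID still exists.
    if kill -0 "$pid" 2> /dev/null; then
        return 1
    fi
    return 0
}
```

With a helper like this, a test would record `$!` when launching the workload, call `stop_and_wait` on that PID, and only then start the replication step, so that no file can be created after the Changelog has been read. Note that this covers only the process started directly by the shell; any processes it forked would need separate handling.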

            People

              bobijam Zhenyu Xu
              maloo Maloo
              Votes: 0
              Watchers: 11
