
replay-single test_30: multiop 20786 failed

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 3

    Description

      This issue was created by maloo for John Hammond <john.hammond@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/3f0a5c78-27ba-11e5-b37a-5254006e85c2.

      The sub-test test_30 failed with the following error:

      multiop 20786 failed
      

      Please provide additional information about the failure here.

      Info required for matching: replay-single 30

          Activity

            [LU-6841] replay-single test_30: multiop 20786 failed

            hongchao.zhang Hongchao Zhang added a comment -

            Closed as duplicate of LU-5951.

            hongchao.zhang Hongchao Zhang added a comment -

            This issue could be marked as a duplicate of LU-5951.

            gerrit Gerrit Updater added a comment -

            Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: http://review.whamcloud.com/17303
            Subject: LU-6841 target: check the last reply for resent
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 21583ba45fefda43841c36ab3609e529dbadda1d

            hongchao.zhang Hongchao Zhang added a comment -

            Status update:
            According to the logs, this issue is caused by a replay request that is resent during recovery: the MDT does not find
            the reply corresponding to the resent request. It could be related to the multiple slots in last_rcvd. I am still analyzing
            the related code to find where the problem is, and will update the status once there is any progress.
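
The failure mode described in the status update above can be illustrated with a small model. This is a hypothetical sketch, not Lustre code: `SavedReply`, `ReplyCache`, and `handle_request` are names made up for illustration, and it assumes that a server keeps several saved replies per client, keyed by a request identifier (called `xid` here), so that a resent request can be answered from the saved reply instead of being executed a second time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class SavedReply:
    xid: int       # request identifier chosen by the client (assumed key)
    transno: int   # transaction number assigned when the request ran
    result: int    # return code that was sent back to the client


@dataclass
class ReplyCache:
    """Hypothetical per-client store with one slot per saved reply."""
    replies: Dict[int, SavedReply] = field(default_factory=dict)

    def record_reply(self, reply: SavedReply) -> None:
        self.replies[reply.xid] = reply

    def lookup(self, xid: int) -> Optional[SavedReply]:
        return self.replies.get(xid)


def handle_request(
    cache: ReplyCache,
    xid: int,
    resent: bool,
    execute: Callable[[], Tuple[int, int]],
) -> Tuple[SavedReply, bool]:
    """Return (reply, reconstructed).

    For a resent request, try to answer from the saved reply first.
    If no saved reply is found, this sketch simply executes the
    request again; a real server may handle that case differently.
    """
    if resent:
        saved = cache.lookup(xid)
        if saved is not None:
            return saved, True  # reply rebuilt, request not re-executed
    transno, result = execute()
    reply = SavedReply(xid=xid, transno=transno, result=result)
    cache.record_reply(reply)
    return reply, False
```

In this model, the situation reported in the logs corresponds to the `lookup` call returning `None` for a request that is flagged as resent. The sketch then falls through and runs the request a second time, which is only meant to show why a missed lookup matters, not what the MDT actually does.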
            utopiabound Nathaniel Clark added a comment - edited

            Just encountered this on master, but in test replay-single/test_31:
            https://testing.hpdd.intel.com/test_sets/7708f4da-7e14-11e5-9c23-5254006e85c2
            pjones Peter Jones added a comment -

            Let's close the ticket as "cannot reproduce" and reopen it if we do see it again.

            jgmitter Joseph Gmitter (Inactive) added a comment -

            This issue has not been seen on any branch since 7/24 and may have been resolved by related landings. We are going to reduce the severity and leave it open.

            pichong Gregoire Pichon added a comment -

            I don't remember having seen this test fail during my local testing.

            However, I have to admit that the kernel I use does not include the dev_read_only patch, which means the MDT device cannot be made read-only by replay_barrier. This means the test environment is not exactly the same.

            adilger Andreas Dilger added a comment -

            Hi Gregoire, have you seen any problems similar to this in your testing?

            gerrit Gerrit Updater added a comment -

            Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: http://review.whamcloud.com/15814
            Subject: LU-6841 target: debug patch to collect more logs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d8fa2f729b6dd8fa5c9ae604308f39306ddd6aa9

            People

              hongchao.zhang Hongchao Zhang
              maloo Maloo
              Votes: 0
              Watchers: 11
