
replay-single test 70b failure: 'rundbench load on * failed!'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0
    • Labels: None
    • Environment: review-dne-part-2 test group
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      replay-single test 70b fails on rename in review-dne-part-2 test sessions. Logs are at:

      2015-07-12 00:40:18 - https://testing.hpdd.intel.com/test_sets/4d9d6272-2877-11e5-8d7f-5254006e85c2
      2015-07-13 07:41:04 - https://testing.hpdd.intel.com/test_sets/21c48224-2977-11e5-a9c5-5254006e85c2
      2015-07-13 12:08:49 - https://testing.hpdd.intel.com/test_sets/7a85b58a-29aa-11e5-b07d-5254006e85c2

      Although this failure looks like LU-4439, the error message in the client test log is different:

      onyx-30vm5: [4766] rename ./clients/client0/~dmtmp/WORD/~WRD3497.TMP ./clients/client0/~dmtmp/WORD/TIPS.DOC failed (No such file or directory) - expected NT_STATUS_OK
      onyx-30vm5: ERROR: child 0 failed at line 4766
      onyx-30vm5: Child failed with status 1
      onyx-30vm5: status        script            Total(sec) E(xcluded) S(low) 
      onyx-30vm5: ------------------------------------------------------------------------------------
      onyx-30vm5: 
      onyx-30vm5: touch: missing file operand
      onyx-30vm5: Try `touch --help' for more information.
      onyx-30vm5: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6:    1      4685     0.19 MB/sec  execute  82 sec  latency 22735.691 ms
      onyx-30vm6:    1      5047     0.23 MB/sec  execute  83 sec  latency 559.323 ms
      CMD: onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com killall -0 dbench
      onyx-30vm5: dbench: no process killed
       replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com! 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
        = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
        = /usr/lib64/lustre/tests/replay-single.sh:2099:test_70b()
        = /usr/lib64/lustre/tests/test-framework.sh:5020:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5057:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4907:run_test()
        = /usr/lib64/lustre/tests/replay-single.sh:2101:main()
      Dumping lctl log to /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.*.1436664129.log
      CMD: onyx-30vm3,onyx-30vm4,onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com,onyx-30vm7 /usr/sbin/lctl dk > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.debug_log.\$(hostname -s).1436664129.log;
               dmesg > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.dmesg.\$(hostname -s).1436664129.log
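
      The `killall -0 dbench` step in the log above probes for a live dbench process without actually signalling it, and "no process killed" is what leads to the "dbench stopped" failure. A minimal sketch of that kind of liveness check follows. It is an illustration only: it uses `kill -0` on a PID instead of `killall` by name so that it is self-contained, and the helper `is_running` and the `sleep` stand-in for the dbench load are hypothetical names, not taken from test-framework.sh.

```shell
#!/bin/bash
# Hypothetical helper (not from test-framework.sh): succeed if a process with
# the given PID exists. Signal 0 delivers nothing; it only checks that the
# target exists and that we are allowed to signal it.
is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &                      # stand-in for a long-running dbench load
load_pid=$!

if is_running "$load_pid"; then
    echo "load still running"
fi

kill "$load_pid"                # simulate the load dying early
wait "$load_pid" 2>/dev/null    # reap the child so the PID is really gone

if ! is_running "$load_pid"; then
    # Roughly the condition the log reports as "dbench stopped on some of ..."
    echo "load stopped"
fi
```

      In the failing run the probe found no dbench process on onyx-30vm5, which is consistent with the rename error having ended the load there before the test finished.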
      

      Info required for matching: replay-single 70b

          Activity

            [LU-6844] replay-single test 70b failure: 'rundbench load on * failed!'
            pjones Peter Jones added a comment -

            Landed for 2.9

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21508/
            Subject: LU-6844 tests: re-enable striped dir
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c34013e21ae4c14cc2eac5ef58c20fee0124e51d

            jgmitter Joseph Gmitter (Inactive) added a comment -

            The above patch from LU-7117 has landed; the only remaining work here is re-enabling the test in patch http://review.whamcloud.com/#/c/21508/

            jgmitter Joseph Gmitter (Inactive) added a comment - edited

            For tracking purposes, the patch remaining to be landed here for the fix is from LU-7117: http://review.whamcloud.com/#/c/20940/

            The patch re-enabling the test is: http://review.whamcloud.com/#/c/21508/

            gerrit Gerrit Updater added a comment -

            wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/21508
            Subject: LU-6844 tests: re-enable striped dir
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2a35e51dd81cd72d166c1cf14d6a6ebe43a973ef

            di.wang Di Wang (Inactive) added a comment - edited

            According to the test results, it looks like patches 20940 and 21088 fix LU-6844. I will then make a patch to revert http://review.whamcloud.com/20022.

            di.wang Di Wang (Inactive) added a comment -

            I pushed a patch http://review.whamcloud.com/19489 to see if http://review.whamcloud.com/#/c/20940/ and http://review.whamcloud.com/#/c/21088/ can fix LU-6844.

            di.wang Di Wang (Inactive) added a comment -

            This looks quite similar to LU-7117. I will see if they are related.

            adilger Andreas Dilger added a comment -

            Note this bug should not be closed because of the above patch landing, which only changed the test to run on a single MDS.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20022/
            Subject: LU-6844 tests: disable DNE testing of dbench
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ed30857c852f7cdb0a29e25a2ddb030f76f5c16b

            People

              Assignee: di.wang Di Wang (Inactive)
              Reporter: jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 18
