[LU-6844] replay-single test 70b failure: 'rundbench load on * failed!'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0
    • Labels: None
    • Environment: review-dne-part-2 test group
    • Severity: 3

    Description

      replay-single test 70b fails on rename in review-dne-part-2 test sessions. Logs are at:

      2015-07-12 00:40:18 - https://testing.hpdd.intel.com/test_sets/4d9d6272-2877-11e5-8d7f-5254006e85c2
      2015-07-13 07:41:04 - https://testing.hpdd.intel.com/test_sets/21c48224-2977-11e5-a9c5-5254006e85c2
      2015-07-13 12:08:49 - https://testing.hpdd.intel.com/test_sets/7a85b58a-29aa-11e5-b07d-5254006e85c2

      Although this failure looks like LU-4439, the error message in the client test log is different:

      onyx-30vm5: [4766] rename ./clients/client0/~dmtmp/WORD/~WRD3497.TMP ./clients/client0/~dmtmp/WORD/TIPS.DOC failed (No such file or directory) - expected NT_STATUS_OK
      onyx-30vm5: ERROR: child 0 failed at line 4766
      onyx-30vm5: Child failed with status 1
      onyx-30vm5: status        script            Total(sec) E(xcluded) S(low) 
      onyx-30vm5: ------------------------------------------------------------------------------------
      onyx-30vm5: 
      onyx-30vm5: touch: missing file operand
      onyx-30vm5: Try `touch --help' for more information.
      onyx-30vm5: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6:    1      4685     0.19 MB/sec  execute  82 sec  latency 22735.691 ms
      onyx-30vm6:    1      5047     0.23 MB/sec  execute  83 sec  latency 559.323 ms
      CMD: onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com killall -0 dbench
      onyx-30vm5: dbench: no process killed
       replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com! 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
        = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
        = /usr/lib64/lustre/tests/replay-single.sh:2099:test_70b()
        = /usr/lib64/lustre/tests/test-framework.sh:5020:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5057:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4907:run_test()
        = /usr/lib64/lustre/tests/replay-single.sh:2101:main()
      Dumping lctl log to /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.*.1436664129.log
      CMD: onyx-30vm3,onyx-30vm4,onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com,onyx-30vm7 /usr/sbin/lctl dk > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.debug_log.\$(hostname -s).1436664129.log;
               dmesg > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.dmesg.\$(hostname -s).1436664129.log
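
      When triaging similar test sessions, it can help to check a client log for the rename failure quoted above before attributing the failure to this ticket. The following is a minimal sketch; the function name has_rename_enoent is hypothetical and not part of test-framework.sh, and it only matches the message seen in this ticket (the LU-4439 message is not quoted here, so it is not checked).

```shell
#!/bin/bash
# Hypothetical helper (not part of test-framework.sh): succeed if the given
# log file contains the dbench rename failure message quoted in this ticket.
has_rename_enoent() {
    local log=$1
    # Match lines such as:
    #   [4766] rename <src> <dst> failed (No such file or directory) - expected NT_STATUS_OK
    grep -Eq 'rename .* failed \(No such file or directory\)' "$log"
}

# Example usage with a line copied from the client log above.
sample=$(mktemp)
printf '%s\n' 'onyx-30vm5: [4766] rename ./clients/client0/~dmtmp/WORD/~WRD3497.TMP ./clients/client0/~dmtmp/WORD/TIPS.DOC failed (No such file or directory) - expected NT_STATUS_OK' > "$sample"
if has_rename_enoent "$sample"; then
    echo "rename ENOENT signature found"
fi
rm -f "$sample"
```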
      

      Info required for matching: replay-single 70b
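
      For readers unfamiliar with the `killall -0 dbench` check in the log: signal 0 delivers nothing and only reports whether a matching process exists, so "dbench: no process killed" on onyx-30vm5 means dbench had already exited there. The sketch below shows the same idea with `kill -0` on a single PID. The name is_running and the use of sleep as a stand-in load are assumptions for illustration, not the test-framework.sh implementation.

```shell
#!/bin/bash
# Sketch only: is_running is a hypothetical name, not a test-framework.sh function.
# Signal 0 sends nothing; kill just reports whether the PID can be signalled.
is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &                 # stand-in for a long-running load such as dbench
pid=$!
is_running "$pid" && echo "load still running"

kill "$pid"
wait "$pid" 2>/dev/null    # reap the child so the PID no longer exists
is_running "$pid" || echo "load stopped"
```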

          Activity

            jgmitter Joseph Gmitter (Inactive) added a comment - edited

            For tracking purposes, the patch remaining to be landed here for the fix is from LU-7117: http://review.whamcloud.com/#/c/20940/

            The patch re-enabling the test is: http://review.whamcloud.com/#/c/21508/

            gerrit Gerrit Updater added a comment

            wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/21508
            Subject: LU-6844 tests: re-enable striped dir
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2a35e51dd81cd72d166c1cf14d6a6ebe43a973ef

            di.wang Di Wang (Inactive) added a comment - edited

            According to the test, it looks like 20940 and 21088 can fix LU-6844. I will then make a patch to revert http://review.whamcloud.com/20022.

            di.wang Di Wang (Inactive) added a comment

            I pushed a patch, http://review.whamcloud.com/19489, to see if http://review.whamcloud.com/#/c/20940/ and http://review.whamcloud.com/#/c/21088/ can fix LU-6844.

            di.wang Di Wang (Inactive) added a comment

            This looks quite similar to LU-7117. I will see if they are related.

            adilger Andreas Dilger added a comment

            Note that this bug should not be closed because of the above patch landing, which only changed the test to run on a single MDS.

            gerrit Gerrit Updater added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20022/
            Subject: LU-6844 tests: disable DNE testing of dbench
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ed30857c852f7cdb0a29e25a2ddb030f76f5c16b

            jhammond John Hammond added a comment

            https://testing.hpdd.intel.com/test_sets/cdbfbfcc-1386-11e6-855a-5254006e85c2

            gerrit Gerrit Updater added a comment

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/20022
            Subject: LU-6844 tests: disable DNE testing of dbench
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 52eda32e8d21bc1a58ca7d96db6e10e0a325d216

            emoly.liu Emoly Liu added a comment

            Another failure on master: https://testing.hpdd.intel.com/test_sets/da1ade02-0df8-11e6-855a-5254006e85c2

            adilger Andreas Dilger added a comment

            This is failing fairly frequently in testing. It would be good to make some progress with the debugging patch, or an actual fix.

            People

              Assignee: di.wang Di Wang (Inactive)
              Reporter: jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 18
