Lustre / LU-6844

replay-single test 70b failure: 'rundbench load on * failed!'


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0, Lustre 2.9.0
    • Labels: None
    • Environment: review-dne-part-2 test group
    • Severity: 3

    Description

      replay-single test 70b fails on rename in review-dne-part-2 test sessions. Logs are at:

      2015-07-12 00:40:18 - https://testing.hpdd.intel.com/test_sets/4d9d6272-2877-11e5-8d7f-5254006e85c2
      2015-07-13 07:41:04 - https://testing.hpdd.intel.com/test_sets/21c48224-2977-11e5-a9c5-5254006e85c2
      2015-07-13 12:08:49 - https://testing.hpdd.intel.com/test_sets/7a85b58a-29aa-11e5-b07d-5254006e85c2

      Although this failure looks like LU-4439, the error message in the client test log is different:

      onyx-30vm5: [4766] rename ./clients/client0/~dmtmp/WORD/~WRD3497.TMP ./clients/client0/~dmtmp/WORD/TIPS.DOC failed (No such file or directory) - expected NT_STATUS_OK
      onyx-30vm5: ERROR: child 0 failed at line 4766
      onyx-30vm5: Child failed with status 1
      onyx-30vm5: status        script            Total(sec) E(xcluded) S(low) 
      onyx-30vm5: ------------------------------------------------------------------------------------
      onyx-30vm5: 
      onyx-30vm5: touch: missing file operand
      onyx-30vm5: Try `touch --help' for more information.
      onyx-30vm5: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6: mdc.lustre-MDT0002-mdc-*.mds_server_uuid in FULL state after 4 sec
      onyx-30vm6:    1      4685     0.19 MB/sec  execute  82 sec  latency 22735.691 ms
      onyx-30vm6:    1      5047     0.23 MB/sec  execute  83 sec  latency 559.323 ms
      CMD: onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com killall -0 dbench
      onyx-30vm5: dbench: no process killed
       replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com! 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
        = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
        = /usr/lib64/lustre/tests/replay-single.sh:2099:test_70b()
        = /usr/lib64/lustre/tests/test-framework.sh:5020:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5057:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4907:run_test()
        = /usr/lib64/lustre/tests/replay-single.sh:2101:main()
      Dumping lctl log to /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.*.1436664129.log
      CMD: onyx-30vm3,onyx-30vm4,onyx-30vm5,onyx-30vm6.onyx.hpdd.intel.com,onyx-30vm7 /usr/sbin/lctl dk > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.debug_log.\$(hostname -s).1436664129.log;
               dmesg > /logdir/test_logs/2015-07-11/lustre-reviews-el6_6-x86_64--review-dne-part-2--1_5_1__33266__-70239005896920-232641/replay-single.test_70b.dmesg.\$(hostname -s).1436664129.log
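
Since the client-side rename error is what separates this failure from LU-4439, a quick way to triage other test logs is to search them for that line. The following is a minimal sketch, not part of the Lustre test framework: the function name `find_rename_failures` and the regular expression are assumptions based only on the single log line quoted above, so other dbench versions may format the message differently.

```python
import re

# Hypothetical pattern, modeled on the one rename failure line quoted above:
#   <host>: [<line>] rename <src> <dst> failed (<reason>) - expected NT_STATUS_OK
RENAME_FAIL = re.compile(
    r"^(?P<host>\S+): \[(?P<line>\d+)\] rename (?P<src>\S+) (?P<dst>\S+) "
    r"failed \((?P<reason>[^)]*)\)"
)


def find_rename_failures(log_text):
    """Return one dict per dbench rename failure found in log_text."""
    hits = []
    for raw in log_text.splitlines():
        match = RENAME_FAIL.match(raw.strip())
        if match:
            hits.append(match.groupdict())
    return hits


# Sample lines copied from the client test log in this ticket.
sample = """\
onyx-30vm5: [4766] rename ./clients/client0/~dmtmp/WORD/~WRD3497.TMP ./clients/client0/~dmtmp/WORD/TIPS.DOC failed (No such file or directory) - expected NT_STATUS_OK
onyx-30vm5: ERROR: child 0 failed at line 4766
onyx-30vm6:    1      5047     0.23 MB/sec  execute  83 sec  latency 559.323 ms
"""

for hit in find_rename_failures(sample):
    print(hit["host"], hit["line"], hit["reason"])
```

A log with no matching line would return an empty list, which suggests the failure has a different signature and should be compared against LU-4439 instead.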
      

      Info required for matching: replay-single 70b

            People

              Assignee: Di Wang
              Reporter: James Nunez (Inactive)