Lustre / LU-6919

replay-single test_70b: "Cannot send after transport endpoint shutdown" running dbench

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.7.0
    • None
    • 3
    • 9223372036854775807

    Description

      The test was run for 50 iterations, of which 4 failed.

      fre0107: 1 6274 0.00 MB/sec execute 143 sec latency 198693.612 ms
      fre0107: [6274] open ./clients/client0/~dmtmp/WORDPRO/BENCHS.LWP failed for handle 11182 (Cannot send after transport endpoint shutdown)
      fre0107: [6277] open ./clients/client0/~dmtmp/WORDPRO/BENCHS.LWP failed for handle 11183 (Cannot send after transport endpoint shutdown)
      fre0107: (6278) ERROR: handle 11183 was not found
      fre0107: Child failed with status 1
      fre0107: status script Total(sec) E(xcluded) S(low)
      fre0107: ------------------------------------------------------------------------------------
      fre0107:
      fre0107: touch: missing file operand
      fre0107: Try `touch --help' for more information.
      pdsh@fre0107: fre0107: ssh exited with exit code 1
      fre0108: [6481] unlink ./clients/client0/~dmtmp/WORDPRO/BENCHS1.LWP failed (Cannot send after transport endpoint shutdown) - expected NT_STATUS_OK
      fre0108: ERROR: child 0 failed at line 6481
      fre0108: Child failed with status 1
      fre0108: status script Total(sec) E(xcluded) S(low)
      fre0108: ------------------------------------------------------------------------------------
      fre0108:
      fre0108: touch: missing file operand
      fre0108: Try `touch --help' for more information.
      pdsh@fre0107: fre0108: ssh exited with exit code 1
      fre0108: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 182 sec
      fre0107: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 182 sec
      fre0108: dbench: no process killed
      fre0107: dbench: no process killed
      pdsh@fre0107: fre0108: ssh exited with exit code 1
      pdsh@fre0107: fre0107: ssh exited with exit code 1
      replay-single test_70b: @@@@@@ FAIL: dbench stopped on some of fre0107,fre0108!
      Trace dump:
      = /usr/lib64/lustre/tests/test-framework.sh:4732:error_noexit()
      = /usr/lib64/lustre/tests/replay-single.sh:2080:test_70b()
      = /usr/lib64/lustre/tests/test-framework.sh:5010:run_one()
      = /usr/lib64/lustre/tests/test-framework.sh:5047:run_one_logged()
      = /usr/lib64/lustre/tests/test-framework.sh:4864:run_test()
      = /usr/lib64/lustre/tests/replay-single.sh:2101:main()
      Dumping lctl log to /tmp/test_logs/1437990212/replay-single.test_70b.*.1437990422.log
      fre0106: Warning: Permanently added 'fre0107,192.168.101.7' (RSA) to the list of known hosts.

      fre0108: Warning: Permanently added 'fre0107,192.168.101.7' (RSA) to the list of known hosts.

      fre0105: Warning: Permanently added 'fre0107,192.168.101.7' (RSA) to the list of known hosts.

      fre0107: dbench: no process killed
      fre0108: dbench: no process killed
      pdsh@fre0107: fre0107: ssh exited with exit code 1
      pdsh@fre0107: fre0108: ssh exited with exit code 1
      replay-single test_70b: @@@@@@ FAIL: rundbench load on fre0107,fre0108 failed!
      Trace dump:
      = /usr/lib64/lustre/tests/test-framework.sh:4732:error_noexit()
      = /usr/lib64/lustre/tests/test-framework.sh:4763:error()
      = /usr/lib64/lustre/tests/replay-single.sh:2099:test_70b()
      = /usr/lib64/lustre/tests/test-framework.sh:5010:run_one()
      = /usr/lib64/lustre/tests/test-framework.sh:5047:run_one_logged()
      = /usr/lib64/lustre/tests/test-framework.sh:4864:run_test()
      = /usr/lib64/lustre/tests/replay-single.sh:2101:main()
      Dumping lctl log to /tmp/test_logs/1437990212/replay-single.test_70b.*.1437990424.log
      FAIL 70b (208s)
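When triaging repeated runs like this one, it can help to pull out only the dbench operations that failed with the shutdown error and group them by client. The following is a minimal sketch, not part of the Lustre test framework: the function name `summarize_failures`, the regular expression, and the `SAMPLE` text (three lines copied from the log above) are illustrative assumptions about the line layout seen in this report.

```python
import re
from collections import Counter

# Error text exactly as it appears in the dbench output above.
SHUTDOWN_MSG = "Cannot send after transport endpoint shutdown"

# Assumed line layout: "<host>: [<dbench line>] <op> <path> failed ..."
LINE_RE = re.compile(
    r"^\s*(?P<host>[\w.-]+):\s+\[(?P<line>\d+)\]\s+(?P<op>\w+)\s+(?P<path>\S+)\s+failed"
)


def summarize_failures(log_text):
    """Return (per-host counts, list of (host, op, path)) for shutdown errors."""
    counts = Counter()
    events = []
    for raw in log_text.splitlines():
        if SHUTDOWN_MSG not in raw:
            continue
        match = LINE_RE.match(raw)
        if match is None:
            # Line mentions the error but does not follow the assumed layout.
            continue
        counts[match["host"]] += 1
        events.append((match["host"], match["op"], match["path"]))
    return counts, events


# Three lines copied from the log in this report.
SAMPLE = """\
fre0107: [6274] open ./clients/client0/~dmtmp/WORDPRO/BENCHS.LWP failed for handle 11182 (Cannot send after transport endpoint shutdown)
fre0107: [6277] open ./clients/client0/~dmtmp/WORDPRO/BENCHS.LWP failed for handle 11183 (Cannot send after transport endpoint shutdown)
fre0108: [6481] unlink ./clients/client0/~dmtmp/WORDPRO/BENCHS1.LWP failed (Cannot send after transport endpoint shutdown) - expected NT_STATUS_OK
"""

if __name__ == "__main__":
    per_host, failed_ops = summarize_failures(SAMPLE)
    for host, count in sorted(per_host.items()):
        print(host, count)
```

Running the same function over the full console output of each failing iteration would show whether the errors always hit the same files and clients or vary from run to run.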

      Attachments

        1. 70b__2.lctl.tgz
          1.39 MB
          Aditya Pandit
        2. LU-6919-client1.txt
          115 kB
          Aditya Pandit
        3. LU-6919-client2.txt
          120 kB
          Aditya Pandit
        4. LU-6919-MGS.txt
          147 kB
          Aditya Pandit
        5. LU-6919-OST1.txt
          161 kB
          Aditya Pandit

            People

              wc-triage WC Triage
              aditya.pandit@seagate.com Aditya Pandit (Inactive)
              Votes: 0
              Watchers: 6
