[LU-3785] lnet-selftest: test_smoke hung at "6 batch in stopping"

    Description

      lnet-selftest test_smoke hung as follows:

      Failed to stat on 6 nodes
      [LNet Rates of c]
      [R] Avg: 191      RPC/s Min: 183      RPC/s Max: 199      RPC/s
      [W] Avg: 196      RPC/s Min: 181      RPC/s Max: 211      RPC/s
      [LNet Bandwidth of c]
      [R] Avg: 18.91    MB/s  Min: 17.58    MB/s  Max: 20.24    MB/s
      [W] Avg: 11.84    MB/s  Min: 10.29    MB/s  Max: 13.39    MB/s
      killing 24261 ...
      /tmp/smoke.sh: line 87: 24261 Killed                  /usr/sbin/lst stat --delay 10 --timeout 10 c s
      c:
      Total 0 error nodes in c
      RPC failure, can't show error on 12345-10.10.17.58@tcp
      RPC failure, can't show error on 12345-10.10.17.59@tcp
      RPC failure, can't show error on 12345-10.10.17.60@tcp
      RPC failure, can't show error on 12345-10.10.17.61@tcp
      RPC failure, can't show error on 12345-10.10.17.64@tcp
      RPC failure, can't show error on 12345-10.10.17.65@tcp
      s:
      Total 6 error nodes in s
      8 batch in stopping
      7 batch in stopping
      6 batch in stopping
      

      Maloo report: https://maloo.whamcloud.com/test_sets/01bbbfdc-092c-11e3-a9b0-52540035b04c
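
When comparing this hang against the later Maloo instances, it can help to reduce each console log to a few facts: which NIDs could not be queried, how many error nodes each group reported, and where the "batch in stopping" countdown stalled. The following is only a minimal sketch for triage, not part of lnet-selftest; the `summarize` function and the `LOG` sample (copied from the excerpt above) are hypothetical helpers, and the regular expressions assume the message formats shown in this report.

```python
import re

# Sample taken from the console output quoted in this report.
LOG = """\
c:
Total 0 error nodes in c
RPC failure, can't show error on 12345-10.10.17.58@tcp
RPC failure, can't show error on 12345-10.10.17.59@tcp
RPC failure, can't show error on 12345-10.10.17.60@tcp
RPC failure, can't show error on 12345-10.10.17.61@tcp
RPC failure, can't show error on 12345-10.10.17.64@tcp
RPC failure, can't show error on 12345-10.10.17.65@tcp
s:
Total 6 error nodes in s
8 batch in stopping
7 batch in stopping
6 batch in stopping
"""


def summarize(log):
    """Extract unreachable NIDs, per-group error counts, and stop progress."""
    # NIDs for which the error query itself failed.
    nids = re.findall(r"RPC failure, can't show error on (\S+)", log)
    # Error-node totals keyed by group name (assumed format: "Total N error nodes in G").
    error_nodes = {
        group: int(count)
        for count, group in re.findall(r"Total (\d+) error nodes in (\S+)", log)
    }
    # Countdown values printed while batches are being stopped.
    stopping = [
        int(n) for n in re.findall(r"^\s*(\d+) batch in stopping", log, re.MULTILINE)
    ]
    return {
        "unreachable_nids": nids,
        "error_nodes": error_nodes,
        "stopping_counts": stopping,
        # The last value seen is where the countdown stalled, if the log ends there.
        "stalled_at": stopping[-1] if stopping else None,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(key, value)
```

Running the same function over the logs from the other reported instances would show whether the countdown always stalls at the same value and whether the same six NIDs are involved.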

        Activity

          adilger Andreas Dilger added a comment - Close old bug, not seen since 2.5.
          yujian Jian Yu added a comment -

          Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/5/
          Distro/Arch: RHEL6.4/x86_64(server), SLES11SP3/x86_64(client)
          MDSCOUNT=1

          The same failure occurred:
          https://maloo.whamcloud.com/test_sets/ea6de240-7642-11e3-b3c0-52540035b04c

          yujian Jian Yu added a comment -

          Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/59/
          Distro/Arch: RHEL6.4/x86_64 (server), SLES11SP2/x86_64 (client)
          MDSCOUNT=1

          The same failure occurred:
          https://maloo.whamcloud.com/test_sets/6d5cdb92-5803-11e3-b1ae-52540035b04c

          yujian Jian Yu added a comment -

          Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)
          MDSCOUNT=4

          lnet-selftest failed again:
          https://maloo.whamcloud.com/test_sets/ae4a21b6-1657-11e3-aa2a-52540035b04c

          yujian Jian Yu added a comment - Another instance with DNE configuration: https://maloo.whamcloud.com/test_sets/d75f8c58-0284-11e3-b384-52540035b04c

          People

            wc-triage WC Triage
            yujian Jian Yu