  1. Lustre
  2. LU-1360

Test failure on test suite parallel-scale-nfsv3, subtest test_metabench

Details

    • Bug
    • Resolution: Won't Fix
    • Blocker
    • None
    • Lustre 2.1.2, Lustre 2.1.3, Lustre 2.1.4, Lustre 2.1.5, Lustre 2.1.6
    • 3
    • 4036

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/b019eb0a-929d-11e1-9e8b-525400d2bfa6.

      The sub-test test_metabench failed with the following error:

      metabench failed! 1

      == parallel-scale-nfsv3 test metabench: metabench ==================================================== 18:11:14 (1335748274)
      OPTIONS:
      METABENCH=/usr/bin/metabench
      clients=iu-3vm1.lab.whamcloud.com,iu-3vm2
      mbench_NFILES=30400
      mbench_THREADS=4
      iu-3vm1.lab.whamcloud.com
      iu-3vm2
      + /usr/bin/metabench -w /mnt/lustre/d0.metabench -c 30400 -C -S -k
      + chmod 0777 /mnt/lustre
      drwxrwxrwx 4 root root 4096 Apr 29 18:11 /mnt/lustre
      + su mpiuser sh -c "/usr/lib/openmpi/1.4-gcc/bin/mpirun -mca boot ssh -mca btl tcp,self -np 8 -machinefile /tmp/parallel-scale-nfsv3.machines /usr/bin/metabench -w /mnt/lustre/d0.metabench -c 30400 -C -S -k "
      Metadata Test <no-name> on 04/29/2012 at 18:11:19

      Rank 0 process on node iu-3vm1.lab.whamcloud.com
      Rank 1 process on node iu-3vm2.lab.whamcloud.com
      Rank 2 process on node iu-3vm1.lab.whamcloud.com
      Rank 3 process on node iu-3vm2.lab.whamcloud.com
      Rank 4 process on node iu-3vm1.lab.whamcloud.com
      Rank 5 process on node iu-3vm2.lab.whamcloud.com
      Rank 6 process on node iu-3vm1.lab.whamcloud.com
      Rank 7 process on node iu-3vm2.lab.whamcloud.com

      [04/29/2012 18:11:19] FATAL error on process 0
      Proc 0: Cant stat [d0.metabench]: Value too large for defined data type
      --------------------------------------------------------------------------
      mpirun has exited due to process rank 0 with PID 2161 on
      node iu-3vm1.lab.whamcloud.com exiting without calling "finalize". This may
      have caused other processes in the application to be
      terminated by signals sent by mpirun (as reported here).
      --------------------------------------------------------------------------
      [iu-3vm2.lab.whamcloud.com][[3254,1],1][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] [iu-3vm1.lab.whamcloud.com][[3254,1],2][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
      [iu-3vm1.lab.whamcloud.com][[3254,1],4][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
      mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
      parallel-scale-nfsv3 test_metabench: @@@@@@ FAIL: metabench failed! 1
      Dumping lctl log to /logdir/test_logs/2012-04-28/lustre-b2_1-el5-x86_64-el5-i686_51_-7ff324267018/parallel-scale-nfsv3.test_metabench.*.1335748279.log
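
The FATAL line in the log above reports only the error text, not the errno symbol. As a hedged aid for triage, the sketch below looks up which errno symbol(s) produce a given message on the local C library. The helper name `errno_names_for_message` is hypothetical and not part of the test suite. On glibc-based Linux systems the reported text is normally the one associated with EOVERFLOW, but that is an assumption about the platform; the sketch only maps the message to a symbol and says nothing about the root cause.

```python
import errno
import os


def errno_names_for_message(message):
    """Return errno symbols whose local strerror() text equals `message`.

    Hypothetical helper for illustration only; the result depends on the
    C library of the machine where it runs.
    """
    names = []
    for code, name in sorted(errno.errorcode.items()):
        if os.strerror(code) == message:
            names.append(name)
    return names


# Message copied from the "Cant stat [d0.metabench]" line in the log.
reported = "Value too large for defined data type"
print(errno_names_for_message(reported))
```

Running this on the client that failed, rather than on a workstation, avoids differences in message text between C libraries.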

            People

              bogl Bob Glossman (Inactive)
              maloo Maloo