Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.10.0
- None
- 3
- 9223372036854775807
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/45d48942-2507-11e7-9de9-5254006e85c2.
The sub-test test_cascading_rw failed with the following error:
cascading_rw failed! 1
server/client: lustre-master #3558 ldiskfs el7
test log
+ su mpiuser sh -c "/usr/lib64/compat-openmpi16/bin/mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -mca boot ssh -machinefile /tmp/parallel-scale.machines -np 4 /usr/lib64/lustre/tests/cascading_rw -g -d /mnt/lustre/d0.cascading_rw -n 300 "
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter file.
Deprecated MCA parameters should be avoided; they may disappear in future releases.
Deprecated parameter: plm_rsh_agent
--------------------------------------------------------------------------
/usr/lib64/lustre/tests/cascading_rw is running with 4 process(es) in DEBUG mode
23:47:45: Running test #/usr/lib64/lustre/tests/cascading_rw(iter 0)
[trevis-55vm1:21694] *** Process received signal ***
[trevis-55vm1:21694] Signal: Floating point exception (8)
[trevis-55vm1:21694] Signal code: Integer divide-by-zero (1)
[trevis-55vm1:21694] Failing at address: 0x4024c8
[trevis-55vm1:21694] [ 0] /lib64/libpthread.so.0(+0xf370) [0x7fdf9fad6370]
[trevis-55vm1:21694] [ 1] /usr/lib64/lustre/tests/cascading_rw() [0x4024c8]
[trevis-55vm1:21694] [ 2] /usr/lib64/lustre/tests/cascading_rw() [0x402be0]
[trevis-55vm1:21694] [ 3] /usr/lib64/lustre/tests/cascading_rw() [0x40158e]
[trevis-55vm1:21694] [ 4] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fdf9f727b35]
[trevis-55vm1:21694] [ 5] /usr/lib64/lustre/tests/cascading_rw() [0x40169d]
[trevis-55vm1:21694] *** End of error message ***
[trevis-55vm1.trevis.hpdd.intel.com][[36239,1],2][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[trevis-55vm2.trevis.hpdd.intel.com][[36239,1],1][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 21694 on node trevis-55vm1.trevis.hpdd.intel.com exited on signal 8 (Floating point exception).
--------------------------------------------------------------------------
parallel-scale test_cascading_rw: @@@@@@ FAIL: cascading_rw failed! 1
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4905:error()
= /usr/lib64/lustre/tests/functions.sh:734:run_cascading_rw()
= /usr/lib64/lustre/tests/parallel-scale.sh:130:test_cascading_rw()
= /usr/lib64/lustre/tests/test-framework.sh:5181:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5220:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:5067:run_test()
= /usr/lib64/lustre/tests/parallel-scale.sh:132:main()
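The backtrace in the log gives raw addresses only, so one possible triage step is to map the frames inside cascading_rw back to source lines with addr2line. The sketch below is not part of the original report. It is a minimal Python helper that pulls the frames out of a log excerpt and builds the corresponding addr2line command lines. The helper names (parse_frames, addr2line_commands) are hypothetical. It assumes the binary was built with debug information and is not position-independent, so that the logged addresses can be passed to addr2line unchanged.

```python
import re

# Matches Open MPI backtrace frames such as:
#   [trevis-55vm1:21694] [ 1] /usr/lib64/lustre/tests/cascading_rw() [0x4024c8]
FRAME_RE = re.compile(r"\[\s*(\d+)\]\s+(\S+?)\((.*?)\)\s+\[(0x[0-9a-fA-F]+)\]")


def parse_frames(log_text):
    """Return (frame_no, binary, symbol_offset, address) tuples found in log_text."""
    return [
        (int(num), path, sym, addr)
        for num, path, sym, addr in FRAME_RE.findall(log_text)
    ]


def addr2line_commands(frames, binary):
    """Build addr2line command lines for the frames that belong to `binary`.

    Assumption: `binary` has debug info and is non-PIE, so the logged
    absolute addresses are valid inputs for addr2line.
    """
    return [
        f"addr2line -f -e {binary} {addr}"
        for _, path, _, addr in frames
        if path == binary
    ]


if __name__ == "__main__":
    # Excerpt copied from the test log above.
    sample = """\
[trevis-55vm1:21694] [ 0] /lib64/libpthread.so.0(+0xf370) [0x7fdf9fad6370]
[trevis-55vm1:21694] [ 1] /usr/lib64/lustre/tests/cascading_rw() [0x4024c8]
[trevis-55vm1:21694] [ 2] /usr/lib64/lustre/tests/cascading_rw() [0x402be0]
"""
    frames = parse_frames(sample)
    for cmd in addr2line_commands(frames, "/usr/lib64/lustre/tests/cascading_rw"):
        print(cmd)
```

Frame 1 (0x4024c8) matches the "Failing at address" line, so it is the frame most likely to point at the integer division that raised the signal.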