Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.1.4
- None
- Lustre Branch: b2_1
  Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/148
  Distro/Arch: RHEL5.8/x86_64 (kernel version: 2.6.18-308.20.1.el5)
  Network: TCP (1GigE)
- 3
- 2,304
- 5793
Description
The parallel-scale test write_disjoint failed as follows:
== parallel-scale test write_disjoint: write_disjoint ================================================ 14:32:34 (1355005954)
OPTIONS:
WRITE_DISJOINT=/usr/lib64/lustre/tests/write_disjoint
clients=fat-intel-3vm5,fat-intel-3vm6.lab.whamcloud.com
wdisjoint_THREADS=4
wdisjoint_REP=10000
MACHINEFILE=/tmp/parallel-scale.machines
fat-intel-3vm5
fat-intel-3vm6.lab.whamcloud.com
+ /usr/lib64/lustre/tests/write_disjoint -f /mnt/lustre/d0.write_disjoint/file -n 10000
+ chmod 0777 /mnt/lustre
drwxrwxrwx 5 root root 4096 Dec 8 14:32 /mnt/lustre
+ su mpiuser sh -c "/usr/lib64/openmpi/1.4-gcc/bin/mpirun -mca boot ssh -np 8 -machinefile /tmp/parallel-scale.machines /usr/lib64/lustre/tests/write_disjoint -f /mnt/lustre/d0.write_disjoint/file -n 10000 "
--------------------------------------------------------------------------
[[22376,1],3]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: fat-intel-3vm6.lab.whamcloud.com
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
loop 0: chunk_size 103399
rank 3, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 4 in communicator MPI_COMM_WORLD
with errorcode -1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
rank 4, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8
rank 7, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8
--------------------------------------------------------------------------
Maloo report: https://maloo.whamcloud.com/test_sets/bfc081dc-41bf-11e2-a653-52540035b04c
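The sizes in the failure lines can be checked with a little arithmetic. The sketch below is not part of the Lustre test suite; `LOG_LINES`, `PATTERN`, and `analyze` are hypothetical names introduced only for illustration. It parses the three "invalid file size" lines and shows that every reporting rank saw a file exactly one chunk short (7 of 8 chunks). That would be consistent with one rank's chunk not yet being reflected in the file size when the other ranks checked it, assuming each rank writes a single chunk_size-byte chunk at its own offset, but the log alone does not establish the cause.

```python
import re

# Failure lines copied from the log in this report.
LOG_LINES = [
    "rank 3, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8",
    "rank 4, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8",
    "rank 7, loop 0: invalid file size 723793 instead of 827192 = 103399 * 8",
]

# Matches the message format seen in the log above.
PATTERN = re.compile(
    r"rank (\d+), loop (\d+): invalid file size (\d+) "
    r"instead of (\d+) = (\d+) \* (\d+)"
)


def analyze(line):
    """Return (rank, loop, missing_bytes, missing_chunks, leftover_bytes)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an 'invalid file size' line: %r" % line)
    rank, loop, actual, expected, chunk, nprocs = map(int, match.groups())
    # Sanity check: the log itself states expected = chunk * nprocs.
    if expected != chunk * nprocs:
        raise ValueError("inconsistent expected size in line: %r" % line)
    missing = expected - actual
    missing_chunks, leftover = divmod(missing, chunk)
    return (rank, loop, missing, missing_chunks, leftover)


if __name__ == "__main__":
    for line in LOG_LINES:
        rank, loop, missing, chunks, leftover = analyze(line)
        print("rank %d, loop %d: short by %d bytes = %d chunk(s) + %d bytes"
              % (rank, loop, missing, chunks, leftover))
```

All three lines report the same shortfall of 103399 bytes, which is one whole chunk with no remainder, so the observed size is not a partial write within a chunk.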
Issue Links
- is duplicated by: LU-2452 parallel-scale test_write_append_truncate: trunc-after-APPEND bad [435936-441669]/[0x6a6e0-0x6bd45] != c (Closed)