Details
Bug
Resolution: Fixed
Blocker
Lustre 2.0.0, Lustre 2.1.0
None
3
23,175
5096
Description
+ su mpiuser sh -c "/opt/mpich/ch-p4/bin/mpirun -np 12 -machinefile /tmp/parallel-scale.machines /usr/lib64/lustre/tests/write_disjoint -f /mnt/lustre/d0.write_disjoint/file -n 10000 "
[0] MPI Abort by user Aborting program !
loop 0: chunk_size 103399
loop 544: chunk_size 113838, file size was 1366056
rank 0, loop 545: invalid file size 528737 instead of 576804 = 48067 * 12
[0] Aborting program!
p4_error: latest msg from perror: Resource temporarily unavailable
Reproduced at Oracle, and I have also seen similar failures locally.
This could be related to LU-67.
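For reference, the arithmetic in the failing line can be checked with a few lines of Python. This is only a sketch of the size check as the log message states it (expected size = chunk_size * number of processes, with -np 12); the function name is hypothetical and this is not the write_disjoint source.

```python
def size_shortfall(observed, chunk_size, nprocs):
    """Compare an observed file size with chunk_size * nprocs.

    Hypothetical helper for illustration only. Returns the expected
    size, the number of missing bytes, and the shortfall expressed
    as (whole chunks, leftover bytes).
    """
    expected = chunk_size * nprocs
    missing = expected - observed
    whole_chunks, leftover = divmod(missing, chunk_size)
    return expected, missing, whole_chunks, leftover


# Values taken from the "loop 545" line in the log above.
expected, missing, whole_chunks, leftover = size_shortfall(528737, 48067, 12)
print(f"expected={expected} missing={missing} "
      f"chunks_short={whole_chunks} leftover={leftover}")
```

With these numbers the observed size is short by exactly one chunk (528737 = 48067 * 11). That would be consistent with one rank's write not yet being reflected in the file size seen by rank 0, but the log alone does not confirm this.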