Details
-
Bug
-
Resolution: Unresolved
-
Medium
-
None
-
Lustre 2.17.0
-
3
-
9223372036854775807
Description
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/2958de6b-8eb7-4876-9f8c-b5110085dfe6
test_write_append_truncate failed with the following error:
== parallel-scale test write_append_truncate: write_append_truncate ========================================================== 07:18:02 (1764055082)
CMD: trevis-154vm220.trevis.whamcloud.com getent passwd mpiuser |
cut -d: -f3; exit \${PIPESTATUS[0]}
mpi_user=mpiuser MPI_USER_UID=532
CMD: trevis-154vm220.trevis.whamcloud.com getent passwd mpiuser |
cut -d: -f4; exit \${PIPESTATUS[0]}
mpi_user=mpiuser MPI_USER_GID=60000
OPTIONS:
clients=trevis-154vm220.trevis.whamcloud.com,trevis-154vm221
write_REP=10000
write_THREADS=8
MACHINEFILE=/tmp/auster.machines
trevis-154vm220.trevis.whamcloud.com
trevis-154vm221
stripe_count: 1 stripe_size: 4194304 pattern: 0 stripe_offset: -1
+ /usr/lib64/openmpi/bin/write_append_truncate -n 10000 -u /mnt/lustre/d0.write_append_truncate/f0.wat
+ chmod 0777 /mnt/lustre
drwxrwxrwx 5 root root 69632 Nov 25 07:18 /mnt/lustre
+ su mpiuser bash -c "/usr/lib64/openmpi/bin/mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -mca boot ssh --oversubscribe -machinefile /tmp/auster.machines -np 16 /usr/lib64/openmpi/bin/write_append_truncate -n 10000 -u /mnt/lustre/d0.write_append_truncate/f0.wat "
r= 0: create /mnt/lustre/d0.write_append_truncate/f0.wat, max size: 3703701, seed 1764055083: ok
r= 0 l=0000: WR A 1188921/0x122439, AP a 685653/0x0a7655, TR@ 1393587/0x1543b3
r= 0 l=1000: WR M 697/0x0002b9, AP m 11991/0x002ed7, TR@ 1192115/0x1230b3
r= 0 l=2000: WR Y 873144/0x0d52b8, AP y 1028579/0x0fb1e3, TR@ 1561944/0x17d558
r= 0 l=3000: WR K 678444/0x0a5a2c, AP k 1180802/0x120482, TR@ 1805942/0x1b8e76
r= 0 l=4000: WR W 441860/0x06be04, AP w 154962/0x025d52, TR@ 1047681/0x0ffc81
r= 0 l=5000: WR I 722466/0x0b0622, AP i 3319/0x000cf7, TR@ 1070187/0x10546b
r= 0 l=5603: trunc-after-APPEND bad [252367-296578]/[0x3d9cf-0x48682] != n
r= 0 l=5603: WR N 252367/0x03d9cf, AP n 1012038/0x0f7146, TR@ 296579/0x048683
000000 N N N N N N N N N N N N N N N N
*
03d9c0 N N N N N N N N N N N N N N N nul
03d9d0 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
048680 nul nul nul
048683
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
Proc: [[26479,1],0]
Errorcode: 1
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[trevis-154vm220:00000] *** An error occurred in Socket closed
[trevis-154vm220:00000] *** reported by process [1735327745,12]
[trevis-154vm220:00000] *** on a NULL communicator
[trevis-154vm220:00000] *** Unknown error
[trevis-154vm220:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[trevis-154vm220:00000] *** and MPI will try to terminate your MPI job as well)
parallel-scale test_write_append_truncate: @@@@@@ FAIL: write_append_truncate failed! 1
Test session details:
clients: https://build.whamcloud.com/job/lustre-master/4674 - 6.12.0-55.37.1.el10_0.x86_64
servers: https://build.whamcloud.com/job/lustre-master/4674 - 4.18.0-553.76.1.el8_lustre.x86_64
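The l=5603 failure in the log above can be reproduced arithmetically. The following is a minimal sketch, not code from write_append_truncate.c: it assumes the loop wrote 252367 bytes of 'N' at offset 0, appended 1012038 bytes of 'n', and truncated to 296579 last, which is what the "trunc-after-APPEND" label and the final dump size 0x048683 suggest. The helper names `expected_contents` and `bad_range` are hypothetical, and the observed buffer is rebuilt from the hexdump rather than read from the file.

```python
def expected_contents(write_len, write_ch, append_len, append_ch, trunc_pos):
    """Model the file after: write at offset 0, append, then truncate.

    Assumption: the truncate is applied last. Truncating up pads with NULs,
    truncating down cuts the appended data.
    """
    data = write_ch * write_len + append_ch * append_len
    if trunc_pos <= len(data):
        return data[:trunc_pos]
    return data + b"\0" * (trunc_pos - len(data))


def bad_range(actual, expected):
    """Return (first, last) mismatching byte offsets, or None if identical."""
    if len(actual) != len(expected):
        raise ValueError("size mismatch: %d != %d" % (len(actual), len(expected)))
    bad = [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
    return (bad[0], bad[-1]) if bad else None


# Values taken from the log line:
#   r= 0 l=5603: WR N 252367/0x03d9cf, AP n 1012038/0x0f7146, TR@ 296579/0x048683
WR_LEN, AP_LEN, TR_POS = 252367, 1012038, 296579

expected = expected_contents(WR_LEN, b"N", AP_LEN, b"n", TR_POS)

# Observed contents as shown by the hexdump: 'N' up to 0x03d9ce, NUL from
# 0x03d9cf to 0x048682, end of file at 0x048683.
observed = b"N" * WR_LEN + b"\0" * (TR_POS - WR_LEN)

first, last = bad_range(observed, expected)
print(f"bad [{first}-{last}]/[{first:#x}-{last:#x}]")
```

Under these assumptions the sketch reports the same range as the test, [252367-296578]/[0x3d9cf-0x48682]: the written 'N' data is intact and the file size matches the truncate position, but the 44212 bytes that should hold appended 'n' data read back as NUL.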
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
parallel-scale test_write_append_truncate - write_append_truncate failed! 1