[LU-713] performance-sanity fails to start MPI tests Created: 23/Sep/11  Updated: 04/Nov/12  Resolved: 04/Nov/12

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Minh Diep
Assignee: Chris Gearing (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Lustre Clients:
Tag: v2_1_0_0_RC2
Distro/Arch: Sles11/x86_64 (kernel version: 2.6.32.43-0.4-default)
Build: https://build.whamcloud.com/job/lustre-master/arch=x86_64,build_type=client,distro=sles11,ib_stack=inkernel/
Network: TCP
ENABLE_QUOTA=yes

Lustre Servers:
Tag: v2_1_0_0_RC2
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32-131.6.1.el6)
Build: https://build.whamcloud.com/job/lustre-master/283/arch=x86_64,build_type=cserver,distro=el6,ib_stack=inkernel/
Network: TCP


Issue Links:
Duplicate
Severity: 3
Rank (Obsolete): 6553

 Description   

Report: https://maloo.whamcloud.com/test_sets/6e29d8a4-e58c-11e0-9909-52540025f9af

===== mdsrate-create-small.sh ### 1 NODE CREATE ###
+ /usr/lib64/lustre/tests/mdsrate --create --time 600
--nfiles 447838 --dir /mnt/lustre/mdsrate/single --filefmt 'f%%d'
+ chmod 0777 /mnt/lustre
drwxrwxrwx 4 root root 4096 Sep 22 17:51 /mnt/lustre
+ su mpiuser sh -c "/usr/lib64/mpi/gcc/openmpi/bin/mpirun -mca boot ssh -mca btl tcp,self -np 1 -machinefile /tmp/mdsrate-create-small.machines /usr/lib64/lustre/tests/mdsrate --create --time 600 --nfiles 447838 --dir /mnt/lustre/mdsrate/single --filefmt 'f%%d' "
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_paffinity_base_select failed
--> Returned value -13 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[client-22vm1:27504] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77
[client-22vm1:27504] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 543
log: + chmod 0777 /mnt/lustre



 Comments   
Comment by Sarah Liu [ 05/Mar/12 ]

Hit the same issue while testing tag 2.1.56, so reopening this ticket:
https://maloo.whamcloud.com/test_sets/2f977604-627e-11e1-b462-5254004bbbd3

Comment by Minh Diep [ 05/Mar/12 ]

Sarah, please check whether this is a duplicate of TT-279.

Comment by Sarah Liu [ 04/Nov/12 ]

Duplicate of TT-786.

Generated at Sat Feb 10 01:09:42 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.