[LU-2570] test: metadata-updates: bind: Address family not supported by protocol Created: 03/Jan/13  Updated: 10/Jan/13  Resolved: 10/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Jay Lan (Inactive)
Assignee: Bob Glossman (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

Server: 2.1.3-1nasS, centos 6.3, 2.6.32_279.2.1.el6
Client: 2.3.0-2nasC, sles11sp2, 3.0.42_0.7.3
1 mds (S337), 2 oss (S361, S362), 2 clients (S331, S332)
git source at https://github.com/jlan/lustre-nas, branch nas-2.3.0, tag 2.3.0-2nasC.


Attachments: metadata-updates.suite_log.service331.log
Severity: 3
Rank (Obsolete): 6002

 Description   

The metadata-updates test failed in Part 5 at:
+ mpirun_rsh -ssh -np 1 -hostfile /var/acc-sm/metadata-updates.machines /usr/lib64/lustre/tests/write_disjoint -f /mnt/nbp0-1/d0.metadata-updates/f0.write_disjoint_file -n 1000
bind: Address family not supported by protocol

Complete test log is attached.
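
For context, the message has the shape that C's perror("bind") prints when errno is EAFNOSUPPORT. The following is a minimal Python sketch of that mapping, assuming a Linux client with glibc; the helper name perror_style is hypothetical, and the sketch does not identify which program in the mpirun_rsh launch chain made the failing call.

```python
import errno
import os


def perror_style(prefix: str, err: int) -> str:
    """Format an errno value the way C's perror(prefix) would (hypothetical helper)."""
    return f"{prefix}: {os.strerror(err)}"


if __name__ == "__main__":
    # Symbolic name of the errno value behind the message in the log.
    print(errno.errorcode[errno.EAFNOSUPPORT])
    # On Linux with glibc this is expected to match the line from the test log;
    # other platforms may word the message differently.
    print(perror_style("bind", errno.EAFNOSUPPORT))
```

Because the prefix only names the failing call and not the program that made it, the message alone cannot show whether write_disjoint or the launcher produced it.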



 Comments   
Comment by Peter Jones [ 03/Jan/13 ]

Bob

Can you please look into this one?

Thanks

Peter

Comment by Bob Glossman (Inactive) [ 03/Jan/13 ]

Jay, from the log you've posted the command that's failing is mpirun_rsh. That doesn't seem to be part of openmpi. I can't find it anywhere. Is this something specific to your environment? Is it a script that invokes mpirun with some options? Without knowing more about it I can't figure out what it's complaining about.

Comment by Jay Lan (Inactive) [ 03/Jan/13 ]

I will look into it tomorrow. I originally thought the "bind: Address family not supported by protocol" message came from the write_disjoint binary.

Comment by Jay Lan (Inactive) [ 04/Jan/13 ]

I started to suspect it was caused by running an sles11sp2 kernel in an sles11sp1 environment. I do not have an sles11sp2 image for my test system, so I installed the sles11sp2 kernel in an sles11sp1 environment. This may be a compatibility issue between the kernel and the run-time libraries.

I will post when I have more information.

Comment by Jay Lan (Inactive) [ 04/Jan/13 ]

The same problem still occurred in a clean sles11sp2 environment.

I will look into mpirun_rsh next.

Comment by Jay Lan (Inactive) [ 10/Jan/13 ]

The problem was indeed caused by our version of mpirun_rsh. Please close this ticket.

Comment by Bob Glossman (Inactive) [ 10/Jan/13 ]

Closing due to customer request. Problem was due to a feature of the customer's environment.
