[LU-3875] Lustre client hanging due to :client_bulk_callback() and :ptlrpc_expire_one_request from OST to MDS nid time out Created: 04/Sep/13  Updated: 09/Jan/20  Resolved: 09/Jan/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: hithesh kumar Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Attachments: File client-messges     File mds1-messges     File mds2-messges    
Severity: 4
Rank (Obsolete): 10056

 Comments   
Comment by hithesh kumar [ 04/Sep/13 ]

mds1 is the MDS serving the MDT. mds2 is the failover node for mds1 and also serves around 7 OSTs, for which mds1 is the failover node. We are seeing Lustre mounts hang on the client. I have attached /var/log/messages from the client and from mds2 (which acts as OSS for the 7 OSTs and as the MDT failover node). Lustre 2.1.3 is installed on all MDS and OSS servers, and the Lustre patchless client is installed on the clients.

Comment by hithesh kumar [ 04/Sep/13 ]

173.16.1.50 is mds1, 173.16.1.51 is mds2, 173.16.1.52 is oss1, and 173.16.1.53 is oss2. We have around 34 client machines, with IPs ranging from 173.16.1.220 to 173.16.1.254. The native CentOS 6.1 InfiniBand RPMs are installed.
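
When reading the attached logs, it may help to translate the NIDs in timeout messages back to host roles. The following is a minimal sketch built only from the addresses listed above. The helper name `role_for_nid` is hypothetical, and the `@o2ib` network suffix in the usage example is an assumption (the usual LNet name for InfiniBand), not something taken from the logs.

```python
import ipaddress

# Server addresses as listed in the comment above.
SERVERS = {
    "173.16.1.50": "mds1",
    "173.16.1.51": "mds2",
    "173.16.1.52": "oss1",
    "173.16.1.53": "oss2",
}

# Client address range as stated in the comment above.
CLIENT_FIRST = ipaddress.ip_address("173.16.1.220")
CLIENT_LAST = ipaddress.ip_address("173.16.1.254")


def role_for_nid(nid: str) -> str:
    """Return a host label for a NID such as '173.16.1.51@o2ib'.

    Hypothetical helper: accepts a bare IP or an 'address@network' NID.
    """
    addr = nid.split("@", 1)[0]
    if addr in SERVERS:
        return SERVERS[addr]
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return "unknown"
    if CLIENT_FIRST <= ip <= CLIENT_LAST:
        return "client"
    return "unknown"


if __name__ == "__main__":
    # The @o2ib suffix is assumed for illustration only.
    for nid in ("173.16.1.50@o2ib", "173.16.1.51@o2ib", "173.16.1.230@o2ib"):
        print(nid, "->", role_for_nid(nid))
```

With this mapping, a timeout logged against 173.16.1.51 points at mds2, which per the description above serves both as OSS for the 7 OSTs and as the MDT failover node.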

Comment by Andreas Dilger [ 09/Jan/20 ]

Close old bug
