[LU-1065] High rate of obd_ping failure with client <-> OST evictions Created: 01/Feb/12  Updated: 13/Apr/12  Resolved: 13/Apr/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.x (1.8.0 - 1.8.5)
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Mahmoud Hanafi Assignee: Hongchao Zhang
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Clients: Lustre Version: 1.8.6.81
Servers: 2.1.0 with ofed 1.5.3.1


Attachments: File r8610n14.lustre     File service151.lustre.feb1    
Issue Links:
Duplicate
duplicates LU-874 Client eviction on lock callback time... Resolved
Severity: 3
Rank (Obsolete): 6470

 Description   

We are seeing a large number of obd_ping failures, followed by client evictions, on our 2.1 servers relative to our 1.8.6/5 servers.
We have one 2.1 filesystem and six 1.8.x filesystems.
Here are the obd_ping failure counts for each date:
---- lustre-20120125 -----
nbp5 1022
allothers 24
---- lustre-20120126 -----
nbp5 760
allothers 33
---- lustre-20120127 -----
nbp5 420
allothers 6
---- lustre-20120128 -----
nbp5 226
allothers 97
---- lustre-20120129 -----
nbp5 36
allothers 7
---- lustre-20120130 -----
nbp5 243
allothers 19
---- lustre-20120131 -----
nbp5 808
allothers 17
---- lustre-20120201 -----
nbp5 81
allothers 2
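
The listing above can be totalled with a short script. This is a minimal sketch, assuming the counts are available as text in exactly the format shown; the names RAW and parse_counts are hypothetical and are not part of any Lustre tooling.

```python
import re

# The per-day counts, copied verbatim from the listing above.
RAW = """\
---- lustre-20120125 -----
nbp5 1022
allothers 24
---- lustre-20120126 -----
nbp5 760
allothers 33
---- lustre-20120127 -----
nbp5 420
allothers 6
---- lustre-20120128 -----
nbp5 226
allothers 97
---- lustre-20120129 -----
nbp5 36
allothers 7
---- lustre-20120130 -----
nbp5 243
allothers 19
---- lustre-20120131 -----
nbp5 808
allothers 17
---- lustre-20120201 -----
nbp5 81
allothers 2
"""

# Header lines look like "---- lustre-YYYYMMDD -----".
HEADER = re.compile(r"^-+\s*lustre-(\d{8})\s*-+$")


def parse_counts(text):
    """Return {date: {label: count}} from the listing format above."""
    counts = {}
    day = None
    for line in text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            day = match.group(1)
            counts[day] = {}
        elif line and day is not None:
            label, value = line.split()
            counts[day][label] = int(value)
    return counts


counts = parse_counts(RAW)

# Sum each label over all days in the listing.
totals = {}
for per_day in counts.values():
    for label, value in per_day.items():
        totals[label] = totals.get(label, 0) + value

for day in sorted(counts):
    print(day, counts[day])
print("totals:", totals)
```

Over the eight days listed, this gives 3596 for nbp5 against 205 for allothers.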

Attached are a typical client log (r8610n14.lustre)
and server logs (service151.lustre.feb1).



 Comments   
Comment by Andreas Dilger [ 01/Feb/12 ]

This issue is already being tracked under LU-874, which has a number of patches scheduled to land for the 2.1.1 release.

Comment by Peter Jones [ 01/Feb/12 ]

Hi Hongchao

Could you please look into this one?

Thanks

Peter

Comment by Hongchao Zhang [ 06/Feb/12 ]

Hi Mahmoud

Have you tested it with the patch in LU-874? What was the result?
Thanks

Comment by Peter Jones [ 13/Apr/12 ]

Assumed to be a duplicate of LU-874. We will reopen if this proves not to be the case.
