
Wrong obd_timeout on the client when we have two or more Lustre filesystems

Details

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Minor
    • Affects Version/s: Lustre 2.7.0

    Description

      When we mount two or more Lustre filesystems on a client, the client's obd_timeout is the maximum of all the servers' obd_timeout values. In some cases this can lead to evictions, because a server with a shorter timeout does not wait long enough for the next obd_ping request.

      In my case I have two Lustre filesystems, with 2.5.x servers and 2.7 clients. The first server has obd_timeout=100 and the second has obd_timeout=300, so the obd_timeout inherited on the client is 300. The client then sends one obd_ping request every 75 seconds (obd_timeout / 4), so if just one obd_ping request is lost the client can be evicted by the first filesystem's servers. It would be better to have an obd_timeout per filesystem, or to use the minimum of the servers' timeouts.
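      To make the numbers concrete, here is a minimal shell sketch using the values from this report (the eviction margin in the comments illustrates the general mechanism, not the server's exact eviction logic):

          # The single global timeout the client inherited from its
          # configuration logs (there is no per-filesystem value today):
          lctl get_param timeout
          # timeout=300

          # The client pings each server every obd_timeout / 4 seconds:
          #   300 / 4 = 75s between obd_ping requests
          # If one ping is lost, the gap a server sees is ~150s, well past
          # the 100s obd_timeout on the first filesystem's servers, so
          # that filesystem can evict the client.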


          Activity


            adilger Andreas Dilger added a comment - Closing this as a duplicate of LU-9912; I've copied the CCs over already.

            adilger Andreas Dilger added a comment - With the newer userspace-driven parameter parsing (an upcall via udev to lctl), it may be relatively easy to implement per-OBD timeouts. By default, new OBD devices would inherit the global timeout value when they are created (stored in each obd_device or obd_export separately, and always read from the local device instead of the global value). If there is a timeout parameter in the configuration logs (which would normally generate an "lctl set_param timeout=<value>" upcall), it would be replaced by "*.<fsname>-*.timeout" so that the upcall for that filesystem's configuration log only changes the devices of the named filesystem.
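            As a sketch of the change this proposes (the wildcard syntax is real lctl usage, but the per-filesystem "*.<fsname>-*.timeout" parameter is the proposal from this comment, not an existing tunable):

                # Today: a "timeout" record in any filesystem's configuration
                # log updates the one global value for every mount:
                lctl set_param timeout=300

                # Proposed: the upcall generated for FS100's configuration log
                # would name only FS100's devices, leaving other filesystems
                # untouched:
                lctl set_param *.FS100-*.timeout=100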

            adilger Andreas Dilger added a comment - To properly fix this problem, it would be good to store the ping_interval and obd_timeout on a per-import basis. That would allow a single client to mount two or more different filesystems with different server timeouts (which the client can't control).
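            For reference, a short sketch of what can be inspected today (these commands exist in current Lustre; the point is that the timeout is still a single global value, not stored per import):

                # The one global value used for every import on this client:
                lctl get_param timeout

                # Per-import state already exists (connection state, RPC
                # statistics, ...), but it carries no per-import timeout or
                # ping interval yet:
                lctl get_param mdc.*.import
                lctl get_param osc.*.import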

            hongchao.zhang Hongchao Zhang added a comment - Test output:

            1) Mounting the timeout-300 filesystem first and the timeout-100 filesystem second: the resulting client timeout is 100. After explicitly setting the timeout of FS100 to 300 with "lctl conf_param FS100.sys.timeout=300", the client timeout changes to 300.

            2) Mounting the timeout-100 filesystem first and the timeout-300 filesystem second: the resulting client timeout is 300. After explicitly setting the timeout of FS300 to 100 with "lctl conf_param FS300.sys.timeout=100", the client timeout changes to 100.
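            A sketch of that test sequence (the MGS NIDs and mount points below are placeholders; "lctl conf_param" is run on the MGS):

                # 1) Mount the timeout-300 filesystem first, then the
                #    timeout-100 filesystem; the last configuration log
                #    processed wins:
                mount -t lustre mgs300@tcp:/FS300 /mnt/fs300
                mount -t lustre mgs100@tcp:/FS100 /mnt/fs100
                lctl get_param timeout        # timeout=100

                # Explicitly setting FS100's timeout (on the MGS) pushes the
                # new value to the client:
                lctl conf_param FS100.sys.timeout=300
                lctl get_param timeout        # timeout=300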

            jgmitter Joseph Gmitter (Inactive) added a comment - Hi Hongchao,

            Can you please look into the suggested code fixes that Andreas highlighted in the last comment?

            Thanks,
            Joe

            adilger Andreas Dilger added a comment - I agree that this is a potential issue: a single global obd_timeout value doesn't align with configurations where, for example, one filesystem is local and another is remote, and they should really have different timeout values.

            There are a few options that can be tried to resolve this problem without waiting for a patch and a new release:
            1) Try mounting the filesystems on a test client in the opposite order: the filesystem with the longer timeout (FS300) mounted first and the one with the shorter timeout (FS100) second, then check "lctl get_param timeout" to see whether the client uses the 100s timeout. If it does, this could be put into production immediately without any further changes, except in the rare case where one filesystem is mounted inside the other. If the client still has a timeout of 300s, then FS100 is likely using the default obd_timeout of 100s without explicitly setting a timeout at all, and something more needs to be done.
            2) As with #1, mount FS300 first and FS100 second, and also explicitly set the timeout parameter for FS100 (the shorter-timeout filesystem) via "lctl conf_param <fsname>.sys.timeout=100" to see whether this lets the client keep the shorter timeout.
            3) Set the timeout for FS100 to 300s to match FS300, so that the servers will wait up to 300s for the pings to arrive. However, this also increases the recovery time for FS100, which may not be desirable for some configurations.

            There are also potential code fixes for this problem; in particular, we discussed adding a per-target ping_interval tunable in /proc, similar to max_rpcs_in_flight and max_pages_per_rpc, that would allow setting the ping interval for a single filesystem explicitly. A sketch of how that might look follows this comment.
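            A hypothetical sketch of the proposed per-target tunable (the "ping_interval" parameter below does not exist yet; max_rpcs_in_flight and max_pages_per_rpc are the existing per-target tunables it would be modeled on):

                # Hypothetical: ping FS100's targets every 25s (100 / 4), even
                # though the client's global obd_timeout is 300:
                lctl set_param mdc.FS100-MDT*.ping_interval=25
                lctl set_param osc.FS100-OST*.ping_interval=25

                # Existing per-target tunables that the proposal mirrors:
                lctl get_param osc.*.max_rpcs_in_flight
                lctl get_param osc.*.max_pages_per_rpc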

            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: apercher Antoine Percher