[LU-8750] Wrong obd_timeout on the client when we have 2 or more lustre fs Created: 24/Oct/16 Updated: 18/Apr/23 Resolved: 29/Mar/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Antoine Percher | Assignee: | Hongchao Zhang |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Epic/Theme: | lnet |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
When two or more Lustre filesystems are mounted on a client, the client's obd_timeout is the maximum of all the servers' obd_timeout values. In some cases this can cause evictions by a server, because one of the servers does not wait long enough for the obd_ping request. In my case I have 2 Lustre filesystems, with servers running 2.5.X and some clients running 2.7. The first server has obd_timeout=100 and the second server has obd_timeout=300, so the obd_timeout inherited by the client is obd_timeout=300, and the client sends one obd_ping request every 75 seconds. If just one obd_ping request is lost, the client can be evicted. It would be better to have an obd_timeout per filesystem, or to use the minimum of the servers' obd_timeout values. |
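The arithmetic behind this report can be sketched as follows. This is a hypothetical model, not Lustre code: the rule that the ping interval is obd_timeout / 4 is an assumption inferred from the 300 → 75 second figures above, and the filesystem names are placeholders. It only compares how long a server hears nothing after one lost ping when the client uses the global maximum versus a per-filesystem value; it does not model the server's actual eviction logic.

```python
# Hypothetical model of the timing described in this report; not Lustre code.

def ping_interval(obd_timeout: int) -> int:
    """Assumed rule: one ping every obd_timeout / 4 seconds (300 -> 75)."""
    return max(obd_timeout // 4, 1)


def gap_after_lost_pings(obd_timeout: int, lost: int = 1) -> int:
    """Seconds a server hears nothing if `lost` consecutive pings are dropped."""
    return ping_interval(obd_timeout) * (lost + 1)


# Placeholder filesystem names with the server timeouts from the report.
server_timeouts = {"FS100": 100, "FS300": 300}

# Reported behaviour: the client uses the max over all mounted filesystems.
global_timeout = max(server_timeouts.values())

for fsname, server_timeout in server_timeouts.items():
    shared = gap_after_lost_pings(global_timeout)
    per_fs = gap_after_lost_pings(server_timeout)
    print(f"{fsname}: server timeout {server_timeout}s, silence after one "
          f"lost ping: {shared}s (global max) vs {per_fs}s (per-filesystem)")
```

Under these assumptions, one lost ping leaves the obd_timeout=100 server without contact for longer than its own timeout when the global maximum is used, which is the situation the report describes.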
| Comments |
| Comment by Andreas Dilger [ 25/Oct/16 ] |
|
I agree that this is a potential issue, and having a single global obd_timeout value does not fit configurations where, e.g., one filesystem is local and another is remote, and they should really have different timeout values. There are a few options that can be tried to resolve this problem without needing to wait for a patch and new release: There are also potential code fixes for this problem. In particular, we discussed adding a per-target ping_interval tunable in /proc, similar to max_rpcs_in_flight and max_pages_per_rpc, that would allow setting the ping interval for a single filesystem explicitly. |
| Comment by Joseph Gmitter (Inactive) [ 25/Oct/16 ] |
|
Hi Hongchao, Can you please look into the suggested code fixes that Andreas has highlighted in the last comment? Thanks. |
| Comment by Hongchao Zhang [ 28/Oct/16 ] |
|
test output:
After setting timeout of FS100 to 300 explicitly, the timeout will be changed to 300
lctl conf_param FS100.sys.timeout=300
2) mount client with timeout 100 first, mount client with timeout 300 second
After setting timeout of FS300 to 100 explicitly, the timeout will be changed to 100
lctl conf_param FS300.sys.timeout=100 |
| Comment by Andreas Dilger [ 12/Apr/18 ] |
|
To properly fix this problem, it would be good to store the ping_interval and obd_timeout on a per-import basis. That would allow a single client to mount two or more different filesystems with different server timeouts (which the client can't control). |
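A minimal sketch of what per-import storage could look like, written as a Python model rather than kernel code. The `Import` class, its field names, the target names, and the obd_timeout / 4 ping rule are all illustrative assumptions, not the actual Lustre structures.

```python
from dataclasses import dataclass

GLOBAL_OBD_TIMEOUT = 100  # hypothetical global default, in seconds


@dataclass
class Import:
    """Hypothetical stand-in for a client import (one connection to one target)."""
    target: str
    # Each import carries its own timeout instead of reading a global value.
    obd_timeout: int = GLOBAL_OBD_TIMEOUT

    @property
    def ping_interval(self) -> int:
        # Assumed rule: ping every obd_timeout / 4 seconds, at least 1 second.
        return max(self.obd_timeout // 4, 1)


# Two filesystems with different server timeouts on the same client.
imports = [
    Import("FS100-OST0000", obd_timeout=100),
    Import("FS300-OST0000", obd_timeout=300),
]

for imp in imports:
    print(f"{imp.target}: obd_timeout={imp.obd_timeout}s, "
          f"ping every {imp.ping_interval}s")
```

With the values kept per import, changing the timeout for one filesystem would leave the ping schedule of the other untouched.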
| Comment by Andreas Dilger [ 16/Aug/18 ] |
|
With the newer userspace-driven parameter parsing (an upcall via udev to lctl), it may be relatively easy to implement per-OBD timeouts. By default, new OBD devices would inherit the global timeout value when they are created (stored in each obd_device or obd_export separately, and always used from the local device instead of the global value). If there is a timeout parameter in the configuration logs (which would normally generate an "lctl set_param timeout=<value>" upcall), this would be replaced by "*.<fsname>-*.timeout" so that the upcall for that filesystem's configuration log only changes the devices for the named filesystem. |
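The parameter rewrite proposed in this comment can be illustrated with a small model. This is only a sketch of the idea: the per-device `timeout` parameter is the proposal itself rather than an existing tunable, the device names are made up, and glob matching via `fnmatch` stands in for however lctl would resolve the pattern.

```python
from fnmatch import fnmatchcase


def scoped_timeout_param(fsname: str, value: int) -> str:
    """Build the filesystem-scoped form "*.<fsname>-*.timeout=<value>"."""
    return f"*.{fsname}-*.timeout={value}"


def apply_param(devices: dict, param: str) -> None:
    """Set the timeout on every device whose parameter path matches."""
    pattern, value = param.rsplit("=", 1)
    for name in devices:
        if fnmatchcase(f"{name}.timeout", pattern):
            devices[name] = int(value)


# Hypothetical devices; every one starts with the inherited global timeout.
devices = {
    "osc.FS100-OST0000-osc": 300,
    "mdc.FS100-MDT0000-mdc": 300,
    "osc.FS300-OST0000-osc": 300,
}

param = scoped_timeout_param("FS100", 100)
apply_param(devices, param)

print(param)
for name, timeout in devices.items():
    print(f"{name}: timeout={timeout}")
```

In this model only the FS100 devices pick up the new value, while the FS300 device keeps the timeout it inherited.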
| Comment by Andreas Dilger [ 29/Mar/23 ] |
|
Closing this as a duplicate of LU-9912, I've copied CC's over already. |