[LU-19] imperative recovery Created: 22/Nov/10 Updated: 06/Jul/20 Resolved: 04/Jun/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.0.0, Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.2.0 |
| Type: | New Feature | Priority: | Blocker |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Issue Links: |
|
| Sub-Tasks: |
|
| Bugzilla ID: | 18,767 |
| Rank (Obsolete): | 10457 |
| Description |
|
Imperative recovery means the clients are notified explicitly when and where a failed target has restarted, so they can reconnect immediately rather than waiting for their requests to time out. |
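As a rough illustration of the idea (hypothetical names only, not Lustre code), the notification just has to identify the restarted target and the NID it currently serves from, and the client reacts by reconnecting right away:

```c
/*
 * Illustrative sketch only -- all names are made up, this is not Lustre code.
 * It shows the information an imperative-recovery notification has to carry:
 * which target restarted and the NID it is now reachable at, so the client
 * can reconnect at once instead of waiting for an RPC timeout.
 */
#include <stdio.h>
#include <string.h>

struct ir_notify {
        char target[64];   /* e.g. "lustre-OST0001" */
        char nid[64];      /* NID the restarted target now serves from */
};

/* Client-side handler: switch to the announced NID and reconnect now. */
static void client_handle_ir_notify(const struct ir_notify *n)
{
        printf("IR notify: %s restarted at %s, reconnecting now\n",
               n->target, n->nid);
        /* a real client would update its import's connection list here */
}

int main(void)
{
        struct ir_notify n;

        strcpy(n.target, "lustre-OST0001");
        strcpy(n.nid, "192.168.1.12@tcp");
        client_handle_ir_notify(&n);
        return 0;
}
```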
| Comments |
| Comment by Robert Read (Inactive) [ 22/Nov/10 ] |
|
That's probably the original bug's description (which I wrote), but a fully automated version of this does not require the health network - that's just there to help with scaling. The main thing needed is a mechanism for a restarted target to request another target (such as the MGT) to notify all connected clients to reconnect to the restarted target using its current NID. The client will obviously need to support this notification. |
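A minimal sketch of that relay, with invented identifiers rather than the Lustre API: the restarted target checks in with the notifier (e.g. the MGT), which then walks its list of connected clients and tells each one where the target is now reachable:

```c
/*
 * Sketch only, hypothetical names throughout. The notifier (MGT) side of
 * the relay: a restarted target registers, and the notifier broadcasts a
 * reconnect hint to every connected client.
 */
#include <stdio.h>

#define MAX_CLIENTS 4

struct client_export {
        const char *uuid;
};

static struct client_export clients[MAX_CLIENTS] = {
        { "client-a" }, { "client-b" }, { "client-c" }, { "client-d" },
};

/* Called on the notifier when a restarted target checks in with its NID. */
static void mgt_target_restarted(const char *target, const char *nid)
{
        int i;

        for (i = 0; i < MAX_CLIENTS; i++)
                printf("notify %s: %s is back at %s\n",
                       clients[i].uuid, target, nid);
}

int main(void)
{
        mgt_target_restarted("lustre-OST0001", "192.168.1.12@tcp");
        return 0;
}
```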
| Comment by Jinshan Xiong (Inactive) [ 22/Nov/10 ] |
|
In the first phase, only the OST restart problem will be addressed - I assume the MGS is always alive, so the config lock can be used to notify clients whenever an OST target fails. This suggests always separating the MGS and MDS so that the MDT case can be addressed as well. The basic idea would be as follows: wire protocol changes: |
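A minimal sketch of the config-lock idea, using invented names rather than actual MGC code: when the MGS revokes the config lock after an OST restarts, the client's blocking callback re-fetches the configuration and reconnects to any target whose NID has changed:

```c
/*
 * Illustrative only, not Lustre code. Models the client side of the
 * config-lock scheme: compare the cached configuration against the one
 * re-read after the lock was revoked, and reconnect where the NID moved.
 */
#include <stdio.h>
#include <string.h>

struct target_conf {
        char name[64];
        char nid[64];
};

/* Invoked from the (hypothetical) config-lock blocking callback. */
static void config_lock_revoked(const struct target_conf *cached,
                                const struct target_conf *fresh)
{
        if (strcmp(cached->nid, fresh->nid) != 0)
                printf("%s moved from %s to %s, reconnecting\n",
                       fresh->name, cached->nid, fresh->nid);
}

int main(void)
{
        struct target_conf old = { "lustre-OST0001", "192.168.1.11@tcp" };
        struct target_conf cur = { "lustre-OST0001", "192.168.1.12@tcp" };

        config_lock_revoked(&old, &cur);
        return 0;
}
```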
| Comment by Robert Read (Inactive) [ 23/Nov/10 ] |
|
The design needs to support notifications for both OSTs and MDTs from the beginning, so we most likely cannot rely on the config lock or modify the MGS protocol. |
| Comment by Jinshan Xiong (Inactive) [ 23/Nov/10 ] |
|
Yes, this scheme addresses the problem of OST failure. If the MGS, which is set up on the same server as the MDS, fails, imperative recovery doesn't help, since the MGS itself will take a long time to be reconnected to by all clients. I'm thinking it over again and will come up with a better solution. |
| Comment by Jinshan Xiong (Inactive) [ 29/Nov/10 ] |
|
I thought this over a bit during the holiday. Since we can't rely on the MGS to notify the clients, we may have to build a "healthy server ring" to spread the news that a target has restarted, and then use the osc-ost and mdc-mdt connections to notify clients. Note that we can always rely on the MGS to detect restarting targets: even if the MGS is itself being restarted, the restarting target has to wait until the MGS revives. However, this means a target-to-target (or server-to-server) connection has to be introduced, and I'm not in favor of doing that. Let's start over. IMHO imperative recovery is a best-effort service: its purpose is to shorten the recovery window, and it's acceptable to fall back to `normal' recovery sometimes. By separating the MGS and MDS and using the MGS to notify clients, we can address the problem in a very simple way, and it works most of the time, since the number of OSTs in a cluster is much greater than the number of MGSs and MDSs, which means the chance of an OST failing is correspondingly greater. WRT implementation, whether we change the attributes of the config lock or introduce a new lock is not very important. What do you think, Robert? |
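To illustrate the best-effort semantics (purely a sketch with invented names): the client gives the imperative notification a short window and otherwise falls back to ordinary timeout-driven recovery, so IR never becomes a correctness requirement:

```c
/*
 * Hypothetical sketch, not Lustre code. If no imperative-recovery
 * notification arrives within a short window (MGS down, message lost),
 * the client simply uses normal recovery; IR only shortens the window.
 */
#include <stdio.h>
#include <stdbool.h>

static bool wait_for_ir_notify(int window_secs)
{
        /* placeholder: pretend no notification arrived within the window */
        (void)window_secs;
        return false;
}

static void recover_target(const char *target)
{
        if (wait_for_ir_notify(5))
                printf("%s: reconnect immediately via IR notification\n", target);
        else
                printf("%s: no IR notification, using normal recovery\n", target);
}

int main(void)
{
        recover_target("lustre-OST0001");
        return 0;
}
```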
| Comment by Andreas Dilger [ 20/Jan/12 ] |
|
Found during IR testing |
| Comment by Jinshan Xiong (Inactive) [ 23/Jan/12 ] |
|
Summary of scalability tests: 1. we have 125 client nodes and mount 500 mountpoints on each node to simulate the case of having ~60k clients (125 × 500 = 62,500 mounts); |
| Comment by James A Simmons [ 23/Jan/12 ] |
|
Should ORNL post its future IR testing results here? |
| Comment by Ian Colle (Inactive) [ 23/Jan/12 ] |
|
James - yes - that's a great idea. |
| Comment by James A Simmons [ 31/May/12 ] |
|
I think we can close this ticket now. |