[LU-10191] FLR2: Server Local Client (SLC) Created: 02/Nov/17 Updated: 20/Oct/20 |
|
| Status: | Reopened |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | FLR2 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
In order to mount a client locally on the OSS or MDS without affecting the recovery of local targets, we need the ability to mount without inserting the client into the last_rcvd file. That avoids the problem where a combined client+server node crashes and the local client UUID is no longer available for recovery, causing recovery to always take the maximum time.

Any modifying RPCs to the local OST should be synchronous by default, or possibly use commit-on-share, so that they do not need to be replayed if the server restarts. This implies that it is more desirable to schedule lfs mirror resync in such a way that it reads from the local OSS and writes to a remote OSS.

It might be desirable to allow this functionality to be disabled for testing purposes (e.g. local client mount in test scripts), or if local performance is more important than waiting for recovery to time out.

It should be possible to enable this mode automatically at mount time based on the client NID, rather than having e.g. a mount option force a "local mount", since it would only apply to targets that are on the same OSS/MDS and not to remote targets.

A further optimization would be to avoid read caching data in the llite layer. The OSS would also cache the same data, so this avoids double caching, and the OSS cache has the advantage that it could be shared with other clients.

As a final stage, having a local llite<->obdfilter IO path that avoids data copies and LNet would potentially speed up IO performance and reduce local IO CPU usage significantly. It might be possible to implement this initially only for bulk IO, since that would typically have the highest memory copy overhead, and leave the locking/metadata to use the normal RPC paths, so that they are treated consistently (possibly avoiding hard-to-find bugs). |
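The NID-based selection described above could be sketched roughly as follows. This is only an illustration in Python, not Lustre code: the `Target` class, `is_local_target()`, `connection_policy()`, the policy keys, and the example NIDs and target names are all hypothetical. It assumes a target counts as "local" when the client shares at least one NID with the server exporting that target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of an OST/MDT and the NIDs of its server."""
    name: str
    server_nids: frozenset


def is_local_target(client_nids, target):
    """Return True when the client shares at least one NID with the server."""
    return bool(set(client_nids) & target.server_nids)


def connection_policy(client_nids, target):
    """Choose per-target behavior for a hypothetical server-local client mode."""
    local = is_local_target(client_nids, target)
    return {
        "target": target.name,
        "local": local,
        # Local connections are kept out of last_rcvd, so a combined
        # client+server crash does not make recovery wait for this client.
        "record_in_last_rcvd": not local,
        # Local modifying RPCs are committed synchronously, so there is
        # nothing to replay if the server restarts.
        "sync_modifying_rpcs": local,
    }


# Example values are made up for illustration.
client_nids = ["10.0.0.5@tcp"]
targets = [
    Target("lustre-OST0000", frozenset({"10.0.0.5@tcp"})),
    Target("lustre-OST0001", frozenset({"10.0.0.6@tcp"})),
]
policies = [connection_policy(client_nids, t) for t in targets]
```

Because the decision is made per target, a single mount can treat the OST on the same node as local while connections to remote OSSes keep the normal recovery behavior, which is the reason given above for not using a mount-wide option.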
| Comments |
| Comment by Andreas Dilger [ 31/May/18 ] |
|
Note that we can use the llite.*.client_type file to indicate that this is a local_server or similar. For better or worse, the current content is local client (and it used to contain remote client for ancient LL_SBI_RMT_CLIENT mounts, before patch v2_8_54_0-73-g9d06de3 was landed). There are sanity.sh test_125 and test_126 that check for local client mounts, but those checks could potentially just be removed. |
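As a rough sketch of how a script might consume this parameter, the Python below parses text in the `name=value` form printed by `lctl get_param llite.*.client_type`. The helper names `parse_client_types()` and `is_server_local()` are hypothetical, the sample line is made up, and the `local_server` value is only the one proposed in this comment, not something the current code reports.

```python
def parse_client_types(get_param_output):
    """Map each llite client_type parameter name to its value."""
    result = {}
    for line in get_param_output.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.endswith(".client_type"):
            result[name] = value.strip()
    return result


def is_server_local(client_type):
    """Assumption: a server-local mount would report 'local_server'."""
    return client_type == "local_server"


# Illustrative sample only; the instance name is invented.
sample = "llite.lustre-ffff8800b8b3a800.client_type=local client\n"
types = parse_client_types(sample)
flags = {name: is_server_local(value) for name, value in types.items()}
```

With the current content of the file, every mount reports `local client`, so `is_server_local()` returns False for all of them until a new value is actually introduced.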
| Comment by Patrick Farrell (Inactive) [ 04/Sep/19 ] |
|
Local client exclusion from recovery is being done by bzzz under |
| Comment by Alex Zhuravlev [ 16/Apr/20 ] |
|
implemented in |
| Comment by Andreas Dilger [ 16/Apr/20 ] |
|
I don't think this is fixed by This ticket is more about having a direct transfer of data from the local client mount to the local storage (probably OSC->OFD?) rather than doing memcpy() of the bulk data in the 0@lo interface. |