[LU-15414] FLDB mirroring Created: 06/Jan/22  Updated: 12/Mar/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: LMR

Issue Links:
Related
is related to LU-4898 LFSCK 5: Need mechanism to recover FL... Open
is related to LU-15437 setparam to modify fldb Open
Rank (Obsolete): 9223372036854775807
Epic Link: LMR: Lustre Metadata Redundancy

 Description   

For reliability, it would be desirable to replicate the FLDB file across multiple MDTs, in case the FLDB file on MDT0000 is lost or corrupted. Since the FLDB itself changes very rarely (only when new MDTs or OSTs are added to the filesystem, or when 4B SEQ numbers have been allocated by one target), there should not be any noticeable performance overhead from having multiple mirrors.

Since the FLDB will almost always be in sync across MDTs, clients and servers could contact any MDT that holds an FLDB replica, and query the MDT0000 FLDB copy only if the requested SEQ number cannot be found on that MDT.
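
The replica-first lookup described above could be sketched roughly as follows. This is only an illustration, not Lustre code: the function name lookup_seq, the (start, end, target) tuple layout, and the sample SEQ ranges are all assumptions made for the example.

```python
def lookup_seq(seq, replica_fldb, primary_fldb):
    """Return the target index that owns `seq`, or None if it is unknown.

    Each FLDB is modelled (hypothetically) as a list of
    (start, end, target_index) tuples covering the half-open SEQ range
    [start, end). The replica is consulted first; the MDT0000 copy is
    only consulted when the replica has no matching range.
    """
    for fldb in (replica_fldb, primary_fldb):
        for start, end, target in fldb:
            if start <= seq < end:
                return target
    return None


# Hypothetical data: the replica has not yet seen the newest range.
primary = [(0x1000, 0x2000, 0), (0x2000, 0x3000, 1), (0x3000, 0x4000, 2)]
replica = primary[:2]

print(lookup_seq(0x2800, replica, primary))  # answered by the replica
print(lookup_seq(0x3800, replica, primary))  # falls back to the MDT0000 copy
```

Because a stale replica can only be missing recently added ranges, a miss on the replica is the only case that needs a round trip to MDT0000 in this model.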

When there are many MDTs, it may be impractical to have an FLDB copy on every MDT in the filesystem, so it makes sense to (deterministically) have FLDB copies only on a subset of MDTs, such as having backups on MDT0001, MDT0003, MDT0005, MDT0007, MDT0009, and in general MDTxxxx where x = 3^n, 5^n, 7^n (like ext4 superblock copies). This would provide one backup for 2 MDTs, 2 backups for 4 MDTs, 4 for 8 MDTs, ..., up to 22 backups (23 copies including MDT0000) with 65536 MDTs. One drawback of this mechanism is that the replicas may not be available if the MDTs are not densely numbered, but that is a very uncommon configuration, and almost certainly MDT0000 and MDT0001 should be available.
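
As a rough illustration of the placement rule above, the following sketch enumerates which MDT indices would hold a backup for a given MDT count. The function name fldb_backup_indices is hypothetical, and the rule it encodes (index 1 plus every power of 3, 5, and 7 that is a valid index, assuming densely numbered MDTs) is taken from the description rather than from any existing implementation.

```python
def fldb_backup_indices(mdt_count):
    """Return the sorted MDT indices that would hold an FLDB backup.

    Assumed rule: index 1, plus every power of 3, 5, and 7 that is
    below mdt_count. MDT0000 holds the primary copy and is not listed.
    MDTs are assumed to be densely numbered from 0 to mdt_count - 1.
    """
    backups = set()
    if mdt_count > 1:
        backups.add(1)
    for base in (3, 5, 7):
        idx = base
        while idx < mdt_count:
            backups.add(idx)
            idx *= base
    return sorted(backups)


for count in (2, 4, 16):
    names = [f"MDT{i:04x}" for i in fldb_backup_indices(count)]
    print(count, names)
```

Since the set of backup indices depends only on the MDT count, any client or server could compute it locally without extra configuration.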



 Comments   
Comment by Andreas Dilger [ 21/Jan/22 ]

It may be enough for each MDT and OST to always fetch a full copy of the FLDB from MDT0000 and store it locally. That would make it trivial to manually copy the "fld" file to a failed MDT0000, or (better) to recover at mount time by doing FLDB lookups on another server.
