async update cross-MDTs (LU-3534)

[LU-3540] recovery for cross-MDT operation Created: 29/Jun/13  Updated: 22/Dec/15  Resolved: 01/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.8.0

Type: Technical task Priority: Minor
Reporter: Di Wang Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 8904

 Description   

When an MDT restarts after a crash, it processes all of the records in its local update llog. It batches together all updates that share the same ur_master_index and ur_batchid, and sends each batch in an OUT_UPDATE RPC to each of the remote targets that took part in the operation. In the meantime, all other MDTs are notified; they also check their own local update llogs and send all related records to the failover MDT.
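
The batching step above can be sketched as follows. This is a minimal illustration in Python, not the Lustre implementation: the `UpdateRecord` class, its `target_index` and `payload` fields, and the `batch_updates` helper are hypothetical names introduced here; only ur_master_index and ur_batchid come from the description. Each resulting key stands for one OUT_UPDATE RPC to one remote target.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    ur_master_index: int  # index of the master MDT for the operation
    ur_batchid: int       # batch id shared by all updates of one operation
    target_index: int     # hypothetical: remote MDT this update is destined for
    payload: str          # hypothetical: opaque update body


def batch_updates(records):
    """Group update llog records so that each group maps to one RPC.

    Records are grouped per remote target and per
    (ur_master_index, ur_batchid) pair, preserving llog order.
    """
    batches = defaultdict(list)
    for rec in records:
        key = (rec.target_index, rec.ur_master_index, rec.ur_batchid)
        batches[key].append(rec.payload)
    return dict(batches)
```

For example, two records with the same master index and batch id that are destined for the same remote MDT end up in a single batch, while a record for a different target or a different batch id gets its own.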

An MDT that receives updates from another MDT checks whether the corresponding updates are already recorded in its local update llog.
If the update was already committed, the MDT replies with an arbitrary pb_transno < pb_last_committed.

If the updates do not exist in the update llog, the MDT compares the master transno in the update record with the transno in last_rcvd. If the transno in the update record is smaller than the one in last_rcvd, the master already sent the update to this MDT, the update has already been executed and committed, and the update log record has been deleted, so the MDT also returns an arbitrary smaller transno as above. If the transno in the update record is larger, the MDT replays the update with a new transno.
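
The receiver-side decision described above can be summarized in a small sketch. It is an illustration under stated assumptions, not Lustre code: the function name, its parameters, and the "skip"/"replay" labels are hypothetical. It also assumes that an update found in the local update llog is treated as already committed, that last_committed is positive, and that a transno equal to the last_rcvd value is replayed, a case the description does not cover.

```python
def handle_remote_update(in_local_llog, master_transno,
                         last_rcvd_transno, last_committed, next_transno):
    """Decide how a receiving MDT answers an update sent during recovery.

    Returns (action, reply_transno). All names are hypothetical.
    """
    # Update already recorded locally, or its master transno is older than
    # what last_rcvd holds: it was executed and committed earlier.
    if in_local_llog or master_transno < last_rcvd_transno:
        # Any transno below last_committed will do; pick the nearest one.
        return ("skip", last_committed - 1)
    # Otherwise replay the update under a newly assigned transno.
    # Assumption: the equal-transno case also lands here.
    return ("replay", next_transno)
```

In both "skip" cases the reply carries a transno below the last committed one, which tells the sender that nothing remains to be replayed for that update.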

In all cases, the MDT replies to the sender with the transno.
If the sender is the recovering MDT, which is the master for this operation, it builds in-memory operation state to track the remote updates; once all of the remote updates have committed, it can cancel the local update record.
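
One way to picture the in-memory operation state on the master is a set of remote targets whose commit is still outstanding. The sketch below is only an assumption about the shape of such state; the `OperationState` class and its members are hypothetical, and the `cancelled` flag merely stands in for cancelling the local update record.

```python
class OperationState:
    """Tracks which remote updates of one operation are still uncommitted."""

    def __init__(self, batchid, remote_targets):
        self.batchid = batchid
        self.pending = set(remote_targets)  # remote MDTs not yet committed
        self.cancelled = False              # local update record cancelled?

    def remote_committed(self, target_index):
        """Note a remote commit; return True once the record is cancelled."""
        self.pending.discard(target_index)
        if not self.pending and not self.cancelled:
            # Every remote update has committed, so the local update
            # record is no longer needed.
            self.cancelled = True
        return self.cancelled
```

With two remote targets, the first commit notification leaves the record in place and the second one cancels it.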
The client then sends its replay/resend requests to the failover MDT.

The master MDT checks whether the request exists in the update log, using the request xid.

If it does not exist, the MDT compares the request transno with its own transno and replays the request only if the request transno is larger than the last transno (lcd_last_transno) of this MDT.

If it does exist, the recovery between MDTs has already handled this case, so the MDT returns an arbitrary smaller transno and the client can then remove the request from its replay list.
If there are any failures during the above two steps, the lfsck daemon will be triggered to fix the filesystem.
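
The client replay check on the master MDT can be sketched in the same spirit. The function and its return labels are hypothetical; only the xid lookup and the comparison against lcd_last_transno are taken from the description, and the reply sent in the non-replay case is deliberately left out because the description does not specify it.

```python
def handle_client_replay(xid, req_transno, update_log_xids, lcd_last_transno):
    """Classify a client replay/resend request on the failover MDT.

    update_log_xids is a hypothetical set of request xids found in the
    update log.
    """
    if xid in update_log_xids:
        # Recovery between the MDTs already covered this request; the
        # client can drop it from its replay list.
        return "already_recovered"
    if req_transno > lcd_last_transno:
        # Newer than anything this MDT has recorded: replay it.
        return "replay"
    # Not newer than the last recorded transno: nothing to replay.
    return "skip"
```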

For more details, please refer to the HLD for DNE phase II.



 Comments   
Comment by Gerrit Updater [ 05/Jun/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/11737/
Subject: LU-3540 lod: update recovery thread
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 4f53536d002c13886210b672b657795baa067144

Comment by James A Simmons [ 01/Jul/15 ]

Is there work left?

Comment by Di Wang [ 01/Jul/15 ]

Oh, this part has been landed.

Comment by Di Wang [ 01/Jul/15 ]

patches landed to master
