[LU-15009] precreate should cleanup orphans upon error Created: 16/Sep/21  Updated: 27/Oct/22  Resolved: 23/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.9, Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Lai Siyao Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-5648 corrupt files contain extra data Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Sometimes an OST object precreate succeeds, but the reply is lost or times out. In this case the MDT doesn't update its known last_id and simply retries. The subsequent precreate then fails because the last_id known on the MDT is less than that on the OST. Instead, upon each failure, the MDT should restart as if reconnecting: clean up orphans first, and then continue with subsequent precreates.
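
The failure sequence above can be sketched with a toy model. This is only an illustration under stated assumptions: the `toy_*` structures and functions are hypothetical and are not Lustre code or Lustre APIs, and orphan cleanup is simplified to rolling the OST's id back to the value the MDT knows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy state, not real Lustre structures. */
struct toy_ost { uint64_t last_id; };       /* highest object id created on the OST */
struct toy_mdt { uint64_t known_last_id; }; /* highest id the MDT believes exists */

/* OST side: create 'count' objects after the id the MDT sent.
 * Fails if the MDT's view is behind the OST's (the case in this ticket). */
static int toy_ost_precreate(struct toy_ost *ost, uint64_t mdt_last_id,
                             uint64_t count)
{
    assert(ost != NULL);
    if (mdt_last_id < ost->last_id)
        return -1;
    ost->last_id = mdt_last_id + count;
    return 0;
}

/* MDT side: one precreate attempt. 'reply_lost' simulates a reply that is
 * lost or times out, so the MDT sees an error and keeps its old last_id. */
static int toy_mdt_precreate(struct toy_mdt *mdt, struct toy_ost *ost,
                             uint64_t count, bool reply_lost)
{
    int rc = toy_ost_precreate(ost, mdt->known_last_id, count);

    if (rc != 0 || reply_lost)
        return -1;
    mdt->known_last_id = ost->last_id;
    return 0;
}

/* Simplified orphan cleanup: objects above the MDT's known id are dropped. */
static void toy_orphan_cleanup(const struct toy_mdt *mdt, struct toy_ost *ost)
{
    ost->last_id = mdt->known_last_id;
}

/* Handling proposed in the description: on failure, resync as on a
 * reconnect (clean up orphans), then retry the precreate once. */
static int toy_precreate_with_cleanup(struct toy_mdt *mdt, struct toy_ost *ost,
                                      uint64_t count, bool reply_lost)
{
    if (toy_mdt_precreate(mdt, ost, count, reply_lost) == 0)
        return 0;
    toy_orphan_cleanup(mdt, ost);
    return toy_mdt_precreate(mdt, ost, count, false);
}
```

In this model, a plain retry after a lost reply keeps failing because the two ids have diverged, while the cleanup-then-retry path brings them back in step.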



 Comments   
Comment by Andreas Dilger [ 16/Sep/21 ]

In this case, the MDT should handle this by skipping OST objects below the LAST_ID value, rather than deleting OST objects back to the MDT last_id value. That would leave these objects available for recovery via LFSCK, if the problem was on the MDT (e.g. restored from backup), rather than deleting them.
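
A minimal sketch of the skip-ahead alternative suggested in this comment, again as a toy model: `toy_ids` and `toy_skip_to_ost_last_id` are hypothetical names, not Lustre code. Rather than deleting objects back to the MDT's value, the MDT's id is advanced to the OST's LAST_ID and the objects in between are left in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of ids, not real Lustre structures. */
struct toy_ids {
    uint64_t mdt_last_id; /* last id known on the MDT */
    uint64_t ost_last_id; /* LAST_ID stored on the OST */
};

/* If the MDT is behind, advance it to the OST's LAST_ID without deleting
 * anything. Returns how many object ids were skipped (0 if none). */
static uint64_t toy_skip_to_ost_last_id(struct toy_ids *ids)
{
    uint64_t skipped;

    assert(ids != NULL);
    if (ids->mdt_last_id >= ids->ost_last_id)
        return 0;
    skipped = ids->ost_last_id - ids->mdt_last_id;
    ids->mdt_last_id = ids->ost_last_id;
    return skipped;
}
```

The skipped objects stay on the OST in this model, which matches the stated goal of keeping them available for later recovery instead of destroying them.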

Comment by Lai Siyao [ 16/Sep/21 ]

The description has been updated. The issue is that the objects are created on the OST but the MDT gets an error. IMO this is no different from a reconnect and can be handled the same way.

Comment by Gerrit Updater [ 18/Sep/21 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44984
Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bb30ada9d8f086bc3c2f97450df0fbb68d35f746

Comment by Gerrit Updater [ 23/Dec/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44984/
Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1711e26ae861c28829870c2433caf7ee232909cf

Comment by Peter Jones [ 23/Dec/21 ]

Landed for 2.15

Comment by Gerrit Updater [ 23/Dec/21 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45930
Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: ac5034013c97f0e458252b7a337d3f184bd065ba

Comment by Gerrit Updater [ 30/Jan/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45930/
Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 62d4e1d5f1be33abd3ee9f58cc09d471e22ee7ad

Generated at Sat Feb 10 03:14:40 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.