[LU-14512] prohibit extend file with stale mirror Created: 11/Mar/21 Updated: 13/Mar/21 Resolved: 13/Mar/21 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Zhenyu Xu | Assignee: | Zhenyu Xu |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Extending a file that has a stale mirror should be prohibited. Otherwise, an FLR file could be created that contains multiple up-to-date mirrors whose contents do not match. |
| Comments |
| Comment by Alex Zhuravlev [ 11/Mar/21 ] |
|
How can this be? My understanding is that extending by itself can't introduce any inconsistency, as it copies data from a replica known to be correct. |
| Comment by Gerrit Updater [ 11/Mar/21 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: https://review.whamcloud.com/42008 |
| Comment by Zhenyu Xu [ 11/Mar/21 ] |
|
IIRC, mirror extend does not do a resync; it only merges a layout and brings its data along, or creates a new mirror with empty data. |
| Comment by Alex Zhuravlev [ 11/Mar/21 ] |
Well, no, lfs mirror extend -N creates a mirror and (re)syncs it. |
| Comment by Zhenyu Xu [ 11/Mar/21 ] |
|
You are right. There's another issue: if the file contains an up-to-date mirror1 and a stale mirror2 (in write-pending state) and is extended with a mirror3 whose data is migrated from mirror1, lod_declare_update_write_pending() would fail the assertion that an FLR file in write shouldn't contain multiple primary mirrors. |
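The failure described above can be illustrated with a toy model. This is a sketch only and not Lustre code: the `Mirror` class and the `check_write_pending()` helper are hypothetical names, and the check merely restates the invariant from the comment, assuming "in write" refers to the write-pending state and "primary" means a mirror that is not stale.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    """Hypothetical stand-in for one FLR mirror; not a Lustre structure."""
    mirror_id: int
    stale: bool


def count_primaries(mirrors):
    """Count mirrors that are not marked stale."""
    return sum(1 for m in mirrors if not m.stale)


def check_write_pending(mirrors):
    """Restate the invariant from the comment: a file in write-pending
    state should not contain more than one primary (non-stale) mirror."""
    return count_primaries(mirrors) <= 1


# mirror1 is up to date and mirror2 is stale: the invariant holds.
before = [Mirror(1, stale=False), Mirror(2, stale=True)]

# Extending adds mirror3 with data copied from mirror1, so it is not stale.
after = before + [Mirror(3, stale=False)]

print(check_write_pending(before))  # True
print(check_write_pending(after))   # False
```

In this model the extend itself is harmless; the problem is only that the file now has two non-stale mirrors while it is still flagged as write-pending.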
| Comment by Alex Zhuravlev [ 11/Mar/21 ] |
|
Probably https://review.whamcloud.com/#/c/42003/4 can help with that? |
| Comment by Zhenyu Xu [ 11/Mar/21 ] |
|
Yes, that can help. Is it a good idea, in general, to prohibit extending an un-synced FLR file with a new mirror? |
| Comment by Alex Zhuravlev [ 11/Mar/21 ] |
|
I'm not sure. Perhaps someone else can comment on this rather practical question? |
| Comment by Andreas Dilger [ 13/Mar/21 ] |
No, I don't think it is a good idea to impose that limitation. Consider a file with mirror1 (primary) and mirror2 (stale). If mirror2 is on a failed OST, we may want to create a whole new mirror3 to restore redundancy, and then delete mirror2. It should always be possible to create a new mirror as long as there is a non-stale mirror. |
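The rule proposed in this comment can be written down as a small policy check. The following is a minimal sketch under stated assumptions, not the Lustre implementation: mirrors are modelled as a plain dict from mirror id to a "stale" flag, and `can_extend()` and `restore_redundancy()` are hypothetical helper names.

```python
def can_extend(stale_flags):
    """Allow an extend whenever at least one mirror is not stale.

    stale_flags maps a mirror id to True if that mirror is stale
    (an illustrative representation, not a Lustre data structure).
    """
    return any(not stale for stale in stale_flags.values())


def restore_redundancy(stale_flags, new_id):
    """Add a new mirror first, then drop the stale mirrors."""
    if not can_extend(stale_flags):
        raise ValueError("no non-stale mirror to copy from")
    result = dict(stale_flags)
    result[new_id] = False  # the new mirror is synced from a good mirror
    # Only after the new mirror exists are the stale mirrors removed.
    return {mid: stale for mid, stale in result.items() if not stale}


# mirror1 is the primary, mirror2 is stale (for example, on a failed OST).
print(restore_redundancy({1: False, 2: True}, new_id=3))  # {1: False, 3: False}

# With every mirror stale there is no trustworthy source to copy from.
print(can_extend({1: True, 2: True}))  # False
```

The check depends only on whether a non-stale source exists, so the presence of a stale mirror never blocks the extend.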
| Comment by Andreas Dilger [ 13/Mar/21 ] |
|
In the FLR case, the stale mirror may be an older version of the file (LU-12648) that is still useful, or may still have good data at the start of the file (the common case of an appending write) that could be manually recovered. Note that it is better to add the new mirror to the file before removing the stale mirror, in case there are also some non-overlapping errors in the remaining (non-stale) mirror. This often happens in RAID arrays: a drive partly fails, and then during resync there are also sector errors on the "good" drive. If the bad drive is still partly available, the few bad sectors can still be recovered. |
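The argument for keeping the stale mirror until the new one exists can be made concrete with a toy recovery routine. This is only an illustration under assumptions that are not in the ticket: a mirror is a list of blocks, `None` marks an unreadable block, `stale_valid_blocks` is the number of leading blocks of the stale mirror still known to match the current file (the appending-write case), and `recover()` is a hypothetical helper.

```python
def recover(primary, stale, stale_valid_blocks):
    """Rebuild a file from the primary mirror, falling back to the stale one.

    A block is taken from the stale mirror only if the primary copy is
    unreadable and the block lies in the prefix that is still known good.
    Returns the rebuilt block list and the indexes that could not be recovered.
    """
    recovered = []
    lost = []
    for i, block in enumerate(primary):
        if block is not None:
            recovered.append(block)
        elif i < stale_valid_blocks and i < len(stale) and stale[i] is not None:
            recovered.append(stale[i])
        else:
            recovered.append(None)
            lost.append(i)
    return recovered, lost


# The primary mirror has one unreadable block (index 1).
primary = [b"A", None, b"C", b"D"]
# The stale mirror predates the append of block 3 and has its own bad block.
stale = [b"A", b"B", None]

# The errors do not overlap, so the stale mirror fills the gap.
print(recover(primary, stale, stale_valid_blocks=3))  # ([b'A', b'B', b'C', b'D'], [])

# If the stale mirror had already been removed, block 1 would be lost.
print(recover(primary, [], stale_valid_blocks=0))  # ([b'A', None, b'C', b'D'], [1])
```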