[LU-17177] prevent DoM read-on-open with FLR Created: 10/Oct/23  Updated: 19/Oct/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Mikhail Pershin Assignee: Mikhail Pershin
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-16468 some miscellaneous IOs need to protec... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The DoM read-on-open feature fetches some data from the MDT stripe during file open. However, it does so whenever it sees data on the MDT, without regard to the actual mirror states. At best this causes useless reads from a non-active mirror; at worst it causes a stale data read.

Considering that, the following changes are proposed:

  • no read-on-open data prefetch for an FLR file
  • no DoM lock/size returned if the DoM component has the stale FLR flag

This avoids potential problems without losing significant benefits, since in most cases a mirrored file gets no real performance gain from read-on-open.
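
The two proposed rules can be sketched as a small decision function. This is only an illustration under stated assumptions: every type and name below (sketch_layout, sketch_component, SKETCH_COMP_STALE, sketch_open_policy) is a hypothetical stand-in and not the actual MDT code or Lustre layout structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real Lustre layout structures. */
enum sketch_comp_flags {
	SKETCH_COMP_STALE = 1 << 0,	/* component holds stale data */
};

struct sketch_component {
	bool		on_mdt;		/* true for the DoM component */
	unsigned int	flags;
};

struct sketch_layout {
	bool				is_flr;		/* file is mirrored (FLR) */
	size_t				comp_count;
	const struct sketch_component	*comps;
};

struct sketch_open_reply {
	bool	prefetch_data;		/* send read-on-open data */
	bool	return_dom_lock;	/* return DoM lock and size */
};

/* Find the DoM component, or NULL if the file has none. */
static const struct sketch_component *
sketch_find_dom(const struct sketch_layout *lo)
{
	for (size_t i = 0; i < lo->comp_count; i++)
		if (lo->comps[i].on_mdt)
			return &lo->comps[i];
	return NULL;
}

/* Decide what an open reply may carry under the two proposed rules. */
static struct sketch_open_reply
sketch_open_policy(const struct sketch_layout *lo)
{
	struct sketch_open_reply rep = { false, false };
	const struct sketch_component *dom = sketch_find_dom(lo);

	if (dom == NULL)
		return rep;	/* no DoM component: nothing to return */

	/* Rule 2: no DoM lock/size if the DoM component is stale. */
	rep.return_dom_lock = !(dom->flags & SKETCH_COMP_STALE);

	/* Rule 1: no read-on-open prefetch for any FLR file. */
	rep.prefetch_data = rep.return_dom_lock && !lo->is_flr;

	return rep;
}
```

In this sketch a non-FLR file with a DoM component still gets both the prefetch and the lock, an FLR file with a valid DoM component gets only the lock, and an FLR file with a stale DoM component gets neither.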



 Comments   
Comment by Gerrit Updater [ 10/Oct/23 ]

"Mikhail Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52613
Subject: LU-17177 mdt: prevent read-on-open for FLR file
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 350b8904fb3504697944db82e00380c26397caba

Comment by Mikhail Pershin [ 19/Oct/23 ]

During patch review, Andreas mentioned:

My preference is to avoid disabling this feature/optimization to handle a rare use case. I don't think it matters which mirror the pages are attached to, if the source is not stale.
Definitely the DoM data should not be returned if that component is marked stale, but otherwise it should be returned to the client and attached to any mirror.

That is also a possible way to go, but it has to avoid the LU-16468 issue, which I think happened due to a missing lov_io_mirror_init() call. So far I see two possibilities:

  • remove ci_ignore_layout in ll_dom_finish_open() for CIT_MISC IO, so that lov_io_mirror_init() is not skipped in lov_io_slice_init()
  • keep that flag but always call lov_io_mirror_init(), with or without the flag

Initially, ci_ignore_layout was used to prevent a layout_refresh() call while the caller is under a live layout lock, so removing it could expose a potential deadlock. From the code it doesn't look that way: it should match the existing lock and proceed. Still, removing the flag needs extra testing, probably with racer. On the other hand, calling lov_io_mirror_init() when the 'ignore_layout' flag is set may not be safe either, when MISC IO is started from OSC.
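
The difference between the two possibilities can be shown with a toy model. It assumes, as described above, that the slice-init step skips mirror initialization when the ignore-layout flag is set. The types and functions here (sketch_io, sketch_slice_init, sketch_dom_finish_open) are hypothetical stand-ins and do not reproduce the real lov_io_slice_init() or ll_dom_finish_open() code.

```c
#include <stdbool.h>

/* Hypothetical model of the IO state, NOT the real cl_io structure. */
struct sketch_io {
	bool ignore_layout;	/* models ci_ignore_layout */
	bool mirror_inited;	/* set once the mirror-init step has run */
};

enum sketch_option {
	SKETCH_CURRENT,		/* flag set, mirror init skipped */
	SKETCH_DROP_FLAG,	/* possibility 1: do not set the flag */
	SKETCH_ALWAYS_INIT,	/* possibility 2: keep flag, always init */
};

/* Models the slice-init step: mirror init is skipped under the flag
 * unless the caller asks for it unconditionally. */
static void sketch_slice_init(struct sketch_io *io, bool always_init)
{
	io->mirror_inited = false;
	if (always_init || !io->ignore_layout)
		io->mirror_inited = true;	/* stand-in for mirror init */
}

/* Models the MISC IO set up when read-on-open data is attached. */
static struct sketch_io sketch_dom_finish_open(enum sketch_option opt)
{
	struct sketch_io io = {
		.ignore_layout = (opt != SKETCH_DROP_FLAG),
	};

	sketch_slice_init(&io, opt == SKETCH_ALWAYS_INIT);
	return io;
}
```

Both possibilities end with mirror initialization done; they differ only in whether the ignore-layout flag stays set, which is where the layout-lock and OSC concerns above come in. The model says nothing about locking or deadlocks.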

Generated at Sat Feb 10 03:33:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.