Details
- Type: Bug
- Resolution: Fixed
- Priority: Blocker
- Affects Versions: Lustre 2.14.0, Lustre 2.12.5
Description
Two groups of data corruption cases are possible with Lustre, and both come down to a lock cancel without an osc object assigned to the lock.
This is possible both for the DoM and for the Lock Ahead cases.
The Lock Ahead bug has a partial fix - LU-11670/LUS-6747.
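To make this failure mode concrete, here is a minimal userspace sketch (plain C, not Lustre code; the names sim_lock, sim_object, sim_lock_cancel and sim_fast_read are invented for illustration). It assumes, as described above, that the cancel path only discards cached pages when the lock carries an object in l_ast_data, while the fast-read path trusts the Uptodate flag alone.

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

/* Invented stand-ins for the client-side objects discussed above. */
struct sim_page {
        bool uptodate;          /* page cache "Uptodate" flag */
        char data[16];          /* cached contents            */
};

struct sim_object {
        struct sim_page page;   /* the object's cached page   */
};

struct sim_lock {
        struct sim_object *l_ast_data;  /* object attached to the lock, may be NULL */
};

/* Cancel path: pages are only discarded when the lock knows its object. */
static void sim_lock_cancel(struct sim_lock *lock)
{
        if (lock->l_ast_data == NULL) {
                /* This is the bug: nothing is flushed, the page stays Uptodate. */
                printf("cancel: no l_ast_data, page discard skipped\n");
                return;
        }
        lock->l_ast_data->page.uptodate = false;
        memset(lock->l_ast_data->page.data, 0,
               sizeof(lock->l_ast_data->page.data));
        printf("cancel: pages discarded\n");
}

/* Fast-read path: trusts the Uptodate flag, performs no lock checks. */
static const char *sim_fast_read(struct sim_object *obj)
{
        return obj->page.uptodate ? obj->page.data : NULL;
}

int main(void)
{
        struct sim_object obj = {
                .page = { .uptodate = true, .data = "old contents" }
        };
        struct sim_lock lock = { .l_ast_data = NULL }; /* object never attached */

        /* The lock is cancelled, but the cached page survives... */
        sim_lock_cancel(&lock);

        /* ...so the reader keeps seeing stale data. */
        const char *d = sim_fast_read(&obj);
        printf("fast read returns: %s\n", d ? d : "(cache miss, real IO needed)");
        return 0;
}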
1) The first bug concerns the situation where the check_and_discard function can find a lock without l_ast_data assigned; this prevents discarding the pages from the page cache, so they are left as is.
The next lock cancel will find this lock and skip the page discard because no osc object is assigned. The pages can then be read from the page cache by ll_do_fast_read, which relies on the page flags and returns data from the page cache.
For the Lock Ahead case there are no logs or other confirmation, but it looks possible.
For the DoM case this is confirmed.
Second lock cancel trace:
ldlm_bl_13-35551 [034] 164201.591130: funcgraph_entry: | ll_dom_lock_cancel() {
ldlm_bl_13-35551 [034] 164201.591132: funcgraph_entry: | cl_env_get() {
ldlm_bl_13-35551 [034] 164201.591132: funcgraph_entry: 0.054 us | _raw_read_lock();
ldlm_bl_13-35551 [034] 164201.591132: funcgraph_entry: 0.039 us | lu_env_refill();
ldlm_bl_13-35551 [034] 164201.591133: funcgraph_entry: 0.046 us | cl_env_init0();
ldlm_bl_13-35551 [034] 164201.591133: funcgraph_entry: 0.035 us | lu_context_enter();
ldlm_bl_13-35551 [034] 164201.591133: funcgraph_entry: 0.034 us | lu_context_enter();
ldlm_bl_13-35551 [034] 164201.591134: funcgraph_exit: 1.811 us | }
ldlm_bl_13-35551 [034] 164201.591134: funcgraph_entry: | cl_object_flush() {
ldlm_bl_13-35551 [034] 164201.591134: funcgraph_entry: | lov_object_flush() {
ldlm_bl_13-35551 [034] 164201.591134: funcgraph_entry: 0.115 us | down_read();
ldlm_bl_13-35551 [034] 164201.591135: funcgraph_entry: | lov_flush_composite() {
ldlm_bl_13-35551 [034] 164201.591135: funcgraph_entry: | cl_object_flush() {
ldlm_bl_13-35551 [034] 164201.591135: funcgraph_entry: | mdc_object_flush() {
ldlm_bl_13-35551 [034] 164201.591136: funcgraph_entry: | mdc_dlm_blocking_ast0() {
ldlm_bl_13-35551 [034] 164201.591136: funcgraph_entry: | lock_res_and_lock() {
ldlm_bl_13-35551 [034] 164201.591136: funcgraph_entry: 0.114 us | _raw_spin_lock();
ldlm_bl_13-35551 [034] 164201.591136: funcgraph_entry: 0.030 us | _raw_spin_lock();
ldlm_bl_13-35551 [034] 164201.591137: funcgraph_exit: 0.677 us | }
ldlm_bl_13-35551 [034] 164201.591137: funcgraph_entry: 0.031 us | unlock_res_and_lock();
ldlm_bl_13-35551 [034] 164201.591137: funcgraph_exit: 1.363 us | }
ldlm_bl_13-35551 [034] 164201.591137: funcgraph_exit: 1.674 us | }
ldlm_bl_13-35551 [034] 164201.591137: funcgraph_exit: 2.207 us | }
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_exit: 2.596 us | }
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_entry: 0.042 us | up_read();
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_exit: 3.714 us | }
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_exit: 4.279 us | }
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_entry: | cl_env_put() {
ldlm_bl_13-35551 [034] 164201.591138: funcgraph_entry: 0.034 us | lu_context_exit();
ldlm_bl_13-35551 [034] 164201.591139: funcgraph_entry: 0.030 us | lu_context_exit();
ldlm_bl_13-35551 [034] 164201.591139: funcgraph_entry: 0.030 us | _raw_read_lock();
ldlm_bl_13-35551 [034] 164201.591139: funcgraph_exit: 0.990 us | }
ldlm_bl_13-35551 [034] 164201.591140: funcgraph_exit: 8.253 us | }
It is easy to see that mdc_dlm_blocking_ast0 exits right at the beginning, which means the lock is not granted or has no l_ast_data, i.e. no osc object assigned. The data was obtained from the page cache later:
<...>-40843 [000] 164229.430007: funcgraph_entry: | ll_do_fast_read() {
<...>-40843 [000] 164229.430009: funcgraph_entry: | generic_file_read_iter() {
<...>-40843 [000] 164229.430010: funcgraph_entry: 0.044 us | _cond_resched();
<...>-40843 [000] 164229.430010: funcgraph_entry: | pagecache_get_page() {
<...>-40843 [000] 164229.430010: funcgraph_entry: 0.706 us | find_get_entry();
<...>-40843 [000] 164229.430011: funcgraph_exit: 1.078 us | }
<...>-40843 [000] 164229.430012: funcgraph_entry: | mark_page_accessed() {
<...>-40843 [000] 164229.430012: funcgraph_entry: 0.088 us | activate_page();
<...>-40843 [000] 164229.430012: funcgraph_entry: 0.143 us | workingset_activation();
<...>-40843 [000] 164229.430013: funcgraph_exit: 0.925 us | }
<...>-40843 [000] 164229.430014: funcgraph_entry: 0.032 us | _cond_resched();
<...>-40843 [000] 164229.430014: funcgraph_entry: | pagecache_get_page() {
<...>-40843 [000] 164229.430014: funcgraph_entry: 0.070 us | find_get_entry();
<...>-40843 [000] 164229.430014: funcgraph_exit: 0.401 us | }
<...>-40843 [000] 164229.430015: funcgraph_entry: | mark_page_accessed() {
<...>-40843 [000] 164229.430015: funcgraph_entry: 0.037 us | activate_page();
<...>-40843 [000] 164229.430015: funcgraph_entry: 0.039 us | workingset_activation();
<...>-40843 [000] 164229.430015: funcgraph_exit: 0.649 us | }
....
Short description of how it was hit:
getattr_by_name provides the "DoM" bit in the response while the client already has a DoM lock, but no IO under this lock has happened yet.
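The first trace above is consistent with an early return inside the blocking AST: mdc_dlm_blocking_ast0 only takes and releases the resource lock and spends under 2 us in total, i.e. it never reaches the page-discard step. The snippet below is only an illustrative reconstruction of that shape with invented stub types and names (sim_dlm_lock, sim_blocking_ast), not the actual Lustre source.

#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_object;                      /* stand-in for the osc/mdc object */

struct sim_dlm_lock {
        bool granted;
        struct sim_object *l_ast_data;  /* NULL in the buggy case */
};

static void lock_res_and_lock_stub(struct sim_dlm_lock *lock)   { (void)lock; }
static void unlock_res_and_lock_stub(struct sim_dlm_lock *lock) { (void)lock; }
static void discard_object_pages_stub(struct sim_object *obj)   { (void)obj; }

/* Illustrative shape of the blocking AST as implied by the trace: when the
 * lock is not granted or carries no object, it bails out before any page
 * discard happens. */
static int sim_blocking_ast(struct sim_dlm_lock *lock)
{
        struct sim_object *obj;

        lock_res_and_lock_stub(lock);
        obj = lock->granted ? lock->l_ast_data : NULL;
        unlock_res_and_lock_stub(lock);

        if (obj == NULL)
                return 0;               /* early exit: nothing flushed */

        discard_object_pages_stub(obj); /* never reached in the bad case */
        return 1;
}

int main(void)
{
        struct sim_dlm_lock lock = { .granted = true, .l_ast_data = NULL };

        if (sim_blocking_ast(&lock) == 0)
                printf("blocking AST returned early, no pages discarded\n");
        return 0;
}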
2) DoM read-on-open corruption.
The scenario is nearly the same as above. Open provides data which is moved to the page cache very early, with the Uptodate flag set, but no osc object is assigned to the lock.
The data was read with ll_do_fast_read without any real IO, plus a lock match in mdc_enqueue_send().
The lock was canceled without flushing the pages, but the client continues to read stale data via ll_do_fast_read.
...
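A minimal sketch of this read-on-open sequence, assuming only the behaviour described above (all sim_* names are invented; this is not the mdc/llite code): data from the open reply lands in the page cache as Uptodate, the enqueue path matches an already granted lock without attaching the object, and the later cancel therefore has nothing to flush.

#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_object {
        bool cached_uptodate;           /* page cached with Uptodate set */
};

struct sim_lock {
        bool granted;
        struct sim_object *l_ast_data;  /* stays NULL on the match path */
};

/* Open reply carries the file data: it lands in the page cache early,
 * before any regular read runs under the lock. */
static void sim_read_on_open(struct sim_object *obj)
{
        obj->cached_uptodate = true;
}

/* Enqueue path: an existing granted lock is matched and returned as-is;
 * in the buggy scenario nothing attaches the object to it. */
static struct sim_lock *sim_enqueue_match(struct sim_lock *existing)
{
        return existing->granted ? existing : NULL;
}

/* Cancel only flushes when the lock knows its object. */
static void sim_cancel(struct sim_lock *lock, struct sim_object *obj)
{
        if (lock->l_ast_data != NULL)
                obj->cached_uptodate = false;
}

int main(void)
{
        struct sim_object obj = { .cached_uptodate = false };
        struct sim_lock dom_lock = { .granted = true, .l_ast_data = NULL };

        sim_read_on_open(&obj);                             /* data cached on open   */
        struct sim_lock *l = sim_enqueue_match(&dom_lock);  /* lock match, no attach */
        sim_cancel(l, &obj);                                /* flush skipped         */

        printf("page still uptodate after cancel: %s\n",
               obj.cached_uptodate ? "yes (stale data readable)" : "no");
        return 0;
}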
Issue Links
- is related to:
  - LU-9479 sanity test 184d 244: don't instantiate PFL component when taking group lock (Open)
  - LU-13759 sanity-dom sanityn_test_20 fails with '1 page left in cache after lock cancel' (Resolved)
  - LU-14084 change 'lfs migrate' to use 'MIGRATION_NONBLOCK' by default (Open)
  - LU-12681 Data corruption - due incorrect KMS with SEL files (Resolved)
  - LU-11670 Incorrect size when using lockahead (Resolved)
  - LU-13128 a race between glimpse and lock cancel is not handled correctly (Resolved)