[LU-14433] fallocate: osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed Created: 15/Feb/21  Updated: 16/Jun/22  Resolved: 21/Apr/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug
Priority: Major
Reporter: Mikhail Pershin
Assignee: Mikhail Pershin
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-14287 Add 'fallocate' to racer Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

After adding a fallocate call to racer, the following crash is observed very often:

[  156.954031] LustreError: 4652:0:(osc_cache.c:1141:osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed: last_oap_count 0
[  156.956535] LustreError: 4652:0:(osc_cache.c:1141:osc_extent_make_ready()) LBUG
[  156.957881] Pid: 4652, comm: ldlm_bl_01 3.10.0-7.9-debug #1 SMP Mon Feb 1 17:33:41 EST 2021
[  156.959725] Call Trace:
[  156.960375]  [<ffffffffa017873c>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[  156.962003]  [<ffffffffa0178a5c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[  156.963391]  [<ffffffffa0839b26>] osc_extent_make_ready+0xb66/0xe60 [osc]
[  156.965016]  [<ffffffffa083c383>] osc_io_unplug0+0xee3/0x1900 [osc]
[  156.966261]  [<ffffffffa0840a60>] osc_cache_writeback_range+0x9a0/0xfd0 [osc]
[  156.968027]  [<ffffffffa082b985>] osc_lock_flush+0x195/0x290 [osc]
[  156.969414]  [<ffffffffa082be58>] osc_ldlm_blocking_ast+0x2f8/0x3e0 [osc]
[  156.970970]  [<ffffffffa05bbe54>] ldlm_cancel_callback+0x84/0x320 [ptlrpc]
[  156.972552]  [<ffffffffa05d4011>] ldlm_cli_cancel_local+0xd1/0x420 [ptlrpc]
[  156.974515]  [<ffffffffa05da24c>] ldlm_cli_cancel+0x10c/0x560 [ptlrpc]
[  156.976004]  [<ffffffffa082bcda>] osc_ldlm_blocking_ast+0x17a/0x3e0 [osc]
[  156.977297]  [<ffffffffa05e6435>] ldlm_handle_bl_callback+0xc5/0x3e0 [ptlrpc]
[  156.978932]  [<ffffffffa05e6d0f>] ldlm_bl_thread_main+0x5bf/0xae0 [ptlrpc]
[  156.980417]  [<ffffffff810ba124>] kthread+0xe4/0xf0
[  156.981842]  [<ffffffff817eee5d>] ret_from_fork_nospec_begin+0x7/0x21
[  156.983639]  [<ffffffffffffffff>] 0xffffffffffffffff
[  156.984738] Kernel panic - not syncing: LBUG

This is the result of a wrong assumption about how the fallocate range relates to the file size. The size check is done too early in ll_fallocate(), so a pending write or truncate may change the file size afterwards, and fallocate could then change the file size incorrectly.
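
The race described above can be illustrated with a small userspace model. This is only a sketch and not Lustre code: struct toy_file, toy_write(), toy_fallocate_stale() and toy_fallocate_locked() are hypothetical names, and a pthread mutex stands in for whatever lock protects the file size. It assumes the fix amounts to comparing the fallocate range against the file size while the lock is held, as the patch subject ("do fallocate() size checks under lock") suggests.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a file: a size protected by a lock. */
struct toy_file {
	pthread_mutex_t lock;
	int64_t size;
};

/* Models a write that extends the file when it goes past the end. */
static void toy_write(struct toy_file *f, int64_t offset, int64_t len)
{
	int64_t end = offset + len;

	assert(offset >= 0 && len > 0);
	pthread_mutex_lock(&f->lock);
	if (end > f->size)
		f->size = end;
	pthread_mutex_unlock(&f->lock);
}

/*
 * Racy pattern: the caller sampled the size before taking the lock,
 * so the "does this range extend the file?" decision may be stale.
 */
static void toy_fallocate_stale(struct toy_file *f, int64_t sampled_size,
				int64_t offset, int64_t len)
{
	int64_t end = offset + len;
	int extend = end > sampled_size;	/* decided too early */

	assert(offset >= 0 && len > 0);
	pthread_mutex_lock(&f->lock);
	if (extend)
		f->size = end;	/* can shrink a file that grew meanwhile */
	pthread_mutex_unlock(&f->lock);
}

/* Safer pattern: compare against the current size under the lock. */
static void toy_fallocate_locked(struct toy_file *f, int64_t offset,
				 int64_t len)
{
	int64_t end = offset + len;

	assert(offset >= 0 && len > 0);
	pthread_mutex_lock(&f->lock);
	if (end > f->size)
		f->size = end;
	pthread_mutex_unlock(&f->lock);
}
```

In this model, if the size is sampled at 100 bytes, a write then extends the file to 5000 bytes, and toy_fallocate_stale() is called for the range [0, 1000), the file ends up at 1000 bytes, which is smaller than the data already written. With toy_fallocate_locked() the same sequence leaves the size at 5000.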



Comments
Comment by Gerrit Updater [ 15/Feb/21 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41668
Subject: LU-14433 llite: do fallocate() size checks under lock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9ad2c55f863755452e74a3e72a10886c6a978bcf

Comment by Gerrit Updater [ 21/Apr/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41668/
Subject: LU-14433 llite: do fallocate() size checks under lock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f23ac22c4c79750fed6b05ddbe460bfc9b0f0ea5

Comment by Peter Jones [ 21/Apr/21 ]

Landed for 2.15
