[LU-16160] take ldlm lock when queue sync pages Created: 15/Sep/22 Updated: 19/Jan/24 Resolved: 14/Mar/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Zhenyu Xu | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
osc_queue_sync_pages() adds an osc_extent to the osc_object's IO extent list without taking ldlm locks, and then calls osc_io_unplug_async() to queue the IO work for the client. I think the IO extent should take ldlm locks while it waits in the IO work queue. |
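The concern above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: `ToyLock`, `ToyExtent`, and `run` are hypothetical names invented for illustration, and the "lock" is a plain reference counter that can be cancelled only when nothing references it. It shows why an extent that sits in a work queue without a lock reference can end up doing IO after the lock has been cancelled, while an extent that pins the lock cannot.

```python
class ToyLock:
    """Hypothetical stand-in for an ldlm lock (not the real API)."""

    def __init__(self):
        self.refs = 0
        self.cancelled = False

    def get(self):
        # Taking a reference on a cancelled lock is an error in this model.
        if self.cancelled:
            raise RuntimeError("lock already cancelled")
        self.refs += 1

    def put(self):
        assert self.refs > 0
        self.refs -= 1

    def try_cancel(self):
        # The cancel only succeeds when no one holds a reference.
        if self.refs == 0:
            self.cancelled = True
        return self.cancelled


class ToyExtent:
    """Hypothetical stand-in for an IO extent waiting in a work queue."""

    def __init__(self, lock, pin):
        self.lock = lock
        self.pinned = pin
        if pin:
            lock.get()  # hold the lock for as long as the extent is queued

    def do_io(self):
        # Report whether the IO ran while still covered by the lock.
        covered = not self.lock.cancelled
        if self.pinned:
            self.lock.put()
            self.pinned = False
        return covered


def run(pin):
    lock = ToyLock()
    queue = [ToyExtent(lock, pin)]  # extent is queued for async IO
    lock.try_cancel()               # a cancel arrives before the IO runs
    return [extent.do_io() for extent in queue]
```

In this model, `run(False)` lets the cancel win, so the queued IO runs without lock coverage, whereas `run(True)` keeps the lock alive until the IO has finished.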
| Comments |
| Comment by Gerrit Updater [ 15/Sep/22 ] |
|
"Bobi Jam <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/48557 |
| Comment by Gerrit Updater [ 20/Sep/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/48607 |
| Comment by Gerrit Updater [ 24/Sep/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48557/ |
| Comment by Alexey Lyashkov [ 27/Sep/22 ] |
|
Just for the record: based on the current logs, there is a long gap between the lock cancel and the next read, so this is most likely an old race (from
| Comment by Gerrit Updater [ 27/Sep/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/48673 |
| Comment by Gerrit Updater [ 08/Oct/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/48804 |
| Comment by Gerrit Updater [ 08/Oct/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/48805 |
| Comment by Alexey Lyashkov [ 26/Oct/22 ] |
|
I reproduced a situation in which a page lives in the cache with the uptodate flag set but without a cl_page. |
| Comment by Gerrit Updater [ 02/Nov/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48607/ |
| Comment by Peter Jones [ 02/Nov/22 ] |
|
All existing patches have landed. Please reopen if more work is to be tracked under this ticket. |
| Comment by Gerrit Updater [ 12/Dec/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49372 |
| Comment by Gerrit Updater [ 30/Dec/22 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49534 |
| Comment by Gerrit Updater [ 03/Jan/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49541 |
| Comment by Cory Spitz [ 03/Jan/23 ] |
|
> Please reopen if more work is to be tracked under this ticket |
| Comment by Gerrit Updater [ 04/Jan/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49549 |
| Comment by Gerrit Updater [ 04/Jan/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49553 |
| Comment by Andrew Perepechko [ 12/Jan/23 ] |
|
I've uploaded the workaround for stale data+SIGBUS issue that we are currently using, as an attachment. We're going to replace it with a better (design-wise/performance-wise) solution if/when we have it. Not sure if it's useful but we'd like to share. |
| Comment by Peter Jones [ 12/Jan/23 ] |
|
Thanks very much Andrew. Is it possible to push the patch into gerrit so it is easier for us to provide testing/review feedback? |
| Comment by Peter Jones [ 16/Jan/23 ] |
|
Andrew, Bobijam feels that your attached patch takes a similar approach to his latest LU-16160 patch. Could you please review the latter in gerrit and flag any issues that should deter us from landing it to master? Thanks, Peter |
| Comment by Andrew Perepechko [ 16/Jan/23 ] |
|
Peter, sure, I'll do that. Thank you |
| Comment by Gerrit Updater [ 16/Jan/23 ] |
|
"Patrick Farrell <farr0186@gmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49647 |
| Comment by Peter Jones [ 16/Jan/23 ] |
|
panda, Patrick ported your attached patch to master and pushed it into gerrit so that we can compare and contrast the two similar approaches. To that end, could you please confirm that nothing was altered during the porting? Thanks! |
| Comment by Gerrit Updater [ 19/Jan/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49541/ |
| Comment by Gerrit Updater [ 27/Jan/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49647/ |
| Comment by Gerrit Updater [ 03/Mar/23 ] |
|
"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50202 |
| Comment by Peter Jones [ 14/Mar/23 ] |
|
Looks like everything tracked under this ticket has landed for 2.16 |
| Comment by Gerrit Updater [ 11/Apr/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50202/ |