[LU-16412] check truncated page in ->readpage() Created: 19/Dec/22 Updated: 07/Jul/23 Resolved: 03/Feb/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.3 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Qian Yingjin | Assignee: | Qian Yingjin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
I found that the page end offset calculation in filemap_get_read_batch() is off by one in 5.x kernels. When a read is submitted with end offset 1048575, it calculates the end page index incorrectly.

In a racy corner case, filemap_get_read_batch() batches the page with index 1024 for read, but this page is later truncated and removed from the page cache because the lock protecting it was revoked. As a result, this page in the read path is not covered by a DLM lock, which triggers an assertion in the code:

LustreError: 14129:0:(osc_object.c:397:osc_req_attr_set()) uncovered page!
Pid: 14129, comm: ptlrpcd_04_18 5.14.0-1038-oem #42-Ubuntu SMP Thu May 19 05:03:08 UTC 2022
LustreError: 14129:0:(osc_object.c:411:osc_req_attr_set()) LBUG

To work around this kernel bug, we can simply check in ->readpage() whether the page was truncated and removed from the page cache, and return AOP_TRUNCATED_PAGE to the upper layer. The upper layer then retries batching pages and does not add the truncated page to the batch, since it is no longer in the page cache. |
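The workaround described above can be illustrated with a small userspace model. This is only a sketch, not the merged Lustre patch: `mock_page`, `mock_mapping`, and every `mock_*` function are hypothetical names invented for illustration, and `MOCK_AOP_TRUNCATED_PAGE` merely stands in for the kernel's `AOP_TRUNCATED_PAGE`. It assumes that a truncated page can be recognised because it no longer points back at the mapping it was batched from.

```c
#include <stddef.h>

/* Stand-in for the kernel's AOP_TRUNCATED_PAGE return code (hypothetical name). */
#define MOCK_AOP_TRUNCATED_PAGE 0x80001

/* Size of the toy page cache; indices passed in must be below this. */
#define MOCK_CACHE_PAGES 4

struct mock_mapping;

/* Toy page: 'mapping' is cleared when the page is truncated. */
struct mock_page {
    unsigned long index;
    struct mock_mapping *mapping;
};

/* Toy page cache: one slot per page index. */
struct mock_mapping {
    struct mock_page *slots[MOCK_CACHE_PAGES];
};

/* Insert a page into the toy page cache at the given index. */
static void mock_add_page(struct mock_mapping *m, struct mock_page *p,
                          unsigned long index)
{
    p->index = index;
    p->mapping = m;
    m->slots[index] = p;
}

/* Model truncation: drop the page from the cache and detach it. */
static void mock_truncate_page(struct mock_page *p)
{
    if (p->mapping != NULL) {
        p->mapping->slots[p->index] = NULL;
        p->mapping = NULL;
    }
}

/* Model of the ->readpage() check: refuse pages that left the cache. */
static int mock_readpage(struct mock_mapping *m, struct mock_page *p)
{
    if (p->mapping != m)
        return MOCK_AOP_TRUNCATED_PAGE;
    /* A real implementation would issue the read here. */
    return 0;
}

/*
 * Model of the upper layer: on a truncated page, look the index up again.
 * Returns 1 if a page was read, 0 if the page is gone and was skipped.
 */
static int mock_read_batched_page(struct mock_mapping *m,
                                  struct mock_page *batched)
{
    struct mock_page *p = batched;

    while (p != NULL) {
        if (mock_readpage(m, p) != MOCK_AOP_TRUNCATED_PAGE)
            return 1;
        /* Retry: the truncated page is no longer in the cache. */
        p = m->slots[batched->index];
    }
    return 0;
}
```

In this model, a page that is truncated after being batched is rejected by `mock_readpage()`, and the retry in `mock_read_batched_page()` no longer finds it in the cache, so no read is issued for it.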
| Comments |
| Comment by Gerrit Updater [ 19/Dec/22 ] |
|
"Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49433 |
| Comment by Gerrit Updater [ 20/Jan/23 ] |
|
"Patrick Farrell <farr0186@gmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49723 |
| Comment by Gerrit Updater [ 31/Jan/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49433/ |
| Comment by Gerrit Updater [ 03/Feb/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49723/ |
| Comment by Peter Jones [ 03/Feb/23 ] |
|
Landed for 2.16 |
| Comment by Andreas Dilger [ 13/Mar/23 ] |
|
Patch for the upstream kernel submitted: Accepted into kernel and backported to 6.1 and 5.15 stable trees: This patch also ends up improving fxmark benchmark performance by 13%, likely due to avoiding extraneous reads of pages not actually requested by the application: |
| Comment by Gerrit Updater [ 13/Mar/23 ] |
|
"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50277 |
| Comment by Gerrit Updater [ 11/Apr/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50277/ |