[LU-9906] Allow Lustre page dropping to use pagevec_release
Created: 23/Aug/17  Updated: 19/Dec/19  Resolved: 21/Nov/18
Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0, Lustre 2.10.7
Type: Bug
Priority: Minor
Reporter: Patrick Farrell (Inactive)
Assignee: Patrick Farrell (Inactive)
Resolution: Fixed
Votes: 0
Labels: performance
Severity: 3
Description
When Lustre releases a large number of cached pages at once, it still calls page_cache_release on each page individually instead of batching them with pagevec_release. When clearing the OST ldlm lock LRUs, the ldlm_bl threads end up spending much of their time contending for the zone lock taken on every single-page release. With many namespaces and parallel LRU clearing (as Cray does at the end of each job), this can be a significant time sink. Using pagevec_release is much better. Patch coming shortly.
Comments
Comment by Gerrit Updater [ 23/Aug/17 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/28667
Comment by Patrick Farrell (Inactive) [ 21/Sep/17 ]

Quoting Andreas: invalidate_page_range is something else, but I think this does what you're talking about. I don't think we can drop pages in chunks that large; pagevec_release is the best I'm aware of without writing our own. (And I wonder about holding the relevant lock long enough to drop stripe_size chunks.)
Comment by Gerrit Updater [ 14/Dec/17 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/30531
Comment by Gerrit Updater [ 14/Feb/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30531/
Comment by Shuichi Ihara [ 15/Nov/18 ]

Patch https://review.whamcloud.com/#/c/28667 gives a huge contribution to single-client performance. Here are the test results.

Client: 1 x client (2 x Intel Platinum 8160 CPU @ 2.10GHz, 192GB memory)

Parameters:
lctl set_param osc.*.max_pages_per_rpc=16M osc.*.max_rpcs_in_flight=16 osc.*.max_dirty_mb=512 osc.*.checksums=0 llite.*.max_read_ahead_mb=2048

IOR commands:
mpirun -np 48 ior -w -r -t 16m -b 16g -F -e -vv -o /scratch0/file -i 1 -B (O_DIRECT)
mpirun -np 48 ior -w -r -t 16m -b 16g -F -e -vv -o /scratch0/file -i 1 (buffered)
Comment by Patrick Farrell (Inactive) [ 15/Nov/18 ]

That's really impressive. What kernel version are you running there? I'm specifically curious whether your kernel has queued spinlocks. I haven't looked at lru_reclaim specifically, but the other areas affected by this patch got much better with newer kernel versions (i.e., the patch matters less if you have queued spinlocks).
Comment by Shuichi Ihara [ 15/Nov/18 ]

I'm testing on 3.10.0-693.21.1.el7.x86_64. The cost at discard_pagevec() drops from 57.59% to 17.48% with the patch applied.
Comment by Patrick Farrell (Inactive) [ 15/Nov/18 ]

Huh! Thank you for the detailed look. I'm surprised the effect is so large with queued spinlocks, but I'm glad it's helping so much. Nice find.
Comment by Andreas Dilger [ 15/Nov/18 ]

This is great. It shows that performance is nearly identical for buffered and unbuffered large reads. It would seem the next big consumer is osc_lru_alloc(), but it may only look like it is taking a lot of time because there is an enforced wait when there are not enough pages. Given that we are very close to peak performance for reads, it probably makes more sense to focus on improving the write side.
Comment by Gerrit Updater [ 21/Nov/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/28667/
Comment by Peter Jones [ 21/Nov/18 ]

Landed for 2.12
Comment by Gerrit Updater [ 08/Jan/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33988
Comment by Gerrit Updater [ 15/Feb/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33988/
Comment by Patrick Farrell (Inactive) [ 15/Feb/19 ]

Landing just the OSD-side patch to b2_10 is good here; it was required for some kernel compatibility changes. There is no need to land the other patch from this ticket: https://review.whamcloud.com/28667/