[LU-17181] lu_sites_guard sem caused a page reclaim starvation. Created: 11/Oct/23 Updated: 08/Nov/23 Resolved: 08/Nov/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | Alexey Lyashkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Linux MM can run several cache reclaims in parallel:

PID: 98822  TASK: ffff9766015e0000  CPU: 11  COMMAND: "zabbix_agent2"
 #0 [ffffbeed4e617920] __schedule+708 at ffffffff9b54e1d4
 #1 [ffffbeed4e6179b8] schedule+56 at ffffffff9b54e648
 #2 [ffffbeed4e6179c8] rwsem_down_read_slowpath+864 at ffffffff9b5511d0
 #3 [ffffbeed4e617a60] lu_cache_shrink_count+30 at ffffffffc0fb34fe [obdclass]
 #4 [ffffbeed4e617a70] do_shrink_slab+84 at ffffffff9ae74344
 #5 [ffffbeed4e617ae0] shrink_slab+190 at ffffffff9ae74b6e
 #6 [ffffbeed4e617b60] shrink_node+412 at ffffffff9ae795ec
 #7 [ffffbeed4e617be0] do_try_to_free_pages+201 at ffffffff9ae79bb9
 #8 [ffffbeed4e617c30] try_to_free_pages+239 at ffffffff9ae79fbf
 #9 [ffffbeed4e617cd0] __alloc_pages_slowpath+945 at ffffffff9aebd7b1
#10 [ffffbeed4e617dc8] __alloc_pages_nodemask+643 at ffffffff9aebe3a3
#11 [ffffbeed4e617e28] __get_free_pages+10 at ffffffff9aeb86ca

vs

PID: 98811  TASK: ffff977ba2865f00  CPU: 16  COMMAND: "p_check_lustre_"
 #0 [ffffbeed6cf3f828] __schedule+708 at ffffffff9b54e1d4
 #1 [ffffbeed6cf3f8c0] preempt_schedule_common+10 at ffffffff9b54e6fa
 #2 [ffffbeed6cf3f8c8] _cond_resched+29 at ffffffff9b54e72d
 #3 [ffffbeed6cf3f8d0] mutex_lock+14 at ffffffff9b55087e
 #4 [ffffbeed6cf3f8e0] lod_striping_free+27 at ffffffffc1693a2b [lod]
 #5 [ffffbeed6cf3f900] lod_object_free+158 at ffffffffc169c43e [lod]
 #6 [ffffbeed6cf3f910] lu_object_free+216 at ffffffffc0fb2ed8 [obdclass]
 #7 [ffffbeed6cf3f978] lu_site_purge_objects+982 at ffffffffc0fb5d16 [obdclass]
 #8 [ffffbeed6cf3fa18] lu_cache_shrink_scan+146 at ffffffffc0fb5fe2 [obdclass]
 #9 [ffffbeed6cf3fa70] do_shrink_slab+300 at ffffffff9ae7441c
#10 [ffffbeed6cf3fae0] shrink_slab+190 at ffffffff9ae74b6e |
| Comments |
| Comment by Gerrit Updater [ 11/Oct/23 ] |
|
"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52627 |
| Comment by Gerrit Updater [ 08/Nov/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52627/ |
| Comment by Peter Jones [ 08/Nov/23 ] |
|
Landed for 2.16 |