Details
Bug
Resolution: Fixed
Minor
Description
vvp_set_pagevec_dirty
    locks mapping->page_tree and iterates over pages
        lock_page_memcg
        <work>
        unlock_page_memcg
vvp_page_completion_write
    calls lock_page_memcg
    and then locks mapping->page_tree
The two paths take the same pair of locks in opposite order. This can cause an extended lock hold or a deadlock between vvp_page_completion_write and vvp_set_pagevec_dirty.
One core is spinning here:
#0 [ffffc9000b97fa78] _raw_spin_lock_irqsave at ffffffff815b65e9
    /home/abuild/rpmbuild/BUILD/kernel-cray_ari_c-4.12.14/linux-4.12.14/linux-obj/../kernel/locking/spinlock.c: 160
#1 [ffffc9000b97fa98] lock_page_memcg at ffffffff811d7a89
    /home/abuild/rpmbuild/BUILD/kernel-cray_ari_c-4.12.14/linux-4.12.14/linux-obj/../mm/memcontrol.c: 1695
#2 [ffffc9000b97fac0] test_clear_page_writeback at ffffffff8117c479
    /home/abuild/rpmbuild/BUILD/kernel-cray_ari_c-4.12.14/linux-4.12.14/linux-obj/../mm/page-writeback.c: 2780
#3 [ffffc9000b97fb10] end_page_writeback at ffffffff8116a657
    /home/abuild/rpmbuild/BUILD/kernel-cray_ari_c-4.12.14/linux-4.12.14/linux-obj/../mm/filemap.c: 1273
#4 [ffffc9000b97fb28] vvp_page_completion_write at ffffffffa0816341 [lustre]
    /home/abuild/rpmbuild/BUILD/cray-lustre-2.12.0.5_cray_290_gdd6781b/lustre/llite/vvp_page.c: 316
#5 [ffffc9000b97fb58] cl_page_completion at ffffffffa0504663 [obdclass]
    /home/abuild/rpmbuild/BUILD/cray-lustre-2.12.0.5_cray_290_gdd6781b/lustre/obdclass/cl_page.c: 931
Many other cores are spinning here:
_raw_spin_lock_irqsave+0x39/0x50
vvp_set_pagevec_dirty+0x97/0x3a0 [lustre]
write_commit_callback+0x64/0x1a0 [lustre]
osc_queue_async_io+0x910/0x18e0 [osc]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
osc_page_cache_add+0x5f/0x180 [osc]
osc_io_commit_async+0x2a0/0x500 [osc]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
cl_io_commit_async+0xa9/0x150 [obdclass]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
lov_io_commit_async+0x106/0x580 [lov]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
? vvp_set_pagevec_dirty+0x3a0/0x3a0 [lustre]
cl_io_commit_async+0xa9/0x150 [obdclass]
vvp_io_write_commit+0x157/0x5e0 [lustre]
vvp_io_write_start+0x6ac/0x8b0 [lustre]
cl_io_start+0x6e/0x120 [obdclass]
cl_io_loop+0xca/0x1c0 [obdclass]
ll_file_io_generic+0x3c9/0xdd0 [lustre]
ll_file_write_iter+0x124/0x630 [lustre]
This would explain the RCU stall: the order in which the memcg lock and tree_lock/i_pages are taken is inverted between the two paths.
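
The inversion can be illustrated with a small user-space model. This is only a sketch, not Lustre or kernel code: the names LOCK_TREE, LOCK_MEMCG, order_reset and order_record are hypothetical, and the model only records the order in which each path takes its locks and reports when a later path takes the same pair the other way round (ABBA).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock identifiers for this model; not kernel symbols. */
enum lock_id { LOCK_TREE, LOCK_MEMCG, LOCK_COUNT };

/* took_after[a][b] is true once some path acquired b while holding a. */
static bool took_after[LOCK_COUNT][LOCK_COUNT];

/* Forget all recorded orderings. */
void order_reset(void)
{
	memset(took_after, 0, sizeof(took_after));
}

/*
 * Record the acquisition order used by one code path.
 * Returns true if the path takes any pair of locks in the opposite
 * order to a previously recorded path (an ABBA inversion).
 */
bool order_record(const enum lock_id *order, int n)
{
	bool inverted = false;

	assert(n >= 0 && n <= LOCK_COUNT);
	for (int i = 0; i < n; i++) {
		for (int j = i + 1; j < n; j++) {
			enum lock_id held = order[i], taken = order[j];

			/* Some earlier path held 'taken' and then took 'held'. */
			if (took_after[taken][held])
				inverted = true;
			took_after[held][taken] = true;
		}
	}
	return inverted;
}
```

Recording { LOCK_TREE, LOCK_MEMCG } for vvp_set_pagevec_dirty and then { LOCK_MEMCG, LOCK_TREE } for vvp_page_completion_write makes the second call return true, which is the situation described in this ticket. Recording the same order twice returns false both times.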