Details
- Type: Bug
- Resolution: Unresolved
- Priority: Minor
- Fix Version/s: None
- Affects Version/s: Lustre 2.14.0, Lustre 2.16.1
- Labels: None
- Severity: 3
- Rank: 9223372036854775807
Description
Running "lfs migrate" or "lfs mirror resync" on a file that is much larger than RAM can cause the thread to livelock in the kernel under memory pressure, causing "lfs" to spin with 100% CPU usage trying to access buffer pages to do the migrate/mirror copy:
    [<0>] do_swap_page+0xaf/0x790
    [<0>] __handle_mm_fault+0x552/0x6d0
    [<0>] handle_mm_fault+0xca/0x2a0
    [<0>] __get_user_pages+0x250/0x830
    [<0>] get_user_pages_unlocked+0xd5/0x2a0
    [<0>] internal_get_user_pages_fast+0x193/0x2c0
    [<0>] iov_iter_get_pages_alloc+0x110/0x4c0
    [<0>] ll_direct_IO_impl+0x30f/0xc50 [lustre]
    [<0>] generic_file_read_iter+0x8f/0x150
    [<0>] vvp_io_read_start+0x597/0x840 [lustre]
    [<0>] cl_io_start+0x5d/0x110 [obdclass]
    [<0>] cl_io_loop+0x9a/0x200 [obdclass]
    [<0>] ll_file_io_generic+0xa83/0xf90 [lustre]
    [<0>] ll_file_read_iter+0x9de/0xd20 [lustre]
    [<0>] new_sync_read+0x10f/0x160
    [<0>] vfs_read+0x91/0x150
    [<0>] ksys_pread64+0x65/0xa0
    [<0>] do_syscall_64+0x5b/0x1a0
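For orientation, the copy at issue reduces to a pread(2)/pwrite(2) loop over an aligned user-space buffer. The following is a minimal sketch only, with an assumed 1MB chunk size and simplified error handling; it is not the actual migrate_copy_data() source:

    #include <errno.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHUNK (1 << 20)    /* assumed 1MB copy buffer */

    /* Copy src_fd to dst_fd through a user-space buffer. With O_DIRECT,
     * each pread()/pwrite() makes the kernel pin the buffer pages via
     * get_user_pages(); if those pages were swapped out, they must be
     * faulted back in first, which is where the reported livelock spins
     * (see do_swap_page() at the top of the stack above). */
    static int copy_data(int src_fd, int dst_fd, off_t size)
    {
        void *buf = NULL;
        off_t pos = 0;

        if (posix_memalign(&buf, 4096, CHUNK))
            return -ENOMEM;

        while (pos < size) {
            ssize_t rc = pread(src_fd, buf, CHUNK, pos);

            if (rc <= 0)
                break;
            if (pwrite(dst_fd, buf, rc, pos) != rc)
                break;
            pos += rc;
        }
        free(buf);
        return pos == size ? 0 : -EIO;
    }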
This can happen when run on an OSS node where an object being read or written is loading pages into the cache, or when other processes are reading into the client page cache (e.g. "lfs mirror extend", whose mirror_extend_file() does not open files with O_DIRECT).
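To make the distinction concrete, here is a small sketch of the two open modes; open_source() and its use_direct parameter are hypothetical helpers for illustration, not lfs code:

    #define _GNU_SOURCE    /* O_DIRECT is a Linux extension */
    #include <fcntl.h>

    static int open_source(const char *path, int use_direct)
    {
        /* Without O_DIRECT, every read fills the client page cache,
         * consuming memory as the file is streamed. */
        if (!use_direct)
            return open(path, O_RDONLY);

        /* With O_DIRECT, the page cache is bypassed, but the user
         * buffer passed to read() must itself be resident so the
         * kernel can pin its pages. */
        return open(path, O_RDONLY | O_DIRECT);
    }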
It appears that the buffer pages used by migrate_copy_data() for both migrate and resync are swapped out under memory pressure and cannot be faulted back in by the kernel.
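One conceivable mitigation, sketched here purely as an assumption rather than the actual fix, would be for the copy tool to lock its transfer buffer in memory with mlock(2) so the pages can never be swapped out:

    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Allocate a page-aligned transfer buffer and pin it in RAM for the
     * lifetime of the copy (the size parameter is an assumption; the
     * real buffer sizing in lfs may differ). */
    static void *alloc_pinned_buffer(size_t size)
    {
        void *buf = NULL;

        if (posix_memalign(&buf, sysconf(_SC_PAGESIZE), size))
            return NULL;

        /* mlock() keeps these pages resident, so get_user_pages() never
         * has to fault them back in on each direct I/O submission. */
        if (mlock(buf, size) != 0) {
            free(buf);
            return NULL;
        }
        return buf;
    }

Note that mlock() counts against RLIMIT_MEMLOCK, so a real implementation would need to cap the locked size or fall back gracefully when the limit is hit.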
This was easily and repeatedly reproduced on a client-on-OSS node with 4GB RAM running the el8.10 4.18.0-553.50.1 kernel while migrating a 30GB file, and also on a standalone client with 128GB RAM running 40 concurrent copies of "lfs mirror extend" (buffered IO) and "lfs mirror resync" (direct IO), each on a separate file, some of them over 2TB in size.