Details
Type: Bug
Resolution: Not a Bug
Priority: Minor
Labels: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None
Environment:
$ uname -r
2.6.18-194.17.1.el5
$ cat /proc/fs/lustre/version
lustre: 2.0.59
kernel: patchless_client
build: jenkins-g3dcb5fb-PRISTINE-2.6.18-194.17.1.el5
Severity: 3
Rank: 10134
Description
We are seeing OOMs from readahead. There appear to be several issues:
1) Lustre readahead is insensitive to memory pressure. It would be nice to have something like the kernel's max_sane_readahead() (see the first sketch after this list).
2) Lustre readahead calls grab_cache_page_nowait(), which allocates pages using the GFP mask of the file's mapping. A cache page for readahead is therefore allocated with HARDWALL|WAIT|IO|HIGHMEM, which is sufficient to trigger the OOM killer.
3) In 2.6.18-194.17.1.el5, grab_cache_page_nowait() also calls add_to_page_cache_lru() with a GFP_KERNEL mask, which is likewise enough to cause an OOM, or to recurse into the filesystem. (In el6, GFP_KERNEL is changed to GFP_NOFS.) Both call sites appear in the second sketch after this list.
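For reference on issue 1, the mainline guard looks roughly like the following (paraphrased from memory of 2.6.18-era mm/readahead.c; the exact per-node counters differ between kernel versions). A Lustre-side clamp could apply the same bound to the readahead window before any pages are grabbed:

#include <linux/mm.h>
#include <linux/mmzone.h>

/* Clamp a readahead request to about half of the local node's
 * inactive + free pages, so readahead backs off under memory
 * pressure instead of forcing reclaim. */
unsigned long max_sane_readahead(unsigned long nr)
{
        unsigned long active, inactive, free;

        __get_zone_counts(&active, &inactive, &free,
                          NODE_DATA(numa_node_id()));
        return min(nr, (inactive + free) / 2);
}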
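Issues 2 and 3 both sit in grab_cache_page_nowait(). Abridged from memory of 2.6.18 mm/filemap.c (surrounding details omitted), the two allocation sites are:

#include <linux/pagemap.h>

/* 2.6.18-era grab_cache_page_nowait(), abridged.  The page itself is
 * allocated with the file mapping's GFP mask (issue 2), and the page
 * cache insertion passes GFP_KERNEL outright (issue 3); both masks
 * include __GFP_WAIT and so can wake the OOM killer. */
struct page *grab_cache_page_nowait(struct address_space *mapping,
                                    unsigned long index)
{
        struct page *page = find_get_page(mapping, index);

        if (page) {
                if (!TestSetPageLocked(page))
                        return page;
                page_cache_release(page);
                return NULL;
        }
        /* For a regular file this mask is HARDWALL|WAIT|IO|HIGHMEM. */
        page = alloc_pages(mapping_gfp_mask(mapping) & ~__GFP_FS, 0);
        if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) {
                page_cache_release(page);
                page = NULL;
        }
        return page;
}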
It's easily reproduced by pushing available memory below max_read_ahead_mb and then issuing a suitably large read() (a reproducer sketch follows). Under that reproducer, the OOM can be prevented by clearing __GFP_WAIT in grab_cache_page_nowait() and add_to_page_cache_lru() (a pseudo-patch sketch also follows). I do not know of a fix that does not modify the kernel.
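A userspace sketch of the reproducer; the Lustre mount point, pin size, and read size are assumptions to be tuned per machine, not taken from a real test script:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sizes: pin enough anonymous memory that free memory
 * drops below max_read_ahead_mb, then issue one large buffered read. */
#define PIN_BYTES  (6UL << 30)   /* tune until free mem < max_read_ahead_mb */
#define READ_BYTES (64UL << 20)  /* large enough to kick off readahead */

int main(void)
{
        char *pin = malloc(PIN_BYTES);
        char *buf = malloc(READ_BYTES);
        int fd;

        if (!pin || !buf)
                return 1;
        memset(pin, 1, PIN_BYTES);   /* fault the pages in */

        fd = open("/mnt/lustre/bigfile", O_RDONLY);   /* assumed path */
        if (fd < 0)
                return 1;
        /* Readahead sized from max_read_ahead_mb fires here; with almost
         * no free memory, its page allocations can wake the OOM killer. */
        if (read(fd, buf, READ_BYTES) < 0)
                perror("read");
        close(fd);
        return 0;
}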
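One way to express the __GFP_WAIT experiment, as an untested pseudo-patch against the 2.6.18 grab_cache_page_nowait() shown above (a diagnostic aid, not a proposed fix):

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ grab_cache_page_nowait()
-	page = alloc_pages(mapping_gfp_mask(mapping) & ~__GFP_FS, 0);
-	if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) {
+	/* Fail fast rather than wait for reclaim or wake the OOM killer. */
+	page = alloc_pages(mapping_gfp_mask(mapping) & ~__GFP_FS & ~__GFP_WAIT, 0);
+	if (page && add_to_page_cache_lru(page, mapping, index,
+					  GFP_KERNEL & ~__GFP_WAIT)) {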
See the attached console logs from a client on an llmount.sh filesystem. Note that this issue is also frequently observed in production on TACC Lonestar (2.6.18-192.32.1/1.8.5).