Details
Bug
Resolution: Won't Fix
Major
None
Lustre 2.1.2
None
RHEL Server 6.4, kernel 2.6.32-279.9.1.el6.x86_64, 32GB ECC RAM, 16GB swap, InfiniBand
2
6926
Description
When using mmap() on files on a Lustre FS (v2.1.2), the Linux kernel sometimes invokes the out-of-memory (OOM) killer even when most of the system's memory is free. The attached kernel OOM log shows an example from a system that crashed with more than 20GB of memory free and 99% of its 16GB of swap unused.
Under normal mmap() usage, it can take several hours before the OOM killer is triggered. To help with debugging, I've found that the following source code triggers an OOM condition almost immediately when run from a directory on a Lustre FS, even though it does not use mmap() correctly: it maps and writes to a region beyond the end of the file without first extending the file. (The appropriate behavior would be for the kernel to terminate the program, not to cause an OOM condition.) This may help identify the code path that is causing the main OOM issue in production mmap() usage.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <inttypes.h>

int main(void) {
    int64_t i, j, size=0, memsize=0;
    FILE *o;
    int fd;
    void *mem = NULL, *mmem = NULL;

    o = fopen("test.dat", "w+");
    fd = fileno(o);
    for (i=0; i<1e6; i++) {
        size += 100000;   //Allocates up to 100GB for mmap()ed file
        memsize += 3000;  //Allocates up to 3GB for in-memory usage
        //ftruncate(fd, size); //<--- this would be the correct usage
        mmem = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        mem = realloc(mem, memsize);
        memset(mem, i%256, memsize);
        memset(mmem+size-100000, i%256, 100000);
        munmap(mmem, size);
    }
    return 0;
}
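For comparison, here is a minimal sketch of the "correct usage" that the commented-out ftruncate() line in the reproducer points to: extend the file to the mapped size before writing through the mapping. The helper name fill_mapped_tail() and its parameters are hypothetical and are not part of the original report; it has not been run against Lustre and is not claimed to avoid the OOM condition described above.

```c
#define _XOPEN_SOURCE 700  /* expose ftruncate() under strict -std=c99/c11 */

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the report): grow `path` to `size` bytes
 * with ftruncate(), map the whole file, fill the last `tail` bytes with
 * `value`, then unmap. Returns 0 on success, -1 on any failure.
 */
int fill_mapped_tail(const char *path, off_t size, size_t tail,
                     unsigned char value)
{
    if (size <= 0 || tail == 0 || (off_t)tail > size)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Extend the file first so every mapped page is backed by the file. */
    if (ftruncate(fd, size) != 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Use a char pointer: arithmetic on void * is a GNU extension. */
    unsigned char *bytes = map;
    memset(bytes + (size - (off_t)tail), value, tail);

    int rc = munmap(map, (size_t)size);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

In the reproducer's loop, a call such as fill_mapped_tail("test.dat", size, 100000, i%256) would stand in for the mmap()/memset()/munmap() sequence. Unlike the reproducer, the sketch checks every return value, so a failed ftruncate() or mmap() is reported to the caller instead of being written through.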