Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.5.0, Lustre 2.4.3
-
Clients:
Endeavour: 2.4.3, ldan: 2.4.1, Pleiades compute nodes: 2.1.5 or 2.4.1
Servers:
2.1.5, 2.4.1, 2.4.3
-
2
-
14374
Description
We have been seeing stuck anonymous memory on our SLES11SP2 and SLES11SP3 clients that cannot be cleared without a reboot. We have three test cases that replicate the problem reliably, and we have replicated it on different clients on all of our Lustre file systems. We have not been able to reproduce it when using NFS, ext3, CXFS, or tmpfs.
We have been working with SGI to track down this problem. Unfortunately, they have been unable to reproduce it on their systems. On our systems, they have simplified the test case to mmapping a file along with an equally sized anonymous region, and reading the contents of the mmapped file into the anonymous region. This test case can be provided so that you can try to reproduce the problem.
To determine whether the problem is occurring, reboot the system to ensure that memory is clean. Check /proc/meminfo for the amount of Active(anon) memory in use. Run the test case. During the test case, the amount of anonymous memory will increase. At the end of the test case, the amount should drop back to its pre-test level.
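The before-and-after check described above can be scripted. The following is a minimal sketch, assuming a POSIX shell and awk; the function names active_anon_kb and anon_before_after are illustrative and not part of any existing tool.

```shell
# Print the Active(anon) value in kB from a meminfo-format file
# (defaults to /proc/meminfo).
active_anon_kb() {
    awk '$1 == "Active(anon):" { print $2 }' "${1:-/proc/meminfo}"
}

# Run the given command and report Active(anon) before and after it.
anon_before_after() {
    before=$(active_anon_kb)
    "$@"
    after=$(active_anon_kb)
    echo "Active(anon) before: ${before} kB, after: ${after} kB"
}
```

For example, `anon_before_after ./mmap 1g` would wrap the test case. An "after" value that stays well above the "before" value suggests the memory was not released.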
To confirm that the anonymous memory is stuck, we have been using memhog to attempt to allocate memory. If the node has 32 GB of memory, with 2 GB of anonymous memory used, we attempt to allocate 31 GB of memory. If memhog completes and you then have only 1 GB of anonymous memory, you have not reproduced the problem. If memhog is killed, you have.
SGI would like guidance on how to collect debug information to track down this problem.
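The memhog check can be sketched as a small helper. This assumes memhog (from the numactl package) is on the PATH and that the request is the node's nominal memory minus 1 GB, which mirrors the 32 GB to 31 GB example above rather than any fixed rule. The names memhog_size and check_stuck_anon are hypothetical.

```shell
# Build the memhog size argument: nominal node memory in GB minus 1.
# $1: nominal node memory in GB (e.g. 32). Assumption based on the example above.
memhog_size() {
    echo "$(( $1 - 1 ))g"
}

# Attempt the allocation and report the outcome.
# A non-zero exit status (for example, memhog being killed) is treated
# as a sign that anonymous memory is stuck.
check_stuck_anon() {
    size=$(memhog_size "$1")
    if memhog "$size" > /dev/null; then
        echo "memhog $size completed: problem not reproduced"
    else
        echo "memhog $size failed or was killed: anonymous memory appears stuck"
    fi
}
```

Calling `check_stuck_anon 32` on a 32 GB node would run `memhog 31g`. Checking Active(anon) in /proc/meminfo afterwards is still needed to confirm the result.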
ldan2 and ldan3 below are fairly ordinary self-contained systems running the
Lustre 2.4.1-6nas_ofed154 client. This problem has been reproduced on several versions of the NAS Lustre client and server software.
Log into ldan2. (I've mostly used a qsub session, but I have also reproduced the problem outside of PBS.)
Log into ldan3. (I have special permission to log into an ldan on which I don't
have a PBS job running.)
On both systems, cd to the test (Lustre) directory.
This directory contains the following:
1. A copy of hedi's test. I've tested with the first one he wrote and a fairly
late version (mmap4.c). The later version (attached) is more flexible.
2. A 1 GB file created with:
dd if=/dev/zero of=1g bs=4096 count=262144
On ldan3 run:
dd count=1 bs=1 conv=notrunc of=1g if=/dev/zero
On ldan2 run:
./mmap 1g
After the mmap program terminates, notice that the anonymous memory
it used remains in memory. I've never been able to force
persistent anonymous memory out of the greater Linux virtual memory
system (memory + swap). Anonymous memory swapped to disk is not
accounted for as "Anonymous Memory", but it is accounted for as swap.
The problem does not reproduce if the dd "interference" is not run.
The problem does not reproduce if the dd "interference" is run and then
lflush is run, or the file is read (without mmap) from a third system.
The problem does not reproduce if the dd "interference" is run on ldan2 and the mmap test is then also run on ldan2.