Lustre / LU-4053

client leaking objects/locks during IO

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: Lustre 2.5.0
    • Fix Version/s: None
    • Environment:
      Config: Single-node client+MDS+OSS with 1 MDT, 3 OSTs
      Node: x86_64 w/ dual-core CPU, 2GB RAM
      Kernel: 2.6.32-279.5.1.el6_lustre.g7f15218.x86_64
      Lustre build: 72afa19c19d5ac
    • Severity: 3
    • Rank (Obsolete): 10870

      Description

      I'm trying to determine if there is a "memory leak" in the current Lustre code that can affect long-running clients or servers. While this memory may be cleaned up when the filesystem is unmounted, it does not appear to be cleaned up under steady-state usage.

      I started "rundbench 10 -t 3600" and am watching the memory usage in several forms (slabtop, vmstat, "lfs df", "lfs df -i"). Several of these statistics do indeed show what looks to be a memory leak. The numbers below were gathered at roughly, though not exactly, the same time; the general trend is nevertheless clear:
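
      A minimal sampling loop along these lines can capture that trend in one place (a hypothetical script, not part of the original report; the slab-name pattern and log file name are illustrative assumptions):

      ```shell
      #!/bin/sh
      # Sample the Lustre-related slab counts and free memory once a minute
      # while the benchmark runs. Cache-name pattern and output file are
      # illustrative assumptions, not from the original ticket.
      while sleep 60; do
          date +%s
          awk '/lustre_inode_cache|ldlm_locks|ldlm_resources|_object_kmem/ { print $1, $2, $3 }' /proc/slabinfo
          grep MemFree /proc/meminfo
      done >> slab-trace.log
      ```

      Plotting the per-cache object counts from such a trace against time makes a steady-state leak stand out from normal cache growth.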

      The "lfs df -i" output shows only around 1000 in-use files during the whole run:

      UUID                      Inodes       IUsed       IFree IUse% Mounted on
      testfs-MDT0000_UUID       524288        1024      523264   0% /mnt/testfs[MDT:0]
      testfs-OST0000_UUID       131072         571      130501   0% /mnt/testfs[OST:0]
      testfs-OST0001_UUID       131072         562      130510   0% /mnt/testfs[OST:1]
      testfs-OST0002_UUID       131072         576      130496   0% /mnt/testfs[OST:2]
      
      filesystem summary:       524288        1024      523264   0% /mnt/testfs
      

      The LDLM resource_count shows just under 50k lock resources on the MDT, far more than the number of objects actually in the filesystem:

      # lctl get_param ldlm.namespaces.*.resource_count
      ldlm.namespaces.filter-testfs-OST0000_UUID.resource_count=238
      ldlm.namespaces.filter-testfs-OST0001_UUID.resource_count=226
      ldlm.namespaces.filter-testfs-OST0002_UUID.resource_count=237
      ldlm.namespaces.mdt-testfs-MDT0000_UUID.resource_count=49161
      ldlm.namespaces.testfs-MDT0000-mdc-ffff8800a66c1c00.resource_count=49160
      ldlm.namespaces.testfs-OST0000-osc-ffff8800a66c1c00.resource_count=237
      ldlm.namespaces.testfs-OST0001-osc-ffff8800a66c1c00.resource_count=226
      ldlm.namespaces.testfs-OST0002-osc-ffff8800a66c1c00.resource_count=236
      
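      For comparison against the ~1000 in-use files, the client-side resource counts can be totalled directly (a hypothetical one-liner; the mdc/osc globs follow the namespace names shown above):

      ```shell
      # Sum lock resources across all client-side (mdc/osc) DLM namespaces.
      lctl get_param -n 'ldlm.namespaces.*-mdc-*.resource_count' \
                        'ldlm.namespaces.*-osc-*.resource_count' |
          awk '{ sum += $1 } END { print sum, "client lock resources" }'
      ```

      With the numbers above this totals roughly 50k resources, against about 1000 files in use per "lfs df -i".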

      Total memory used (as shown by "vmstat") also shows a steady increase over time: originally 914116kB of free memory, down to 202036kB after about 3000s of the run (about 700MB used), and finally 86724kB at the end of the run (830MB used). That would be normal for a workload accessing a large number of files that are kept in cache, but the total amount of used space in the filesystem stays steady at about 240MB for the entire run.
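
      As a sanity check on those numbers, the net consumption over the run is just the difference in free memory:

      ```shell
      # Free memory dropped from 914116 kB at the start to 86724 kB at the end.
      echo "$(( 914116 - 86724 )) kB consumed"   # 827392 kB, roughly 830 MB
      ```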

      The "slabtop" output (edited to remove uninteresting slabs) shows over 150k allocated CLIO structures, a number that grows steadily and is far more than could actually be in use at any given time. All of the CLIO slabs are 100% used, so this isn't just alloc/free churn leaving partially-used slabs.

        OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
      242660 242660 100%    0.19K  12133       20     48532K size-192
      217260 217260 100%    0.19K  10863       20     43452K dentry
      203463 178864  87%    0.10K   5499       37     21996K buffer_head
      182000 181972  99%    0.03K   1625      112      6500K size-32
      181530 181530 100%    0.12K   6051       30     24204K size-128
      156918 156918 100%    1.25K  52306        3    209224K lustre_inode_cache
      156840 156840 100%    0.12K   5228       30     20912K lov_oinfo
      156825 156825 100%    0.22K   9225       17     36900K lov_object_kmem
      156825 156825 100%    0.22K   9225       17     36900K lovsub_object_kmem
      156816 156816 100%    0.24K   9801       16     39204K ccc_object_kmem
      156814 156814 100%    0.27K  11201       14     44804K osc_object_kmem
      123832 121832  98%    0.50K  15479        8     61916K size-512
       98210  92250  93%    0.50K  14030        7     56120K ldlm_locks
       97460  91009  93%    0.38K   9746       10     38984K ldlm_resources
       76320  76320 100%    0.08K   1590       48      6360K mdd_obj
       76262  76262 100%    0.11K   2243       34      8972K lod_obj
       76245  76245 100%    0.28K   5865       13     23460K mdt_obj
        2865   2764  96%    1.03K    955        3      3820K ldiskfs_inode_cache
        1746   1546  88%    0.21K     97       18       388K cl_lock_kmem 
        1396   1396 100%    1.00K    349        4      1396K ptlrpc_cache
        1345   1008  74%    0.78K    269        5      1076K shmem_inode_cache
        1298    847  65%    0.06K     22       59        88K lovsub_lock_kmem
        1224    898  73%    0.16K     51       24       204K ofd_obj
        1008    794  78%    0.18K     48       21       192K osc_lock_kmem
        1008    783  77%    0.03K      9      112        36K lov_lock_link_kmem
         925    782  84%    0.10K     25       37       100K lov_lock_kmem
         920    785  85%    0.04K     10       92        40K ccc_lock_kmem
      

      The ldiskfs_inode_cache shows a reasonable number of objects, one for each MDT and OST inode actually in use. Could this be a leak of unlinked inodes/dentries on the client?
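
      One way to distinguish reclaimable cache from a genuine leak is to force the caches to drain and see what survives (standard admin commands, run as root on the client; this step was not part of the original test):

      ```shell
      # Cancel all unused client DLM locks, then drop the kernel's
      # dentry/inode/page caches; allocations that remain afterwards
      # are candidates for a real leak rather than cached state.
      lctl set_param ldlm.namespaces.*.lru_size=clear
      echo 3 > /proc/sys/vm/drop_caches
      lctl get_param ldlm.namespaces.*.resource_count
      ```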

      Now, after 3600s of running, the dbench has finished and deleted all of the files:

       Operation      Count    AvgLat    MaxLat
       ----------------------------------------
       NTCreateX    1229310     5.896  1056.405
       Close         903051     2.960  1499.813
       Rename         52083     8.024   827.129
       Unlink        248209     3.694   789.403
       Deltree           20   119.498   421.063
       Mkdir             10     0.050     0.155
       Qpathinfo    1114775     2.129   953.086
       Qfileinfo     195028     0.114    25.925
       Qfsinfo       204279     0.574    32.902
       Sfileinfo     100238    27.316  1442.888
       Find          430819     6.750  1369.539
       WriteX        611079     0.833   857.679
       ReadX        1927390     0.107  1171.947
       LockX           4004     0.005     1.899
       UnlockX         4004     0.003     3.345
       Flush          86164   183.254  2577.019
      
      Throughput 10.6947 MB/sec  10 clients  10 procs  max_latency=2577.028 ms
      

      The slabs still show a large number of allocations, even though no files exist in the filesystem anymore:

        OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
      289880 133498  46%    0.19K  14494       20     57976K size-192
      278768 274718  98%    0.03K   2489      112      9956K size-32
      274410 259726  94%    0.12K   9147       30     36588K size-128
      253590 250634  98%    0.12K   8453       30     33812K lov_oinfo
      253555 250634  98%    0.22K  14915       17     59660K lovsub_object_kmem
      253552 250634  98%    0.24K  15847       16     63388K ccc_object_kmem
      253540 250634  98%    0.27K  18110       14     72440K osc_object_kmem
      253538 250634  98%    0.22K  14914       17     59656K lov_object_kmem
      252330 250638  99%    1.25K  84110        3    336440K lustre_inode_cache
      203463 179392  88%    0.10K   5499       37     21996K buffer_head
      128894 128446  99%    0.11K   3791       34     15164K lod_obj
      128880 128446  99%    0.08K   2685       48     10740K mdd_obj
      128869 128446  99%    0.28K   9913       13     39652K mdt_obj
       84574  79368  93%    0.50K  12082        7     48328K ldlm_locks
       82660  79314  95%    0.38K   8266       10     33064K ldlm_resources
       71780  50308  70%    0.19K   3589       20     14356K dentry
      

      There are also still about 40k MDT lock resources, though all of the OST locks are gone (expected, since the files were unlinked):

      # lctl get_param ldlm.namespaces.*.resource_count
      ldlm.namespaces.filter-testfs-OST0000_UUID.resource_count=0
      ldlm.namespaces.filter-testfs-OST0001_UUID.resource_count=0
      ldlm.namespaces.filter-testfs-OST0002_UUID.resource_count=0
      ldlm.namespaces.mdt-testfs-MDT0000_UUID.resource_count=39654
      ldlm.namespaces.testfs-MDT0000-mdc-ffff8800a66c1c00.resource_count=39654
      ldlm.namespaces.testfs-OST0000-osc-ffff8800a66c1c00.resource_count=0
      ldlm.namespaces.testfs-OST0001-osc-ffff8800a66c1c00.resource_count=0
      ldlm.namespaces.testfs-OST0002-osc-ffff8800a66c1c00.resource_count=0
      
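      To quantify which caches actually grew, the two slab snapshots above can be diffed per cache name (a hypothetical helper; `before.txt`/`after.txt` are placeholder names for saved slabtop output):

      ```shell
      # Print per-cache growth in object count between two saved snapshots,
      # keyed on the cache name in the last column of each line.
      awk 'NR == FNR { before[$NF] = $1; next }
           $NF in before { print $NF, $1 - before[$NF] }' before.txt after.txt
      ```

      For example, lustre_inode_cache grows from 156918 to 252330 objects between the two snapshots even though every file has been deleted.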

              People

              • Assignee: niu Niu Yawei (Inactive)
              • Reporter: adilger Andreas Dilger
              • Votes: 0
              • Watchers: 24
