<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:39:14 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4053] client leaking objects/locks during IO</title>
                <link>https://jira.whamcloud.com/browse/LU-4053</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I&apos;m trying to determine if there is a &quot;memory leak&quot; in the current Lustre code that can affect long-running clients or servers.  While this memory may be cleaned up when the filesystem is unmounted, it does not appear to be cleaned up under steady-state usage.&lt;/p&gt;

&lt;p&gt;I started &quot;rundbench 10 -t 3600&quot; and am watching the memory usage in several forms (slabtop, vmstat, &quot;lfs df&quot;, &quot;lfs df -i&quot;).  It does indeed appear that there are a number of statistics that show what looks to be a memory leak.  These statistics are gathered at &lt;em&gt;about&lt;/em&gt; the same time, but not &lt;em&gt;exactly&lt;/em&gt; at the same time.  The general trend is fairly clear, however:&lt;/p&gt;

&lt;p&gt;The &quot;&lt;tt&gt;lfs df -i&lt;/tt&gt;&quot; output shows only around 1000 in-use files during the whole run:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;UUID                      Inodes       IUsed       IFree IUse% Mounted on
testfs-MDT0000_UUID       524288        1024      523264   0% /mnt/testfs[MDT:0]
testfs-OST0000_UUID       131072         571      130501   0% /mnt/testfs[OST:0]
testfs-OST0001_UUID       131072         562      130510   0% /mnt/testfs[OST:1]
testfs-OST0002_UUID       131072         576      130496   0% /mnt/testfs[OST:2]

filesystem summary:       524288        1024      523264   0% /mnt/testfs
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The LDLM resource_count shows the number of lock resources: slightly fewer than 50k, but far more than the number of actual objects in the filesystem:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lctl get_param ldlm.namespaces.*.resource_count
ldlm.namespaces.filter-testfs-OST0000_UUID.resource_count=238
ldlm.namespaces.filter-testfs-OST0001_UUID.resource_count=226
ldlm.namespaces.filter-testfs-OST0002_UUID.resource_count=237
ldlm.namespaces.mdt-testfs-MDT0000_UUID.resource_count=49161
ldlm.namespaces.testfs-MDT0000-mdc-ffff8800a66c1c00.resource_count=49160
ldlm.namespaces.testfs-OST0000-osc-ffff8800a66c1c00.resource_count=237
ldlm.namespaces.testfs-OST0001-osc-ffff8800a66c1c00.resource_count=226
ldlm.namespaces.testfs-OST0002-osc-ffff8800a66c1c00.resource_count=236
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
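&lt;p&gt;For tracking this over time, the per-namespace counts above can be split into server-side and client-side totals with a small awk filter over the lctl output. This is only a sketch; the -mdc-/-osc- substring matching assumes the namespace naming shown above:&lt;/p&gt;

```shell
# Split DLM resource counts into server-side (filter-*/mdt-*) and
# client-side (*-mdc-*/*-osc-*) totals.
# Input: output of "lctl get_param ldlm.namespaces.*.resource_count".
sum_resources() {
    awk -F= '
        /-mdc-|-osc-/ { client += $2; next }   # client namespaces
                      { server += $2 }         # server namespaces
        END { printf "server=%d client=%d\n", server, client }
    '
}

# Usage (on a Lustre node, as root):
#   lctl get_param ldlm.namespaces.*.resource_count | sum_resources
```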

&lt;p&gt;Total memory used (as shown by &quot;&lt;tt&gt;vmstat&lt;/tt&gt;&quot;) also shows a steady increase over time, originally 914116kB of free memory, down to 202036kB after about 3000s of the run so far (about 700MB of memory used), and eventually ends up at 86724kB at the end of the run (830MB used).  While that would be normal with a workload that is accessing a large number of files that are kept in cache, the total amount of used space in the filesystem holds steady at about 240MB during the entire run.&lt;/p&gt;

&lt;p&gt;The &quot;&lt;tt&gt;slabtop&lt;/tt&gt;&quot; output (edited to remove uninteresting slabs) shows over 150k allocated structures for CLIO, a number that grows steadily and is far more than could actually be in use at any given time.  All of the CLIO slabs are 100% used, so it isn&apos;t just a matter of alloc/free causing partially-used slabs.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
242660 242660 100%    0.19K  12133       20     48532K size-192
217260 217260 100%    0.19K  10863       20     43452K dentry
203463 178864  87%    0.10K   5499       37     21996K buffer_head
182000 181972  99%    0.03K   1625      112      6500K size-32
181530 181530 100%    0.12K   6051       30     24204K size-128
156918 156918 100%    1.25K  52306        3    209224K lustre_inode_cache
156840 156840 100%    0.12K   5228       30     20912K lov_oinfo
156825 156825 100%    0.22K   9225       17     36900K lov_object_kmem
156825 156825 100%    0.22K   9225       17     36900K lovsub_object_kmem
156816 156816 100%    0.24K   9801       16     39204K ccc_object_kmem
156814 156814 100%    0.27K  11201       14     44804K osc_object_kmem
123832 121832  98%    0.50K  15479        8     61916K size-512
 98210  92250  93%    0.50K  14030        7     56120K ldlm_locks
 97460  91009  93%    0.38K   9746       10     38984K ldlm_resources
 76320  76320 100%    0.08K   1590       48      6360K mdd_obj
 76262  76262 100%    0.11K   2243       34      8972K lod_obj
 76245  76245 100%    0.28K   5865       13     23460K mdt_obj
  2865   2764  96%    1.03K    955        3      3820K ldiskfs_inode_cache
  1746   1546  88%    0.21K     97       18       388K cl_lock_kmem 
  1396   1396 100%    1.00K    349        4      1396K ptlrpc_cache
  1345   1008  74%    0.78K    269        5      1076K shmem_inode_cache
  1298    847  65%    0.06K     22       59        88K lovsub_lock_kmem
  1224    898  73%    0.16K     51       24       204K ofd_obj
  1008    794  78%    0.18K     48       21       192K osc_lock_kmem
  1008    783  77%    0.03K      9      112        36K lov_lock_link_kmem
   925    782  84%    0.10K     25       37       100K lov_lock_kmem
   920    785  85%    0.04K     10       92        40K ccc_lock_kmem
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
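&lt;p&gt;To put a number on the CLIO footprint, the CACHE SIZE column of saved slabtop output can be totalled for the object caches. A sketch, assuming the eight-column layout shown above:&lt;/p&gt;

```shell
# Total the CACHE SIZE column (7th field, e.g. "36900K") for the CLIO
# object slabs plus lov_oinfo and lustre_inode_cache.
# Input: slabtop data lines in the format shown above.
clio_cache_total() {
    awk '$8 ~ /_object_kmem$|^lov_oinfo$|^lustre_inode_cache$/ {
             sub(/K$/, "", $7)      # strip the trailing K
             total += $7
         }
         END { printf "%dK\n", total }'
}
```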

&lt;p&gt;The ldiskfs_inode_cache shows a reasonable number of objects in use, one for each MDT and OST inode actually in use.  Could this be a leak of unlinked inodes/dentries on the client?&lt;/p&gt;

&lt;p&gt;Now, after 3600s of running, the dbench has finished and deleted all of the files:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    1229310     5.896  1056.405
 Close         903051     2.960  1499.813
 Rename         52083     8.024   827.129
 Unlink        248209     3.694   789.403
 Deltree           20   119.498   421.063
 Mkdir             10     0.050     0.155
 Qpathinfo    1114775     2.129   953.086
 Qfileinfo     195028     0.114    25.925
 Qfsinfo       204279     0.574    32.902
 Sfileinfo     100238    27.316  1442.888
 Find          430819     6.750  1369.539
 WriteX        611079     0.833   857.679
 ReadX        1927390     0.107  1171.947
 LockX           4004     0.005     1.899
 UnlockX         4004     0.003     3.345
 Flush          86164   183.254  2577.019

Throughput 10.6947 MB/sec  10 clients  10 procs  max_latency=2577.028 ms
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The slabs still show a large number of allocations, even though no files exist in the filesystem anymore:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
289880 133498  46%    0.19K  14494       20     57976K size-192
278768 274718  98%    0.03K   2489      112      9956K size-32
274410 259726  94%    0.12K   9147       30     36588K size-128
253590 250634  98%    0.12K   8453       30     33812K lov_oinfo
253555 250634  98%    0.22K  14915       17     59660K lovsub_object_kmem
253552 250634  98%    0.24K  15847       16     63388K ccc_object_kmem
253540 250634  98%    0.27K  18110       14     72440K osc_object_kmem
253538 250634  98%    0.22K  14914       17     59656K lov_object_kmem
252330 250638  99%    1.25K  84110        3    336440K lustre_inode_cache
203463 179392  88%    0.10K   5499       37     21996K buffer_head
128894 128446  99%    0.11K   3791       34     15164K lod_obj
128880 128446  99%    0.08K   2685       48     10740K mdd_obj
128869 128446  99%    0.28K   9913       13     39652K mdt_obj
 84574  79368  93%    0.50K  12082        7     48328K ldlm_locks
 82660  79314  95%    0.38K   8266       10     33064K ldlm_resources
 71780  50308  70%    0.19K   3589       20     14356K dentry
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are also still about 40k MDT locks, though all of the OST locks are gone (which is expected if these files are unlinked).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lctl get_param ldlm.namespaces.*.resource_count
ldlm.namespaces.filter-testfs-OST0000_UUID.resource_count=0
ldlm.namespaces.filter-testfs-OST0001_UUID.resource_count=0
ldlm.namespaces.filter-testfs-OST0002_UUID.resource_count=0
ldlm.namespaces.mdt-testfs-MDT0000_UUID.resource_count=39654
ldlm.namespaces.testfs-MDT0000-mdc-ffff8800a66c1c00.resource_count=39654
ldlm.namespaces.testfs-OST0000-osc-ffff8800a66c1c00.resource_count=0
ldlm.namespaces.testfs-OST0001-osc-ffff8800a66c1c00.resource_count=0
ldlm.namespaces.testfs-OST0002-osc-ffff8800a66c1c00.resource_count=0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>Config: Single-node client+MDS+OSS with 1 MDT, 3 OSTs&lt;br/&gt;
Node: x86_64 w/ dual-core CPU, 2GB RAM&lt;br/&gt;
Kernel: 2.6.32-279.5.1.el6_lustre.g7f15218.x86_64&lt;br/&gt;
Lustre build: 72afa19c19d5ac</environment>
        <key id="21245">LU-4053</key>
            <summary>client leaking objects/locks during IO</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>mq115</label>
                    </labels>
                <created>Wed, 2 Oct 2013 23:36:09 +0000</created>
                <updated>Wed, 5 Aug 2020 21:59:10 +0000</updated>
                            <resolved>Wed, 5 Aug 2020 21:59:10 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>24</watches>
                                                                            <comments>
                            <comment id="68255" author="adilger" created="Thu, 3 Oct 2013 16:19:29 +0000"  >&lt;p&gt;After unmounting the client, a large number of slabs have been cleaned up, but not all of them:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
289460 132395  45%    0.19K  14473       20     57892K size-192
203463 179354  88%    0.10K   5499       37     21996K buffer_head
128894 127934  99%    0.11K   3791       34     15164K lod_obj
128880 127934  99%    0.08K   2685       48     10740K mdd_obj
128869 127934  99%    0.28K   9913       13     39652K mdt_obj
 71760  50205  69%    0.19K   3588       20     14352K dentry
  1491   1176  78%    1.03K    497        3      1988K ldiskfs_inode_cache
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I ran with +malloc debug during the cleanup, and processed the debug log through leak_finder.pl.  A sample of the logs:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;*** Free without malloc (8 bytes at ffff880049c5b7e0, lov_object.c:lov_fini_raid0:377)
*** Free without malloc (112 bytes at ffff88003fcfe100, lov_ea.c:lsm_free_plain:130)
*** Free without malloc (80 bytes at ffff8800c3b630c0, lov_ea.c:lsm_free_plain:132)
*** Free without malloc (224 bytes at ffff88004d28b3f8, lov_object.c:lov_object_free:821)
*** Free without malloc (248 bytes at ffff88004d79c070, lcommon_cl.c:ccc_object_free:404)
*** Free without malloc (1256 bytes at ffff88002c195580, super25.c:ll_destroy_inode:80)
*** Free without malloc (272 bytes at ffff8800250ba9f8, osc_object.c:osc_object_free:128)
*** Free without malloc (224 bytes at ffff880036c25bd8, lovsub_object.c:lovsub_object_free:96)
*** Free without malloc (8 bytes at ffff880068e18bc0, lov_object.c:lov_fini_raid0:377)
*** Free without malloc (112 bytes at ffff88001d6efe00, lov_ea.c:lsm_free_plain:130)
*** Free without malloc (80 bytes at ffff8800b89ff940, lov_ea.c:lsm_free_plain:132)
*** Free without malloc (224 bytes at ffff88006e2b6e78, lov_object.c:lov_object_free:821)
*** Free without malloc (248 bytes at ffff88004d609830, lcommon_cl.c:ccc_object_free:404)
*** Free without malloc (1256 bytes at ffff8800baa22580, super25.c:ll_destroy_inode:80)
*** Free without malloc (272 bytes at ffff88004a7c4398, osc_object.c:osc_object_free:128)
*** Free without malloc (224 bytes at ffff88002515dbd8, lovsub_object.c:lovsub_object_free:96)
*** Free without malloc (8 bytes at ffff88006556d820, lov_object.c:lov_fini_raid0:377)
*** Free without malloc (112 bytes at ffff88003fcfe300, lov_ea.c:lsm_free_plain:130)
*** Free without malloc (80 bytes at ffff8800d3f239c0, lov_ea.c:lsm_free_plain:132)
*** Free without malloc (224 bytes at ffff88004d28b4d8, lov_object.c:lov_object_free:821)
*** Free without malloc (248 bytes at ffff88004d79c640, lcommon_cl.c:ccc_object_free:404)
*** Free without malloc (1256 bytes at ffff88006672d080, super25.c:ll_destroy_inode:80)
*** Free without malloc (272 bytes at ffff88004a7c44a8, osc_object.c:osc_object_free:128)
*** Free without malloc (224 bytes at ffff88002515dcb8, lovsub_object.c:lovsub_object_free:96)
*** Free without malloc (8 bytes at ffff880049c5b2a0, lov_object.c:lov_fini_raid0:377)
:
:
*** Free without malloc (320 bytes at ffff880081a96b00, ldlm_resource.c:ldlm_resource_putref_locked:1300)
*** Free without malloc (320 bytes at ffff88001ae75200, ldlm_resource.c:ldlm_resource_putref_locked:1300)
*** Free without malloc (320 bytes at ffff880024e4e980, ldlm_resource.c:ldlm_resource_putref_locked:1300)
*** Free without malloc (320 bytes at ffff880031a0c980, ldlm_resource.c:ldlm_resource_putref_locked:1300)
*** Free without malloc (320 bytes at ffff88002e66ec80, ldlm_resource.c:ldlm_resource_putref_locked:1300)
*** Free without malloc (320 bytes at ffff880023b68680, ldlm_resource.c:ldlm_resource_putref_locked:1300)
:
:
*** Free without malloc (504 bytes at ffff8800a67faa80, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff880077984580, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff88006e240380, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff8800670bd900, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff8800a3fb4880, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff880058eaaa80, ldlm_lock.c:lock_handle_free:456)
*** Free without malloc (504 bytes at ffff8800c35aad80, ldlm_lock.c:lock_handle_free:456)
:
:
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
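&lt;p&gt;The call sites in this log can be aggregated with a one-liner, assuming the &quot;*** Free without malloc (... , file.c:function:line)&quot; format shown above (a sketch, not part of leak_finder.pl itself):&lt;/p&gt;

```shell
# Count "Free without malloc" reports per file:function:line site,
# most frequent first.
free_sites() {
    awk '/Free without malloc/ {
             site = $NF             # last field, e.g. "lov_ea.c:lsm_free_plain:130)"
             sub(/\)$/, "", site)   # drop the closing paren
             count[site]++
         }
         END { for (s in count) print count[s], s }' | sort -rn
}
```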

&lt;p&gt;These are frees of allocations that were made before logging was enabled.  There are many thousands of these lines...&lt;/p&gt;</comment>
                            <comment id="68262" author="adilger" created="Thu, 3 Oct 2013 16:35:00 +0000"  >&lt;p&gt;The dcache shrinking patch was disabled in &lt;a href=&quot;http://review.whamcloud.com/1874&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/1874&lt;/a&gt; in v2_2_59_0-3-g9f3469f, but needs to be fixed somehow (e.g. have a zombie list that is cleaned up outside of the dcache lock).&lt;/p&gt;</comment>
                            <comment id="68348" author="jay" created="Fri, 4 Oct 2013 00:32:55 +0000"  >&lt;p&gt;is this because the debug buffer overflowed so that it couldn&apos;t catch the allocation info?&lt;/p&gt;</comment>
                            <comment id="68349" author="adilger" created="Fri, 4 Oct 2013 01:03:07 +0000"  >&lt;p&gt;No, just because I didn&apos;t have +malloc debugging enabled while the test was running, and because there is a good chance that the allocation is not very close to the free in the first place, so it would be mismatched without a huge debug buffer.&lt;/p&gt;

&lt;p&gt;Since this test is so easy to run (sh llmount.sh; sh rundbench -t 3600 10) it is easy for anyone to get whatever information they need to debug it. &lt;/p&gt;</comment>
<comment id="68487" author="bfaccini" created="Mon, 7 Oct 2013 12:27:09 +0000"  >&lt;p&gt;Andreas, thanks for your hints. I have started working on this.&lt;br/&gt;
On the other hand, I asked some of my contacts at customer sites running 2.1.6, and they don&apos;t see this on idle nodes after heavy production workloads.&lt;/p&gt;</comment>
<comment id="68489" author="bfaccini" created="Mon, 7 Oct 2013 13:14:49 +0000"  >&lt;p&gt;This behavior does not show up with 1.8.9-wc1, but it still does with recent master builds.&lt;br/&gt;
It seems that &quot;echo 3 &amp;gt; /proc/sys/vm/drop_caches&quot; (without unmounting) clears the client-side allocations AND the MDS ones.&lt;/p&gt;</comment>
                            <comment id="68652" author="niu" created="Wed, 9 Oct 2013 04:26:05 +0000"  >&lt;blockquote&gt;
&lt;p&gt;The LDLM resource_count shows the number of locks, slightly less than 50k, but a lot more than the number of actual objects in the filesystem:&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;That is because the layout lock wasn&apos;t canceled on unlink &amp;amp; rename.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Total memory used (as shown by &quot;vmstat&quot;) also shows a steady increase over time, originally 914116kB of free memory, down to 202036kB after about 3000s of the run so far (about 700MB of memory used), and eventually ends up at 86724kB at the end of the run (830MB used). While that would be normal with a workload that is accessing a large number of files that are kept in cache, the total amount of used space in the filesystem is steadily about 240MB during the entire run.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I think the memory was consumed by the slab cache.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The &quot;slabtop&quot; output (edited to remove uninteresting slabs) shows over 150k and steadily growing number of allocated structures for CLIO, far more than could actually be in use at any given time. All of the CLIO slabs are 100% used, so it isn&apos;t just a matter of alloc/free causing partially-used slabs.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I guess it is because of the unlink/rename: a huge number of slab objects for CLIO objects were created (though I&apos;m not quite sure what ACTIVE / USE in slabtop mean).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The slabs still show a large number of allocations, even though no files exist in the filesystem anymore:&lt;br/&gt;
 OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   &lt;br/&gt;
289880 133498  46%    0.19K  14494       20     57976K size-192&lt;br/&gt;
278768 274718  98%    0.03K   2489      112      9956K size-32&lt;br/&gt;
274410 259726  94%    0.12K   9147       30     36588K size-128&lt;br/&gt;
253590 250634  98%    0.12K   8453       30     33812K lov_oinfo&lt;br/&gt;
253555 250634  98%    0.22K  14915       17     59660K lovsub_object_kmem&lt;br/&gt;
253552 250634  98%    0.24K  15847       16     63388K ccc_object_kmem&lt;br/&gt;
253540 250634  98%    0.27K  18110       14     72440K osc_object_kmem&lt;br/&gt;
253538 250634  98%    0.22K  14914       17     59656K lov_object_kmem&lt;br/&gt;
252330 250638  99%    1.25K  84110        3    336440K lustre_inode_cache&lt;br/&gt;
203463 179392  88%    0.10K   5499       37     21996K buffer_head&lt;br/&gt;
128894 128446  99%    0.11K   3791       34     15164K lod_obj&lt;br/&gt;
128880 128446  99%    0.08K   2685       48     10740K mdd_obj&lt;br/&gt;
128869 128446  99%    0.28K   9913       13     39652K mdt_obj&lt;br/&gt;
 84574  79368  93%    0.50K  12082        7     48328K ldlm_locks&lt;br/&gt;
 82660  79314  95%    0.38K   8266       10     33064K ldlm_resources&lt;br/&gt;
 71780  50308  70%    0.19K   3589       20     14356K dentry&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I think Lustre has released all of the objects; it is the slab cache that is holding them, and it depends on the kernel to decide when to free them to reclaim memory.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;After unmounting the client, a large number of slabs have been cleaned up, but not all of them:&lt;br/&gt;
 OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   &lt;br/&gt;
289460 132395  45%    0.19K  14473       20     57892K size-192&lt;br/&gt;
203463 179354  88%    0.10K   5499       37     21996K buffer_head&lt;br/&gt;
128894 127934  99%    0.11K   3791       34     15164K lod_obj&lt;br/&gt;
128880 127934  99%    0.08K   2685       48     10740K mdd_obj&lt;br/&gt;
128869 127934  99%    0.28K   9913       13     39652K mdt_obj&lt;br/&gt;
 71760  50205  69%    0.19K   3588       20     14352K dentry&lt;br/&gt;
  1491   1176  78%    1.03K    497        3      1988K ldiskfs_inode_cache&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;We can see that all of the client&apos;s slabs have been freed; the remaining ones are all for the servers.&lt;/p&gt;


&lt;p&gt;One thing that confused me is that after dbench finished, there is still a huge number of layout locks cached:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;00010000:00010000:1.0:1381289786.413195:0:11603:0:(ldlm_resource.c:1448:ldlm_resource_dump()) --- Resource: [0x200000400:0x1cd83:0x0].0 (ffff880019c2d340) refcount = 2
00010000:00010000:1.0:1381289786.413196:0:11603:0:(ldlm_resource.c:1451:ldlm_resource_dump()) Granted locks (in reverse order):
00010000:00010000:1.0:1381289786.413196:0:11603:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: lustre-MDT0000-mdc-ffff8800265a9800 lock: ffff88004fdb0b80/0x68a3d67bc8651524 lrc: 1/0,0 mode: CR/CR res: [0x200000400:0x1cd83:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x68a3d67bc865152b expref: -99 pid: 10969 timeout: 0 lvb_type: 3
00010000:00010000:1.0:1381289786.413198:0:11603:0:(ldlm_resource.c:1448:ldlm_resource_dump()) --- Resource: [0x200000400:0x11d83:0x0].0 (ffff880070efba80) refcount = 2
00010000:00010000:1.0:1381289786.413198:0:11603:0:(ldlm_resource.c:1451:ldlm_resource_dump()) Granted locks (in reverse order):
00010000:00010000:1.0:1381289786.413199:0:11603:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: lustre-MDT0000-mdc-ffff8800265a9800 lock: ffff8800681ac380/0x68a3d67bc691ef1e lrc: 1/0,0 mode: CR/CR res: [0x200000400:0x11d83:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x68a3d67bc691ef3a expref: -99 pid: 10969 timeout: 0 lvb_type: 3
00010000:00010000:1.0:1381289786.413200:0:11603:0:(ldlm_resource.c:1448:ldlm_resource_dump()) --- Resource: [0x200000400:0x15143:0x0].0 (ffff880046767e80) refcount = 2
00010000:00010000:1.0:1381289786.413201:0:11603:0:(ldlm_resource.c:1451:ldlm_resource_dump()) Granted locks (in reverse order):
00010000:00010000:1.0:1381289786.413201:0:11603:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: lustre-MDT0000-mdc-ffff8800265a9800 lock: ffff8800115848c0/0x68a3d67bc71e9503 lrc: 1/0,0 mode: CR/CR res: [0x200000400:0x15143:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x68a3d67bc71e951f expref: -99 pid: 10969 timeout: 0 lvb_type: 3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
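&lt;p&gt;To check how many of the dumped resources hold a granted lock with only the LAYOUT inodebit (0x8), the debug log can be scanned like this (a sketch against the ldlm_resource_dump() line format above):&lt;/p&gt;

```shell
# Count dumped resources, and granted locks whose inodebits field is
# exactly the LAYOUT bit (0x8), in an ldlm_resource_dump() debug log.
layout_lock_count() {
    awk '/ldlm_resource_dump.*--- Resource:/ { res++ }
         /lock:/ && / bits 0x8 /             { layout++ }
         END { printf "resources=%d layout_locks=%d\n", res, layout }'
}
```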

&lt;p&gt;I checked the server code and saw that the layout lock isn&apos;t revoked on unlink/rename (see  &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4002&quot; title=&quot;HSM restore vs unlink deadlock &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4002&quot;&gt;&lt;del&gt;LU-4002&lt;/del&gt;&lt;/a&gt; &quot;hsm: avoid layout lock on unlink and rename onto&quot;), so the layout locks were cached on the client even after all files were removed.&lt;/p&gt;</comment>
<comment id="68653" author="niu" created="Wed, 9 Oct 2013 05:24:50 +0000"  >&lt;p&gt;What surprised me is that even after I reverted the change from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4002&quot; title=&quot;HSM restore vs unlink deadlock &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4002&quot;&gt;&lt;del&gt;LU-4002&lt;/del&gt;&lt;/a&gt;, the layout locks are still in the client cache after the dbench run... Xiong, do you have any idea? Thanks.&lt;/p&gt;</comment>
                            <comment id="68657" author="bfaccini" created="Wed, 9 Oct 2013 08:51:44 +0000"  >&lt;p&gt;Niu, thanks for all these details! My understanding of the ACTIVE/USE meaning is that they indicate the number/% of non-freed objects per slab type.&lt;/p&gt;

&lt;p&gt;Andreas, given that it is now confirmed that all of these allocations are only kept due to caching, and can be reclaimed either on demand or unconditionally (again, &quot;echo 3 &amp;gt; /proc/sys/vm/drop_caches&quot; frees almost everything), could we at least downgrade this ticket&apos;s priority, or do you still consider it critical?&lt;/p&gt;</comment>
                            <comment id="68678" author="jay" created="Wed, 9 Oct 2013 17:19:08 +0000"  >&lt;p&gt;I think we need to figure out why layout lock is still in cache even after the patch &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4002&quot; title=&quot;HSM restore vs unlink deadlock &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4002&quot;&gt;&lt;del&gt;LU-4002&lt;/del&gt;&lt;/a&gt; is reverted.&lt;/p&gt;</comment>
<comment id="68679" author="jay" created="Wed, 9 Oct 2013 17:19:54 +0000"  >&lt;p&gt;I&apos;m totally fine with the slab objects remaining in cache, because that is normal Linux behavior and does no harm.&lt;/p&gt;</comment>
                            <comment id="68692" author="adilger" created="Wed, 9 Oct 2013 18:47:52 +0000"  >&lt;p&gt;I don&apos;t think that the slab objects are just &quot;in the cache&quot;, I think they are actively being referenced by some part of the Lustre (e.g. lu cache or dentry or similar).  If slabs are just being kept around by the kernel, the OBJS number would be high, but the ACTIVE number would be low (excluding some very small number of objects in a per-CPU cache).  As was reported in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3771&quot; title=&quot;stuck 56G of SUnreclaim memory&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3771&quot;&gt;&lt;del&gt;LU-3771&lt;/del&gt;&lt;/a&gt;, there can be a very large amount of memory (56GB of a 64GB client) that is not released even under memory pressure, and it causes real problems for applications.&lt;/p&gt;

&lt;p&gt;It doesn&apos;t make any sense to cache locks or objects for files that have been deleted or dentries that are no longer in memory.  Even if that memory is &lt;em&gt;eventually&lt;/em&gt; freed, there is a real impact to applications and Lustre itself because there can be large amounts of memory wasted that could be better used by something else.&lt;/p&gt;</comment>
                            <comment id="68695" author="adilger" created="Wed, 9 Oct 2013 19:01:59 +0000"  >&lt;p&gt;Regarding the MDS_INODEBITS_LAYOUT locks, the first question to ask is why there are separate LAYOUT locks in the first place?  Unless there are HSM/migration operations on the file, the LAYOUT bit should always be granted along with LOOKUP/UPDATE to avoid extra RPCs being sent.  That would also ensure that the LAYOUT lock would be cancelled along with the file being deleted, unless there was still IO happening on the file.&lt;/p&gt;

&lt;p&gt;Secondly, the dcache cleanup for deleted files needs to be fixed again, since deleting the dentries will also delete the locks.  This was disabled in &lt;a href=&quot;http://review.whamcloud.com/1874&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/1874&lt;/a&gt; but is directly contributing to this problem.  If the dcache locking prevents us from doing the right thing anymore, there is code that was removed in commit 3698e90b9b8 (deathrow for dentries) that could be revived.&lt;/p&gt;</comment>
<comment id="68702" author="jay" created="Wed, 9 Oct 2013 20:18:24 +0000"  >&lt;p&gt;Yes, you&apos;re right. The root cause of this issue is that inodes were not removed from the cache when the files were deleted. This is why lustre_inode_cache has such a high active percentage; the XYZ_object_kmem slabs are just fallout from that.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Regarding the MDS_INODEBITS_LAYOUT locks, the first question to ask is why there are separate LAYOUT locks in the first place? Unless there are HSM/migration operations on the file, the LAYOUT bit should always be granted along with LOOKUP/UPDATE to avoid extra RPCs being sent. That would also ensure that the LAYOUT lock would be cancelled along with the file being deleted, unless there was still IO happening on the file.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Yes, the LAYOUT lock should be granted with the UPDATE/LOOKUP lock; but since the DLM lock may be revoked when permissions and timestamps change, the process doing a glimpse then has to enqueue a standalone LAYOUT lock before using the layout. This is why there are so many standalone LAYOUT locks.&lt;/p&gt;


&lt;p&gt;The change in ll_ddelete() is connected to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2487&quot; title=&quot;2.2 Client deadlock between ll_md_blocking_ast, sys_close, and sys_open&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2487&quot;&gt;&lt;del&gt;LU-2487&lt;/del&gt;&lt;/a&gt;, so re-enabling it is not an option. The dentry deathrow would be a good approach; I need to check it out.&lt;/p&gt;

&lt;p&gt;Actually, in another ticket I suggested adding a hint to the blocking AST that tells the client why the lock is being canceled. My original intention was to drop the page cache only if the layout has changed (not due to false sharing of the DLM lock). I realize we can use it here as well - we can drop i_nlink to zero in the ll_md_blocking_ast() function if we know the file is being unlinked.&lt;/p&gt;</comment>
                            <comment id="68726" author="niu" created="Thu, 10 Oct 2013 03:35:50 +0000"  >&lt;blockquote&gt;
&lt;p&gt;The change in ll_ddelete() is connected to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2487&quot; title=&quot;2.2 Client deadlock between ll_md_blocking_ast, sys_close, and sys_open&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2487&quot;&gt;&lt;del&gt;LU-2487&lt;/del&gt;&lt;/a&gt;, so it&apos;s not an option to enable it. deaththrow would be a good way, need to check it out.&lt;br/&gt;
Actually in another ticket, I made a suggestion to add a hint in blocking AST which tells the client why the lock is being canceled. My original intention is to drop page cache only if the layout is changed(not due to false sharing of DLM lock). I realize we can use it here also - we can drop the i_nlink to zero in the ll_md_blocking_ast() function if we know the file is being unlinked.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;ll_d_iput() will check and clear nlink as well, so I think disabling it in ll_ddelete() isn&apos;t a problem. Probably the root cause is that the layout lock wasn&apos;t canceled, so the inode&apos;s nlink couldn&apos;t be cleared.&lt;/p&gt;</comment>
                            <comment id="68767" author="adilger" created="Thu, 10 Oct 2013 20:15:03 +0000"  >&lt;p&gt;Also, when an object is being destroyed on the OST it should be sending blocking callbacks to the clients with LDLM_FL_DISCARD_DATA, so that should be a sign that the client can immediately drop its cache without any writes.  Could someone please confirm that this is actually happening?&lt;/p&gt;</comment>
                            <comment id="68776" author="jay" created="Thu, 10 Oct 2013 21:46:48 +0000"  >&lt;p&gt;Yes, it&apos;s happening on the client.&lt;/p&gt;

&lt;p&gt;on the server side, the flag LDLM_FL_DISCARD_DATA is set at revoking time.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;void ldlm_add_bl_work_item(struct ldlm_lock *lock, struct ldlm_lock *&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;,
                           cfs_list_t *work_list)
{
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; ((lock-&amp;gt;l_flags &amp;amp; LDLM_FL_AST_SENT) == 0) {
                LDLM_DEBUG(lock, &lt;span class=&quot;code-quote&quot;&gt;&quot;lock incompatible; sending blocking AST.&quot;&lt;/span&gt;);
                lock-&amp;gt;l_flags |= LDLM_FL_AST_SENT;
                /* If the enqueuing client said so, tell the AST recipient to
                 * discard dirty data, rather than writing back. */
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;-&amp;gt;l_flags &amp;amp; LDLM_FL_AST_DISCARD_DATA)
                        lock-&amp;gt;l_flags |= LDLM_FL_DISCARD_DATA;
                LASSERT(cfs_list_empty(&amp;amp;lock-&amp;gt;l_bl_ast));
                cfs_list_add(&amp;amp;lock-&amp;gt;l_bl_ast, work_list);
                LDLM_LOCK_GET(lock);
                LASSERT(lock-&amp;gt;l_blocking_lock == NULL);
                lock-&amp;gt;l_blocking_lock = LDLM_LOCK_GET(&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;);
        }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On the client side, this flag is copied into the ldlm_lock in ldlm_callback_handler():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-comment&quot;&gt;/* Copy hints/flags (e.g. LDLM_FL_DISCARD_DATA) from AST. */&lt;/span&gt;
        lock_res_and_lock(lock);
        lock-&amp;gt;l_flags |= ldlm_flags_from_wire(dlm_req-&amp;gt;lock_flags &amp;amp;
                                              LDLM_AST_FLAGS);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When the lock is canceled on the client side, it takes this into account in osc_lock_cancel():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (dlmlock != NULL) {
                &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; do_cancel;
                        
                discard = !!(dlmlock-&amp;gt;l_flags &amp;amp; LDLM_FL_DISCARD_DATA);
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (olck-&amp;gt;ols_state &amp;gt;= OLS_GRANTED)
                        result = osc_lock_flush(olck, discard);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, in osc_cache_writeback_range(), all of the pages are discarded instead of being written back.&lt;/p&gt;
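&lt;p&gt;Putting the three snippets together, the flow can be modeled in a few lines of plain Python (a sketch only, not Lustre code; the flag and function names merely mirror the real ones):&lt;/p&gt;

```python
# Toy model of how the enqueuing client's LDLM_FL_AST_DISCARD_DATA hint
# becomes LDLM_FL_DISCARD_DATA on the blocked lock, and how cancel then
# discards cached pages instead of flushing them.

def add_bl_work_item(lock, new_lock, work_list):
    """Server side: mark the blocked lock and queue the blocking AST."""
    if "AST_SENT" not in lock["flags"]:
        lock["flags"].add("AST_SENT")
        # If the enqueuing client said so, tell the AST recipient to
        # discard dirty data rather than writing it back.
        if "AST_DISCARD_DATA" in new_lock["flags"]:
            lock["flags"].add("DISCARD_DATA")
        work_list.append(lock)

def cancel(lock, pages):
    """Client side: on cancel, drop pages if DISCARD_DATA, else flush."""
    if "DISCARD_DATA" in lock["flags"]:
        pages.clear()            # discard: no writeback
        return "discarded"
    return "flushed"             # would write dirty pages back

held = {"flags": set()}
conflicting = {"flags": {"AST_DISCARD_DATA"}}  # e.g. lock for unlink
work = []
add_bl_work_item(held, conflicting, work)
pages = [b"dirty"]
result = cancel(held, pages)
```

&lt;p&gt;(The model has no RPCs or refcounting; it only shows the hint propagating from the conflicting lock to the held one and changing the cancel path.)&lt;/p&gt;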

&lt;p&gt;Do you find any problems from your experiment?&lt;/p&gt;</comment>
                            <comment id="68779" author="jay" created="Fri, 11 Oct 2013 00:25:15 +0000"  >&lt;p&gt;Hi Niu, Are you working on this? If not, I can start to work on it.&lt;/p&gt;</comment>
                            <comment id="68791" author="niu" created="Fri, 11 Oct 2013 01:50:13 +0000"  >&lt;p&gt;Xiong, yes, I was trying to find out why the layout lock wasn&apos;t canceled, but haven&apos;t found the root cause so far. I&apos;m glad to have you take this over if you have time. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; Thank you.&lt;/p&gt;</comment>
                            <comment id="68800" author="adilger" created="Fri, 11 Oct 2013 07:34:52 +0000"  >&lt;p&gt;Jinshan, I haven&apos;t checked whether LDLM_FL_DISCARD_DATA is &lt;em&gt;actually&lt;/em&gt; working.  Even though it works in theory, it hasn&apos;t been tested in a long time, because sanity.sh test_42b has long been disabled due to race conditions (sometimes page writeback can happen during the test).  It would be nice if we had a more robust test case for this, or maybe just change it to be error_ignore?&lt;/p&gt;</comment>
                            <comment id="68865" author="jay" created="Fri, 11 Oct 2013 21:38:09 +0000"  >&lt;p&gt;Actually, 42b is easy to fix now that osc_extent is implemented, because an osc extent won&apos;t be allowed to flush until a system write ends or it has collected enough pages. We can add a fail_loc to delay the write path and unlink the file in the meantime.&lt;/p&gt;

&lt;p&gt;Niu, can you please work this out while I&apos;m looking at the layout lock issue?&lt;/p&gt;</comment>
                            <comment id="68868" author="jay" created="Fri, 11 Oct 2013 23:12:45 +0000"  >&lt;p&gt;From these two lines:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ldlm.namespaces.mdt-testfs-MDT0000_UUID.resource_count=39654&lt;br/&gt;
ldlm.namespaces.testfs-MDT0000-mdc-ffff8800a66c1c00.resource_count=39654&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The locks are actually not canceled from the MDT side. I will drill down to find the source.&lt;/p&gt;</comment>
                            <comment id="68869" author="jay" created="Fri, 11 Oct 2013 23:47:58 +0000"  >&lt;p&gt;These are open-unlinked files. After a file is unlinked, file IO will cause a layout lock to be created, and those locks will never be destroyed. We can fix this issue by acquiring FULL ibits locks at the last close of unlinked files.&lt;/p&gt;</comment>
                            <comment id="68892" author="niu" created="Mon, 14 Oct 2013 04:04:33 +0000"  >&lt;blockquote&gt;
&lt;p&gt;These are open-unlinked files. After file is unlinked, file IO will cause layout lock to be created. Those locks will never be destroyed. We can fix this issue by acquiring FULL ibits locks at last close for unlinked files.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Open-unlinked files are a problem, but I don&apos;t think there are that many open-unlinked files in the dbench test. The major problem looks like this: the client issues an unlink and the object on the MDT is unlinked, but the objects on the OSTs are kept for a short time (until the OSP syncs the unlink record), so dirty data is still cached on the client and nlink isn&apos;t cleared (the extent lock is cached). If a data flush (ll_writepages()) is triggered before the extent lock is revoked, the client will acquire the layout lock again, and this layout lock will never be canceled.&lt;/p&gt;</comment>
                            <comment id="68917" author="jay" created="Mon, 14 Oct 2013 17:41:27 +0000"  >&lt;p&gt;Here is the patch: &lt;a href=&quot;http://review.whamcloud.com/7942&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7942&lt;/a&gt;; however, the fix is not complete.&lt;/p&gt;
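&lt;p&gt;For reference, the &quot;open-unlinked&quot; behavior discussed above is just standard POSIX semantics; a quick self-contained demonstration (plain Python, nothing Lustre-specific):&lt;/p&gt;

```python
import os
import tempfile

# Demo of POSIX open-unlinked semantics: once a file is unlinked, the
# name is gone immediately, but the inode (and its data) stays alive
# until the last open file descriptor is closed.
fd, path = tempfile.mkstemp()
os.write(fd, b"dirty data")
os.unlink(path)                    # name removed; inode still referenced
name_gone = not os.path.exists(path)
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)            # still readable through the open fd
os.close(fd)                       # inode is finally released here
```

&lt;p&gt;On Lustre the same semantics have to hold cluster-wide, which is presumably why the client must keep the inode (and its locks) alive until the last close.&lt;/p&gt;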

&lt;p&gt;Niu, yes, that is correct. I discovered that problem also and worked out a patch to fix it, but it still has a few problems - if a file is cached on one client which is then deleted from another client, there is no way of taking this file out of cache. We need to start a dedicated kernel thread for this purpose. If you&apos;re still working on this, please take it over to avoid duplicate work.&lt;/p&gt;</comment>
                            <comment id="68966" author="niu" created="Tue, 15 Oct 2013 06:16:29 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Niu, yes, that is correct. I discovered that problem also and worked out a patch to fix it, but it still has a few problems - if a file is cached on one client which is then deleted from another client, there is no way of taking this file out of cache. We need to start a dedicated kernel thread for this purpose. If you&apos;re still working on this, please take it over to avoid duplicate work.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Actually, I found the problem when I was trying to restore sanity test_42b. I&apos;m fine with taking this over, but I need some time to think about how to fix it.&lt;/p&gt;</comment>
                            <comment id="69102" author="niu" created="Wed, 16 Oct 2013 11:06:50 +0000"  >&lt;p&gt;Setting aside the problem of how to clean up the inode cache, the extra layout lock on dirty flush reminds me that the fix for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3160&quot; title=&quot;recovery-random-scale test_fail_client_mds: RIP: cl_object_top+0xe/0x150 [obdclass]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3160&quot;&gt;&lt;del&gt;LU-3160&lt;/del&gt;&lt;/a&gt; isn&apos;t proper. I think we should fix &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3160&quot; title=&quot;recovery-random-scale test_fail_client_mds: RIP: cl_object_top+0xe/0x150 [obdclass]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3160&quot;&gt;&lt;del&gt;LU-3160&lt;/del&gt;&lt;/a&gt; in another way, instead of acquiring the layout lock on dirty flush: &lt;a href=&quot;http://review.whamcloud.com/7957&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7957&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="69144" author="jay" created="Wed, 16 Oct 2013 17:22:25 +0000"  >&lt;p&gt;Hi Niu, with my patch at 7942, which does not grant the layout lock if the file no longer exists on the MDT, the layout lock problem no longer occurs. Let me know if I missed something, thanks.&lt;/p&gt;</comment>
                            <comment id="69190" author="niu" created="Thu, 17 Oct 2013 03:10:25 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Hi Niu, with my patch at 7942 by not granting layout lock if the file is not existing at the MDT, it doesn&apos;t have layout lock problem any more. Let me know if I missed something, thanks.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I think that&apos;s another problem that should be fixed (it&apos;s actually a regression; we did check whether the object exists back when we used the getattr function to handle the layout intent). The reasons I want to revert the fix for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3160&quot; title=&quot;recovery-random-scale test_fail_client_mds: RIP: cl_object_top+0xe/0x150 [obdclass]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3160&quot;&gt;&lt;del&gt;LU-3160&lt;/del&gt;&lt;/a&gt; are:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;For dirty flush, the client should not try to acquire the layout lock at all, rather than acquiring it and then failing with -ENOENT.&lt;/li&gt;
	&lt;li&gt;The fix for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3160&quot; title=&quot;recovery-random-scale test_fail_client_mds: RIP: cl_object_top+0xe/0x150 [obdclass]&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3160&quot;&gt;&lt;del&gt;LU-3160&lt;/del&gt;&lt;/a&gt; added extra code on the client (ll_umounting and -ENOENT checking) that can be removed in the new fix.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="77934" author="adilger" created="Wed, 26 Feb 2014 18:51:08 +0000"  >&lt;p&gt;The patch &lt;a href=&quot;http://review.whamcloud.com/9223&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9223&lt;/a&gt; for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4357&quot; title=&quot;page allocation failure. mode:0x40 caused by missing __GFP_WAIT flag&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4357&quot;&gt;&lt;del&gt;LU-4357&lt;/del&gt;&lt;/a&gt; to add __GFP_WAIT might also help this situation, because the client and server will begin generating memory pressure during allocation, and hopefully flush some of the stale objects from the slabs.  I&apos;m not sure that in itself is enough.&lt;/p&gt;

&lt;p&gt;I think this needs to be retested once 9223 lands to see what the current state of affairs is.&lt;/p&gt;</comment>
                            <comment id="80391" author="spitzcor" created="Thu, 27 Mar 2014 17:31:30 +0000"  >&lt;p&gt;Any update since change #9223 landed?&lt;/p&gt;</comment>
                            <comment id="90133" author="niu" created="Mon, 28 Jul 2014 04:03:24 +0000"  >&lt;p&gt;I split Jinshan&apos;s patch: &lt;a href=&quot;http://review.whamcloud.com/11243&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/11243&lt;/a&gt;. This one should solve the problem described in this ticket (the layout lock was fetched by the client after the file was unlinked, which meant the inode cache for the unlinked file couldn&apos;t be purged).&lt;/p&gt;</comment>
                            <comment id="113624" author="parinay" created="Tue, 28 Apr 2015 15:23:22 +0000"  >&lt;p&gt;Hello Niu, Jinshan,&lt;/p&gt;

&lt;p&gt;I ported the patch - &lt;a href=&quot;http://review.whamcloud.com/#/c/7942/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/7942/&lt;/a&gt; to 2.5.1. I thought it could help fix the problem reported in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2857&quot; title=&quot;2.1.4&amp;lt;-&amp;gt;2.4.0 interop: sanity test_76: FAIL: inode slab grew from 11183 to 12183&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2857&quot;&gt;&lt;del&gt;LU-2857&lt;/del&gt;&lt;/a&gt;. And it did fix the issue reported.&lt;/p&gt;

&lt;p&gt;I read the comments here and the review comments on the patch, and they seem to suggest the patch is incomplete, especially the following:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I discovered that problem also and worked out a patch to fix it, but it still has a few problems - if a file is cached on one client which is then deleted from another client, there is no way of taking this file out of cache. We need to start a dedicated kernel thread for this purpose.&lt;/p&gt;&lt;/blockquote&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Is it possible to split the patch to isolate the issue reported in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2857&quot; title=&quot;2.1.4&amp;lt;-&amp;gt;2.4.0 interop: sanity test_76: FAIL: inode slab grew from 11183 to 12183&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2857&quot;&gt;&lt;del&gt;LU-2857&lt;/del&gt;&lt;/a&gt;? I am not sure if it would be the right approach.&lt;/li&gt;
	&lt;li&gt;Is there any kernel thread started as you mentioned?&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Any help/guidance?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="20377">LU-3771</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="23600">LU-4754</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="22373">LU-4357</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="16922">LU-2487</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="21211">LU-4033</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="23545">LU-4740</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21094">LU-3997</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="22622">LU-4429</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="23600">LU-4754</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21107">LU-4002</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw4ov:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10870</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>