<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:32:47 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3308] large readdir chunk size slows unlink/&quot;rm -r&quot; performance</title>
                <link>https://jira.whamcloud.com/browse/LU-3308</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Shared directory unlinks using metabench seem significantly slower with Lustre 2.x clients versus Lustre 1.8.x&lt;/p&gt;

&lt;p&gt;512 processes, 8 ppn, 2.1 server&lt;br/&gt;
1.8.6 clients: 18k unlinks/s&lt;br/&gt;
2.3 clients: 7k unlinks/s&lt;/p&gt;

&lt;p&gt;creates are comparable at 28k creates/s&lt;/p&gt;

&lt;p&gt;Mdtest shows no such regression. Digging into metabench a little, it seems that when deleting files, metabench processes (I am told) readdir, select &quot;the next&quot; file, and delete it, effectively racing each other, whereas mdtest deterministically chooses the files to delete based on process id.&lt;/p&gt;

&lt;p&gt;This seems to imply it&apos;s directory locking contention on the MDT.&lt;/p&gt;

&lt;p&gt;On the clients, the majority of the time difference is spent in ptlrpc_queue_wait (not sure of units):&lt;br/&gt;
1.8.8: 347&lt;br/&gt;
2.3: 923&lt;/p&gt;

&lt;p&gt;On the MDT, the big difference is mdt_object_find_lock:&lt;br/&gt;
1.8.8: 51us&lt;br/&gt;
2.3: 1110us&lt;/p&gt;

&lt;p&gt;Also, using ldlm stats, it seems the 2.3 clients cause twice as many ldlm_bl_callbacks as 1.8.8 clients.&lt;/p&gt;

&lt;p&gt;So apparently the 2.3 client is holding directory locks differently than the 1.8 clients.  We&apos;re still looking into this, but if anyone has thoughts we&apos;d love to hear them. &lt;/p&gt;</description>
                <environment></environment>
        <key id="18787">LU-3308</key>
            <summary>large readdir chunk size slows unlink/&quot;rm -r&quot; performance</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="4" iconUrl="https://jira.whamcloud.com/images/icons/statuses/reopened.png" description="This issue was once resolved, but the resolution was deemed incorrect. From here issues are either marked assigned or resolved.">Reopened</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="nrutman">Nathan Rutman</reporter>
                        <labels>
                            <label>lug23dd</label>
                            <label>medium</label>
                            <label>performance</label>
                    </labels>
                <created>Fri, 10 May 2013 00:03:39 +0000</created>
                <updated>Thu, 1 Feb 2024 00:31:24 +0000</updated>
                                            <version>Lustre 2.1.6</version>
                    <version>Lustre 2.4.1</version>
                    <version>Lustre 2.5.0</version>
                    <version>Lustre 2.12.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>19</watches>
                                                                            <comments>
                            <comment id="58100" author="nrutman" created="Fri, 10 May 2013 00:04:57 +0000"  >&lt;p&gt;Xyratex-bug-id: &lt;a href=&quot;http://jira-nss.xy01.xyratex.com:8080/browse/LELUS-138&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;LELUS-138&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="58106" author="keith" created="Fri, 10 May 2013 01:30:19 +0000"  >&lt;p&gt;Is this the metabench you are referring to? &lt;a href=&quot;http://www.7byte.com/?page=metabench&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://www.7byte.com/?page=metabench&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Have you had a chance to spin master lately?&lt;/p&gt;</comment>
                            <comment id="58107" author="spitzcor" created="Fri, 10 May 2013 02:41:20 +0000"  >&lt;p&gt;Keith, the Metabench code we&apos;re using is attached to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1167&quot; title=&quot;Poor mdtest unlink performance with multiple processes per node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1167&quot;&gt;LU-1167&lt;/a&gt;.  It is not the version from 7byte.com.  Yes, master (well 2.3.63) isn&apos;t any better at all.  Maybe even a little worse &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="58121" author="adilger" created="Fri, 10 May 2013 08:14:58 +0000"  >&lt;p&gt;It would be useful to get the performance of 2.1 clients as well, to see how it stacks up.&lt;/p&gt;

&lt;p&gt;The client ptlrpc_queue_wait() is just a side-effect of the client waiting for the server to complete the requests.  It might be useful to gather stats (strace?) to see which operations it is waiting on - unlink() or readdir()&lt;/p&gt;

&lt;p&gt;One area to check is that the 2.3+ clients will do readdir in large chunks compared to earlier versions.  If metabench clients are continually doing readdir and unlinking files, then this might cause a larger amount of data to be sent to all of the clients.  An application might do the readdir() once at the beginning, then delete entries &lt;tt&gt;(N % num_clients + client_no)&lt;/tt&gt; so that it only has to do readdir once.  Looking at the metabench code, it depends on how large the cache in glibc is, since it is calling &lt;tt&gt;readdir()&lt;/tt&gt; for every entry.  It also appears to be doing &lt;tt&gt;lstat(filename)&lt;/tt&gt; on every inode before deleting it, definitely a case of lock ping-pong.&lt;/p&gt;
</comment>
                            <comment id="58158" author="green" created="Fri, 10 May 2013 17:21:29 +0000"  >&lt;p&gt;I wonder if ELC broke and that leads to a bigger number of blocking callbacks in 2.3?&lt;br/&gt;
Do you guys have some traces or what sort of analysis did you perform already?&lt;/p&gt;</comment>
                            <comment id="58245" author="spitzcor" created="Sun, 12 May 2013 04:53:18 +0000"  >&lt;p&gt;Yes we do have traces and analysis to share, which will be coming shortly.&lt;/p&gt;

&lt;p&gt;I don&apos;t have 2.1 client performance handy, but I can tell you that 2.2 gives similar results.  In fact, 2.2, 2.3, and master/2.4 (2.3.63) are all in the same neighborhood.  Interestingly enough, removing the lstat() completely didn&apos;t change the rates at all.  I don&apos;t know if that necessarily throws out any theories.&lt;/p&gt;

&lt;p&gt;Also, Nathan noted that the single-dir unlink scenario in mdtest shows no such regression.  But it is also worth noting that the same Metabench behavior of readdir()+lstat()+unlink() doesn&apos;t show any regression for the multi-dir case.  Would that kind of information rule out the theory on any ELC breakage?&lt;/p&gt;
</comment>
                            <comment id="58298" author="spitzcor" created="Mon, 13 May 2013 18:08:45 +0000"  >&lt;p&gt;We have verified that removing the readdir() + lstat() from Metabench creates mdtest-like results.  Using a 2.2 server (with pdirops) did not correct the regression.  &lt;span class=&quot;error&quot;&gt;&amp;#91;It may have boosted the performance, but we didn&amp;#39;t gather a baseline on the test HW/SW config&amp;#93;&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;Performance of the readdir&amp;amp;lstat-less Metabench version on 2.3 was on par with even the Metabench multidir case (server @ 2.2).  Removing the readdir()+lstat() from Metabench had little effect on 1.8.x client runs.&lt;/p&gt;

&lt;p&gt;Previous tests with only lstat removed had little to no effect on 2.x clients, but we plan to quickly introduce it to the readdir-less version to further validate that the regression is caused by the interference of readdir with unlink.&lt;/p&gt;</comment>
                            <comment id="58299" author="bzzz" created="Mon, 13 May 2013 18:19:18 +0000"  >&lt;p&gt;can statahead be involved somehow? I&apos;d suggest to look at rpc stats on all the sides.&lt;/p&gt;</comment>
                            <comment id="58393" author="green" created="Mon, 13 May 2013 20:03:14 +0000"  >&lt;p&gt;Cory:&lt;br/&gt;
Can you please disable stat-ahead and then run two metabench tests: one like the original with everything in place and one with just lstat() removed?&lt;br/&gt;
I suspect that in this configuration removing lstat will be helpful and further highlight ELC breakage of some sort.&lt;/p&gt;</comment>
                            <comment id="58547" author="aboyko" created="Wed, 15 May 2013 07:25:20 +0000"  >&lt;p&gt;Xyratex has tested a patch with a tunable readdir number of pages for data transfer.  Here are the results of running 8 clients, 8 ppn, and 64k files total:&lt;/p&gt;

&lt;div class=&apos;table-wrap&apos;&gt;
&lt;table class=&apos;confluenceTable&apos;&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;&amp;nbsp;&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;create&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;unlink&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;2.3 client 1-page readdir chunk size&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;14380&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;8938&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;2.3 client 2-page readdir chunk size&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;14339&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;8419&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;2.3 client 32-page readdir chunk size&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;14296&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;6176&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;2.3 client 256-page readdir chunk size (default)&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;14247&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt; 3739&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th class=&apos;confluenceTh&apos;&gt;1.8.8 client&lt;/th&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;15026&lt;/td&gt;
&lt;td class=&apos;confluenceTd&apos;&gt;9819&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;A big readdir chunk size in the 2.x client does cause lock ping-pong and impacts the overall unlink performance for a shared directory.&lt;/p&gt;</comment>
                            <comment id="58554" author="adilger" created="Wed, 15 May 2013 10:34:21 +0000"  >&lt;p&gt;Alexander, thanks for the data.&lt;/p&gt;

&lt;p&gt;In this case it makes sense for the client or MDS to auto-tune the readdir chunk size based on usage.  If there are readahead directory pages that are being dropped from cache without being accessed, then it makes sense to reduce the readdir chunk size automatically.  Unfortunately, I don&apos;t think there is any way for the kernel to know that &quot;rm -r&quot; or metabench is the one calling readdir() the first time, so it would still make sense to read a full 1MB of readdir data on first access, and only drop the readdir() chunk size on later access.&lt;/p&gt;</comment>
                            <comment id="58591" author="green" created="Wed, 15 May 2013 17:40:52 +0000"  >&lt;p&gt;Ok, so I think the correct approach here is not to reduce the transfer size, but instead we need to use the other idea we dismissed in the past:&lt;/p&gt;

&lt;p&gt;POSIX allows us to retain cached pages for the readdir purposes as long as the filehandle remains open, so if we do that now, we&apos;ll still have all the benefits of fast rm (even faster due to no need to refresh the pages all the time) and at the same time avoid needless page pingpong for readdir.&lt;/p&gt;

&lt;p&gt;In the past this was deemed unimportant since we did not do statahead so extra pingpong was pretty minimal, but now it becomes more important I think.&lt;/p&gt;</comment>
                            <comment id="58600" author="nrutman" created="Wed, 15 May 2013 18:18:59 +0000"  >&lt;p&gt;@Oleg - but don&apos;t the cached readdir pages have to be dropped if other clients are simultaneously deleting files?  Are you saying POSIX allows for operation on stale directory entries?&lt;/p&gt;</comment>
                            <comment id="58611" author="green" created="Wed, 15 May 2013 19:14:37 +0000"  >&lt;p&gt;What I mean is POSIX allows for pages to remain cached for the application that has the filehandle open, for as long as the filehandle remains open.&lt;br/&gt;
All other users should get their own pages that are fresh at the time of getting.&lt;/p&gt;

&lt;p&gt;So basically now when locks get cancelled, we are just dropping pages from cache completely.&lt;br/&gt;
What we&apos;ll need to do instead is to remove the reference to them from the generic mapping (truncate), but not drop them completely; instead, reference them from all the opened filehandles. Then when doing readdir we&apos;ll check if we have such pages referenced from the filehandle before doing a lookup in the main cache, and if both are a no, then do an RPC.&lt;br/&gt;
Pages would get dereferenced on close, so when the last file opener goes away, they&apos;d be freed (obviously we&apos;ll need some other form of truncation in case of tight memory too).&lt;/p&gt;</comment>
                            <comment id="58617" author="adilger" created="Wed, 15 May 2013 20:51:07 +0000"  >&lt;p&gt;The relevant sections in the readdir(2) definition in SUSv2 &lt;a href=&quot;http://pubs.opengroup.org/onlinepubs/007908799/xsh/readdir.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://pubs.opengroup.org/onlinepubs/007908799/xsh/readdir.html&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;... Directory entries represent files; files may be removed from a directory or added to a directory asynchronously to the operation of readdir().&lt;/p&gt;

&lt;p&gt;If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.&lt;/p&gt;

&lt;p&gt;The readdir() function may buffer several directory entries per actual read operation;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;So, the readdir() processing does NOT need to be kept coherent w.r.t. entries being added or removed to the directory until rewinddir() is called on that stream.&lt;/p&gt;</comment>
                            <comment id="124640" author="adilger" created="Wed, 19 Aug 2015 19:14:26 +0000"  >&lt;p&gt;If this is still an area of interest for someone to fix, to start with the &lt;tt&gt;mdc_obd_max_pages_per_rpc&lt;/tt&gt; lproc entry needs to be fixed to allow it to be written (shrunk at least) so that &lt;tt&gt;lctl set_param mdc.*.max_pages_per_rpc=N&lt;/tt&gt; is allowed.  See comment in &lt;tt&gt;mdc/lproc_mdc.c&lt;/tt&gt;:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        /*                           
         * FIXME: below proc entry is provided, but not in used, instead
         * sbi-&amp;gt;sb_md_brw_size is used, the per obd variable should be used
         * when CMD is enabled, and dir pages are managed in MDC layer.
         * Remember to enable proc write function.
         */
        { .name =       &lt;span class=&quot;code-quote&quot;&gt;&quot;max_pages_per_rpc&quot;&lt;/span&gt;,
          .fops =       &amp;amp;mdc_obd_max_pages_per_rpc_fops },
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;At that point it would be easy to do a few benchmarks on a decent sized filesystem (a few million files with a variety of directory sizes, including some larger dirs @ 10K+ entries) to see what effect the readdir BRW size has on real-world readdir performance.  Clearly there will be some advantage for readdir of larger directories, but if this is outweighed by the performance loss for unlinks in a shared directory then it might be possible to have a reduced readdir size by default (e.g. 32KB or 64KB) that still gives decent readdir performance but reduces overhead for unlinks.&lt;/p&gt;</comment>
                            <comment id="188928" author="gerrit" created="Mon, 20 Mar 2017 07:50:49 +0000"  >&lt;p&gt;Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/26088&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26088&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3308&quot; title=&quot;large readdir chunk size slows unlink/&amp;quot;rm -r&amp;quot; performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3308&quot;&gt;LU-3308&lt;/a&gt; mdc: allow setting readdir RPC size parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: dd8181ad51af55f7317e7087adb1cad33338158f&lt;/p&gt;</comment>
                            <comment id="189789" author="gerrit" created="Mon, 27 Mar 2017 20:15:45 +0000"  >&lt;p&gt;Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/26212&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26212&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3308&quot; title=&quot;large readdir chunk size slows unlink/&amp;quot;rm -r&amp;quot; performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3308&quot;&gt;LU-3308&lt;/a&gt; tests: fix sanity/sanityn test_mkdir() usage&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 6c5270804141c1c4b48be8d357a4686f42c70e18&lt;/p&gt;</comment>
                            <comment id="194107" author="gerrit" created="Tue, 2 May 2017 01:57:49 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/26088/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26088/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3308&quot; title=&quot;large readdir chunk size slows unlink/&amp;quot;rm -r&amp;quot; performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3308&quot;&gt;LU-3308&lt;/a&gt; mdc: allow setting readdir RPC size parameter&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 664bad91b5c0e2a5c9423d4b524d9906db9ef9b5&lt;/p&gt;</comment>
                            <comment id="208998" author="gerrit" created="Thu, 21 Sep 2017 06:13:03 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/26212/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26212/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3308&quot; title=&quot;large readdir chunk size slows unlink/&amp;quot;rm -r&amp;quot; performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3308&quot;&gt;LU-3308&lt;/a&gt; tests: fix sanity/sanityn test_mkdir() usage&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: c75aa6c74cd86c69de893395735a6725571f11f4&lt;/p&gt;</comment>
                            <comment id="209029" author="pjones" created="Thu, 21 Sep 2017 12:16:55 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                            <comment id="209090" author="adilger" created="Thu, 21 Sep 2017 16:52:18 +0000"  >&lt;p&gt;This is not actually fixed yet, just minor cleanup patches landed. &lt;/p&gt;</comment>
                            <comment id="209168" author="bzzz" created="Fri, 22 Sep 2017 01:35:20 +0000"  >&lt;p&gt;with c75aa6c74cd86c69de893395735a6725571f11f4 landed  I&apos;m getting the following in the local testing:&lt;/p&gt;

&lt;p&gt;== sanity test 102d: tar restore stripe info from tarfile,not keep osts == 08:27:52 (1506058072)&lt;br/&gt;
mkdir: cannot create directory `/mnt/lustre/d102d.sanity&apos;: File exists&lt;br/&gt;
 sanity test_102d: @@@@@@ FAIL: mkdir &apos;/mnt/lustre/d102d.sanity&apos; failed &lt;/p&gt;

&lt;p&gt;setup_test102() creates d102d.sanity, then test_102d() tries to create it too&lt;/p&gt;</comment>
                            <comment id="226431" author="adilger" created="Thu, 19 Apr 2018 22:08:24 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=nrutman&quot; class=&quot;user-hover&quot; rel=&quot;nrutman&quot;&gt;nrutman&lt;/a&gt;, &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=aboyko&quot; class=&quot;user-hover&quot; rel=&quot;aboyko&quot;&gt;aboyko&lt;/a&gt;, now that patch &lt;a href=&quot;https://review.whamcloud.com/26088/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26088/&lt;/a&gt; &quot;&lt;tt&gt;mdc: allow setting readdir RPC size parameter&lt;/tt&gt;&quot; is landed, could someone at Cray run some testing to see what a better readdir RPC size is for the original workload?  From &lt;a href=&quot;#comment-58547&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;comment-58547&lt;/a&gt; it looks to be somewhere between 2 and 32 pages by default.  We could start by reducing the default RPC size to something that gives good performance out of the box for most workloads.&lt;/p&gt;

&lt;p&gt;Separately, getting a patch to continue caching &lt;tt&gt;readdir()&lt;/tt&gt; data on the client fd after lock cancellation (per &lt;a href=&quot;#comment-58611&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;comment-58611&lt;/a&gt; and a few replies) until &lt;tt&gt;close()&lt;/tt&gt; or &lt;tt&gt;rewinddir()&lt;/tt&gt; (I believe this ends up as &lt;tt&gt;seek(0)&lt;/tt&gt; in the kernel) would allow even better client performance, since it would not need to re-fetch the directory for every deleted file.&lt;/p&gt;</comment>
                            <comment id="227949" author="spitzcor" created="Wed, 16 May 2018 05:06:46 +0000"  >&lt;p&gt;I will ask&#160;Eugene if he can take a look.&lt;/p&gt;</comment>
                            <comment id="248560" author="nrutman" created="Thu, 6 Jun 2019 16:39:27 +0000"  >&lt;p&gt;See also &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3240&quot; title=&quot;The link count is not updated after the mkdir&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3240&quot;&gt;&lt;del&gt;LU-3240&lt;/del&gt;&lt;/a&gt;, esp. patch&#160;&lt;a href=&quot;https://review.whamcloud.com/#/c/7909/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/7909/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="255250" author="adilger" created="Mon, 23 Sep 2019 10:35:40 +0000"  >&lt;p&gt;Zam, this is the ticket discussed today that describes possible optimizations for single-threaded &quot;&lt;tt&gt;rm -r&lt;/tt&gt;&quot; and other workloads that are doing mixed &lt;tt&gt;readdir() + unlink()&lt;/tt&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="25224">LU-5232</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="24209">LU-4906</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21382">LU-4096</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="49233">LU-10225</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="14521">LU-1431</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="25224">LU-5232</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="68542">LU-15535</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="13418">LU-1167</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="45890">LU-9458</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="52118">LU-10999</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="52119">LU-11000</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="80489">LU-17493</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="40068">LU-8641</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvqlz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8195</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>