<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:16:55 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8365] Fix mballoc stream allocator to better use free space at start of drive</title>
                <link>https://jira.whamcloud.com/browse/LU-8365</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Provide a mechanism to reset the ldiskfs extents allocation position to near the beginning of a drive&lt;/p&gt;</description>
                <environment></environment>
        <key id="37967">LU-8365</key>
            <summary>Fix mballoc stream allocator to better use free space at start of drive</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="ys">Yang Sheng</assignee>
                                    <reporter username="lokesh.jaliminche">Lokesh Nagappa Jaliminche</reporter>
                        <labels>
                            <label>ldiskfs</label>
                    </labels>
                <created>Mon, 4 Jul 2016 14:34:51 +0000</created>
                <updated>Tue, 10 Aug 2021 19:20:49 +0000</updated>
                                            <version>Lustre 2.8.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>14</watches>
                                                                            <comments>
                            <comment id="157604" author="gerrit" created="Mon, 4 Jul 2016 14:39:07 +0000"  >&lt;p&gt;lokesh.jaliminche (lokesh.jaliminche@seagate.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/21142&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/21142&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: procfs entries for mballoc&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 7218c37a694df7b0f057a6078dad24a9166300e7&lt;/p&gt;</comment>
                            <comment id="166297" author="adilger" created="Sat, 17 Sep 2016 09:29:54 +0000"  >&lt;p&gt;This patch exposes that mballoc is not doing as good a job in group selection for empty HDDs as it might. Biasing allocations to the start of the disk can improve performance, but only if the start of the disk has free space.&lt;/p&gt;

&lt;p&gt;Some possibilities to try that may actually fix mballoc, in order of increasing difficulty:&lt;br/&gt;
 1) when freeing blocks (in the group descriptor update) if the group changes to less than X allocated blocks, and the group is below &lt;tt&gt;mb_last_group&lt;/tt&gt; then reset &lt;tt&gt;mb_last_group&lt;/tt&gt; to that group. In cases where a filesystem is being filled and emptied (e.g. benchmarks) this would automatically produce optimal results, without the need for this tunable at all. It would also work for all users. The threshold X should be larger than the count of bitmaps and inode table in a normal group. &lt;br/&gt;
 2) As an added heuristic to the above, only go back if there are M consecutive groups that meet this criterion, so that we don&apos;t keep seeking back to one group that has only 2k allocated blocks and then scanning to the end of the used groups again. The number of groups (M) could be a tunable. &lt;br/&gt;
 3) As an added heuristic to the above, save the old &lt;tt&gt;mb_last_group&lt;/tt&gt; and return there if the current and next few groups are &quot;full&quot; (if &lt;tt&gt;(mb_last_group &amp;gt; current group)&lt;/tt&gt;), otherwise scan forward as usual. That avoids scanning a lot of potentially useless groups that have recently been scanned in order to get to where &lt;tt&gt;mb_last_group&lt;/tt&gt; was previously. &lt;br/&gt;
 4) as above, but using free chunk size instead of just free block count (e.g. use maximum free chunk size in the buddy bitmap). This could add in a few level(s) of free chunks, maybe above some threshold like 8MB with a simple shift+add &lt;tt&gt;hweight()&lt;/tt&gt; loop while scanning the buddy bitmap.&lt;br/&gt;
 5) instead of using &lt;tt&gt;mb_last_group&lt;/tt&gt; at all, keep a &quot;sorted&quot; list or a tree of &lt;tt&gt;struct ext4_group_info&lt;/tt&gt; that tracks free space in groups. The sorting comparison should prefer groups at the start of the disk in some way (e.g. &lt;tt&gt;free_blocks - group_number/128&lt;/tt&gt;). Populate this list/tree during the mount-time group descriptor scan and keep it &quot;sorted&quot; as blocks are allocated and freed. Sorting can be lazy to avoid lots of rebalancing. Use the list/tree to find a good new target group if the current and next group are &quot;full&quot;. This will also improve performance when the filesystem becomes nearly full, to avoid lots of scanning for groups with free blocks. &lt;br/&gt;
 6) as above, but use separate list/trees based on fullness. No need to keep groups in sorted lists/trees if they are totally empty. No need to manage groups that are nearly full until enough blocks are freed to make them interesting allocation targets. It may be that this is more complex than a single list/tree, but it depends on how much the ongoing tree balancing costs. If the cost is high, and groups usually change from &quot;mostly empty&quot; to &quot;mostly full&quot;, then having a &quot;full&quot; list to keep groups until they become &quot;nearly empty&quot; again would be useful.&lt;/p&gt;</comment>
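<!--
Editor's sketch of heuristics 1) and 2) from the comment above, as a small
userspace model. It is not ldiskfs code: struct group_model, struct
alloc_model, and model_free_blocks() are hypothetical names invented for
illustration, last_group stands in for mb_last_group, and the reading that
the M consecutive groups start at the group being freed is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one block group; not struct ext4_group_info. */
struct group_model {
    unsigned int allocated;     /* blocks currently allocated in the group */
};

/* Hypothetical allocator state; last_group stands in for mb_last_group. */
struct alloc_model {
    struct group_model *groups;
    size_t ngroups;
    size_t last_group;
    unsigned int threshold;     /* "X": below this a group counts as nearly empty */
    size_t run;                 /* "M": consecutive nearly empty groups required */
};

static int nearly_empty(const struct alloc_model *m, size_t g)
{
    return m->groups[g].allocated < m->threshold;
}

/*
 * Free `count` blocks from group `g`, then apply heuristics 1 and 2.
 * Returns 1 if last_group was moved back to `g`, 0 otherwise.
 */
int model_free_blocks(struct alloc_model *m, size_t g, unsigned int count)
{
    size_t i;

    assert(g < m->ngroups);
    assert(count <= m->groups[g].allocated);
    m->groups[g].allocated -= count;

    /* Heuristic 1: only groups below last_group that became nearly empty. */
    if (g >= m->last_group || !nearly_empty(m, g))
        return 0;

    /* Heuristic 2 (assumed reading): groups g .. g+run-1 must all qualify. */
    for (i = 0; i < m->run; i++) {
        if (g + i >= m->ngroups || !nearly_empty(m, g + i))
            return 0;
    }

    m->last_group = g;
    return 1;
}
```

With run set to 1 the model reduces to heuristic 1 alone. Heuristic 3 would
additionally save the old last_group before the reset so the scan can return
there if the groups at the new position turn out to be full.
-->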
                            <comment id="166705" author="lokesh.jaliminche" created="Wed, 21 Sep 2016 13:45:39 +0000"  >&lt;p&gt;Thanks for the details, working on it.&lt;/p&gt;</comment>
                            <comment id="233563" author="adilger" created="Sat, 15 Sep 2018 00:18:31 +0000"  >&lt;p&gt;Hi Yang Sheng,&lt;br/&gt;
would you be able to work on a patch for this issue?  I think the first 3-4 steps in the proposed solution shouldn&apos;t be too hard to implement.  It looks like &lt;tt&gt;mb_set_largest_free_order()&lt;/tt&gt; or one of its callers might be the right place to check whether &lt;tt&gt;s_mb_last_group&lt;/tt&gt; should be updated.  We already track &lt;tt&gt;bb_free&lt;/tt&gt; and &lt;tt&gt;bb_largest_free_order&lt;/tt&gt; for each group, so they can be used to decide if we should reset &lt;tt&gt;s_mb_last_group&lt;/tt&gt; or not.  It would be good to save the old &lt;tt&gt;s_mb_last_group&lt;/tt&gt; to return to if scanning in &lt;tt&gt;ldiskfs_mb_regular_allocator()&lt;/tt&gt; doesn&apos;t find any good groups quickly.&lt;/p&gt;

&lt;p&gt;Please feel free to ask if you have questions.  I&apos;d like to have something to look at late next week, if possible.  We need to run some benchmarks on real hardware to ensure this is doing the right thing.  It would be OK to include the patch &lt;a href=&quot;https://review.whamcloud.com/21142&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/21142&lt;/a&gt; for testing/debugging, but I don&apos;t consider that a real fix for this issue.&lt;/p&gt;



&lt;p&gt;In a semi-related area, I also noticed in the current &lt;tt&gt;ext4-prealloc.patch&lt;/tt&gt; while looking at this issue that there is a bug in the code:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        &lt;span class=&quot;code-comment&quot;&gt;/* don&apos;t use group allocation &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; large files */&lt;/span&gt;
        size = max(size, isize);
+       &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; ((ac-&amp;gt;ac_o_ex.fe_len &amp;gt;= sbi-&amp;gt;s_mb_small_req) ||
+           (size &amp;gt;= sbi-&amp;gt;s_mb_large_req)) {
                ac-&amp;gt;ac_flags |= EXT4_MB_STREAM_ALLOC;
                &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt;;
        }
 
+       /*
+        * request is so large that we don&apos;t care about
+        * streaming - it overweights any possible seek
+        */
+       &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (ac-&amp;gt;ac_o_ex.fe_len &amp;gt;= sbi-&amp;gt;s_mb_large_req)
+               &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt;;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;It looks like we can never get to the second condition because, as long as &lt;tt&gt;s_mb_small_req&lt;/tt&gt; is no larger than &lt;tt&gt;s_mb_large_req&lt;/tt&gt;, &lt;tt&gt;fe_len &amp;gt;= s_mb_small_req&lt;/tt&gt; will always be true first.  This has been true all the way back to the original version of this patch (commit &lt;tt&gt;&lt;a href=&quot;https://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=d8d8fd9192a54c7b8caef8cca9b7a1eb5e5e3298&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;d8d8fd9192a5&lt;/a&gt;&lt;/tt&gt;).  It seems like the &lt;tt&gt;s_mb_large_req&lt;/tt&gt; check should be moved before &lt;tt&gt;EXT4_MB_STREAM_ALLOC&lt;/tt&gt; is set, so that it allows large allocations to behave differently?&lt;/p&gt;</comment>
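<!--
Editor's sketch of the ordering problem described in the comment above, as a
standalone model rather than the kernel code. The enum labels and both
function names are hypothetical; the parameters mirror fe_len, size,
s_mb_small_req and s_mb_large_req from the quoted patch, and the model
assumes s_mb_small_req is no larger than s_mb_large_req. What the real code
does after the second check is not shown in the quote, so the last label
only marks "neither branch taken".

```c
#include <assert.h>

/* Hypothetical labels for the possible outcomes; these are not kernel flags. */
enum req_class {
    REQ_FALLTHROUGH,    /* neither check matched */
    REQ_STREAM,         /* EXT4_MB_STREAM_ALLOC would be set */
    REQ_LARGE           /* "so large that we don't care about streaming" */
};

/* Checks in the order quoted from ext4-prealloc.patch. */
enum req_class classify_as_quoted(unsigned long fe_len, unsigned long size,
                                  unsigned long small_req,
                                  unsigned long large_req)
{
    assert(small_req <= large_req);
    if (fe_len >= small_req || size >= large_req)
        return REQ_STREAM;
    /* Unreachable as true: here fe_len < small_req <= large_req. */
    if (fe_len >= large_req)
        return REQ_LARGE;
    return REQ_FALLTHROUGH;
}

/* Same checks with the large-request test moved first, as suggested. */
enum req_class classify_reordered(unsigned long fe_len, unsigned long size,
                                  unsigned long small_req,
                                  unsigned long large_req)
{
    assert(small_req <= large_req);
    if (fe_len >= large_req)
        return REQ_LARGE;
    if (fe_len >= small_req || size >= large_req)
        return REQ_STREAM;
    return REQ_FALLTHROUGH;
}
```

For a request of 128 blocks with small_req 8 and large_req 64, the quoted
order classifies it as a stream allocation, while the reordered version
reaches the large-request branch.
-->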
                            <comment id="233665" author="ys" created="Tue, 18 Sep 2018 05:23:23 +0000"  >&lt;p&gt;Hi, Alex,&lt;/p&gt;

&lt;p&gt;It looks like &apos;stream allocation&apos; has changed since upstream patch (4ba74d00a2025). Could you please review whether it is still correct for the original purpose? Another question is why we need s_mb_small_req. As I understand it, &apos;stream allocation&apos; is used when the request size is less than s_mb_large_req, so what is the purpose of s_mb_small_req? Could you please explain?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
YangSheng&lt;/p&gt;</comment>
                            <comment id="233685" author="gerrit" created="Tue, 18 Sep 2018 14:48:55 +0000"  >&lt;p&gt;Yang Sheng (ys@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/33195&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33195&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: try to alloc block toward lower sectors&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 7e6352641724a26685247a21be488e43401437eb&lt;/p&gt;</comment>
                            <comment id="234067" author="ys" created="Thu, 27 Sep 2018 14:07:39 +0000"  >&lt;p&gt;Hi, Alex, &lt;/p&gt;

&lt;p&gt;Could you please give some advice on this patch?   &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/31104/31104_0001-ext4-Fix-bugs-in-mballoc-s-stream-allocation-mode.patch&quot; title=&quot;0001-ext4-Fix-bugs-in-mballoc-s-stream-allocation-mode.patch attached to LU-8365&quot;&gt;0001-ext4-Fix-bugs-in-mballoc-s-stream-allocation-mode.patch&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt; &lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
YangSheng&lt;/p&gt;</comment>
                            <comment id="234093" author="adilger" created="Fri, 28 Sep 2018 08:55:14 +0000"  >&lt;p&gt;I&apos;m not sure why you attached the patch here?  That is what gerrit is for. &lt;/p&gt;</comment>
                            <comment id="234094" author="ys" created="Fri, 28 Sep 2018 09:39:49 +0000"  >&lt;p&gt;Hi, Andreas,&lt;/p&gt;

&lt;p&gt;This patch has already landed upstream. I just want to get some input from Alex on whether it is correct for stream allocation, since it changes the stream allocation logic.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
YangSheng&lt;/p&gt;</comment>
                            <comment id="236141" author="gerrit" created="Thu, 1 Nov 2018 13:18:05 +0000"  >&lt;p&gt;Yang Sheng (ys@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/33548&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33548&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: fix wrong logic of stream allocation&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 52116e1483cb24e3f5d0c2a3f20fa989dbd75e63&lt;/p&gt;</comment>
                            <comment id="242794" author="zam" created="Tue, 26 Feb 2019 09:30:02 +0000"  >&lt;p&gt;Are there any performance tests for this patch &lt;a href=&quot;https://review.whamcloud.com/33195&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33195&lt;/a&gt;?&lt;/p&gt;</comment>
                            <comment id="242796" author="adilger" created="Tue, 26 Feb 2019 09:52:48 +0000"  >&lt;p&gt;Ihara had started running some tests on the patch, but I don&apos;t recall ever seeing the results.&lt;/p&gt;

&lt;p&gt;The main goal was to automate the original &quot;manually reset to the start of the disk during benchmarking&quot; behavior under normal usage.  In particular, jump back to earlier groups when a bunch of free space becomes available, without having to continually scan the earlier groups for free space.  The potential drawback is if this happens too frequently it could cause excessive seeking, but since it should only happen when there is a large amount of space any seek overhead should be smaller than the seek rate * IO size.&lt;/p&gt;</comment>
                            <comment id="243237" author="gerrit" created="Sun, 3 Mar 2019 00:20:23 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/21142/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/21142/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: procfs entries for mballoc&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 75703118588f2b23afd8c8815e5ebb768fc7a8ff&lt;/p&gt;</comment>
                            <comment id="246922" author="gerrit" created="Fri, 10 May 2019 00:38:49 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34842&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34842&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: procfs entries for mballoc&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 1afff0d0a40ddf3c413c1db5a21d8d46da61e1c2&lt;/p&gt;</comment>
                            <comment id="250571" author="gerrit" created="Wed, 3 Jul 2019 03:25:06 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/34842/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34842/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8365&quot; title=&quot;Fix mballoc stream allocator to better use free space at start of drive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8365&quot;&gt;LU-8365&lt;/a&gt; ldiskfs: procfs entries for mballoc&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: ea7103b0b1c360b0e6d7fe62e275df366bf4e31d&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="16750">LU-2377</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="57389">LU-12970</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="62900">LU-14438</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="55236">LU-12103</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="31104" name="0001-ext4-Fix-bugs-in-mballoc-s-stream-allocation-mode.patch" size="3768" author="ys" created="Thu, 27 Sep 2018 14:07:04 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzygjr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>