<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:00:38 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13363] unbalanced round-robin for object allocation in OST pool</title>
                <link>https://jira.whamcloud.com/browse/LU-13363</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Here is an example. Create two OST pools across 12 OSTs. Pool &apos;nvme&apos; consists of OST indices &amp;#91;0-7&amp;#93;.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl pool_new scratch.nvme
lctl pool_new scratch.hdd
lctl pool_add scratch.nvme OST[0-7]
lctl pool_add scratch.hdd OST[8-b]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;If a client creates 48 new files in a directory that is associated with 8 OSTs through an OST pool, one would expect 6 OST objects per OST, but the results were totally unbalanced.&lt;br/&gt;
 The test was repeated 5 times; here is how many OST objects were allocated to each OST in each test.&lt;/p&gt;

&lt;p&gt;Used 8 of 12 OSTs with an OST pool&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;      ost index 
    0  1 2 3 4 5  6  7
t1. 4 10 3 8 5 6  8  4
t2. 6  5 6 7 8 4 10  2
t3. 3 10 8 6 5 9  6  1
t4. 4 10 6 5 4 6  8  5
t5. 6  6 7 4 6 5  8  6
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;When the filesystem was created on just 8 OSTs with no OST pool, OST objects were allocated across the 8 OSTs in a balanced way and round-robin worked perfectly.&lt;/p&gt;

&lt;p&gt;Just 8 OSTs without an OST pool&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;      ost index 
    0 1 2 3 4 5 6 7
t1. 6 6 6 6 6 6 6 6
t2. 6 6 6 6 6 6 6 6
t3. 6 6 6 6 6 6 6 6
t4. 6 6 6 6 6 6 6 6
t5. 6 6 6 6 6 6 6 6
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>Two OST pools with two different sizes of OSTs within the same filesystem</environment>
        <key id="58389">LU-13363</key>
            <summary>unbalanced round-robin for object allocation in OST pool</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="sihara">Shuichi Ihara</reporter>
                        <labels>
                    </labels>
                <created>Tue, 17 Mar 2020 01:13:03 +0000</created>
                <updated>Thu, 16 Jun 2022 20:25:58 +0000</updated>
                            <resolved>Mon, 6 Jun 2022 13:29:32 +0000</resolved>
                                    <version>Lustre 2.14.0</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>9</watches>
                                                                            <comments>
                            <comment id="265380" author="adilger" created="Tue, 17 Mar 2020 01:28:42 +0000"  >&lt;p&gt;Presumably the size of OST0008-OST000B is much different from that of OST0000-OST0007?  It might be that the pool allocation is incorrectly using QOS because of the global OST imbalance, even though the OSTs within the pool are still balanced.  If you configure with only the NVMe OST0000-OST0007, but create the pool on only 6 of them, is the allocation balanced?&lt;/p&gt;</comment>
                            <comment id="265382" author="adilger" created="Tue, 17 Mar 2020 01:31:08 +0000"  >&lt;p&gt;It looks like this may be a duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13066&quot; title=&quot;RR vs. QOS allocator should be tracked per OST pool&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13066&quot;&gt;&lt;del&gt;LU-13066&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="265385" author="sihara" created="Tue, 17 Mar 2020 01:56:44 +0000"  >&lt;p&gt;Yes, if all OSTs have the same capacity and an OST pool is created from a few of them, allocation is balanced very well. If OSTs of different capacities are mixed in the filesystem, the problem occurs even when the OST pool is created on devices of the same capacity.&lt;/p&gt;</comment>
                            <comment id="265387" author="adilger" created="Tue, 17 Mar 2020 03:15:59 +0000"  >&lt;p&gt;Notes for fixing this issue from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13066&quot; title=&quot;RR vs. QOS allocator should be tracked per OST pool&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13066&quot;&gt;&lt;del&gt;LU-13066&lt;/del&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The &lt;tt&gt;ltd-&amp;gt;ltd_qos.lq_same_space&lt;/tt&gt; boolean that decides whether the LOD QOS allocator is active for an allocation or not is tracked for the entire LOV, when it should actually be tracked on a per-pool basis.&lt;/p&gt;

&lt;p&gt;Consider the case where there are SSD OSTs of 1TB in size (in an &lt;tt&gt;ssd&lt;/tt&gt; pool), and HDD OSTs of 100TB in size (in an &lt;tt&gt;hdd&lt;/tt&gt; pool). In a newly-formatted filesystem, it is clear that the SSD OSTs would have 1% of the free space of the HDD OSTs, and &lt;tt&gt;lq_same_space=0&lt;/tt&gt; is set in &lt;tt&gt;ltd_qos_penalties_calc()&lt;/tt&gt;.  As a result, QOS would always be active and the SSDs would be skipped for virtually all normal (default pool) allocations, unless the &lt;tt&gt;ssd&lt;/tt&gt; pool is specifically requested.  That is fine (even desirable) for the default all-OST pool.&lt;/p&gt;

&lt;p&gt;Now, if an allocation is using either the &lt;tt&gt;ssd&lt;/tt&gt; or &lt;tt&gt;hdd&lt;/tt&gt; pools, &lt;tt&gt;lod_ost_alloc_qos()&lt;/tt&gt; will find the global &lt;tt&gt;lq_same_space=0&lt;/tt&gt; and not use RR allocation, but less-optimal QOS space weighted allocation, even though the space of OSTs in either pool may be well balanced.  Instead, the &lt;tt&gt;lq_same_space&lt;/tt&gt; flag should be kept on &lt;tt&gt;struct lu_tgt_pool&lt;/tt&gt; so that allocations within a given pool can decide for RR or QOS allocation independently of the global pool.&lt;/p&gt;&lt;/blockquote&gt;</comment>
                            <comment id="266466" author="bzzz" created="Tue, 31 Mar 2020 18:10:47 +0000"  >&lt;p&gt;It sounds like each pool needs its own lu_qos, and all logic should be built around that per-pool structure?&lt;br/&gt;
What if some OST is a member of several pools? Or some pool-less allocation hits OSTs that belong to a pool?&lt;/p&gt;</comment>
                            <comment id="266475" author="adilger" created="Tue, 31 Mar 2020 21:15:59 +0000"  >&lt;p&gt;There are definitely going to be OSTs in multiple pools, and allocations that are outside pools.  I think there should be common data fields, like OST fullness, that are shared across pools, and other per-pool information that is not shared.&lt;/p&gt;

&lt;p&gt;I don&apos;t think we need to have totally perfect coordination between allocations in two different pools or in a pool and outside the pool.  However, simple decisions like &quot;is this pool within &lt;tt&gt;qos_threshold_rr&lt;/tt&gt;&quot; can be easily checked for all of the OSTs in the pool, regardless of whether the OST is in another pool as well.  If the pool is balanced, then it should just do round-robin allocations within that pool.&lt;/p&gt;</comment>
                            <comment id="266773" author="emoly.liu" created="Fri, 3 Apr 2020 12:47:30 +0000"  >&lt;p&gt;I made a patch to calculate penalties per OST in a pool. At first, I tried adding a qos structure to pool_desc, similar to Alex&apos;s idea, but in the end I found we don&apos;t need that, because all we want is to rebalance data in this pool each time.&lt;/p&gt;

&lt;p&gt;Here is my test on 6 OSTs. pool1 is on OST&amp;#91;0-3&amp;#93;, which have similar available space, as shown below. I then created 48 files on them.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@centos7-3 tests]# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-OST0000_UUID       325368      115908      182300  39% /mnt/lustre[OST:0]
lustre-OST0001_UUID       325368      126152      172056  43% /mnt/lustre[OST:1]
lustre-OST0002_UUID       325368      136388      161820  46% /mnt/lustre[OST:2]
lustre-OST0003_UUID       325368      131276      166932  45% /mnt/lustre[OST:3]
lustre-OST0004_UUID       325368       13512      284696   5% /mnt/lustre[OST:4]
lustre-OST0005_UUID       325368       13516      284692   5% /mnt/lustre[OST:5]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Without the patch, the file distribution is&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;OST0  OST1  OST2  OST3
13    11    14    10
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;With the patch,&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;OST0  OST1  OST2  OST3
12    12    12    12
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I will submit this tentative patch later.&lt;/p&gt;</comment>
                            <comment id="266780" author="bzzz" created="Fri, 3 Apr 2020 13:08:54 +0000"  >&lt;p&gt;I think rebalancing on every allocation is too expensive.&lt;/p&gt;</comment>
                            <comment id="266791" author="gerrit" created="Fri, 3 Apr 2020 16:02:29 +0000"  >&lt;p&gt;Emoly Liu (emoly@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/38136&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38136&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13363&quot; title=&quot;unbalanced round-robin for object allocation in OST pool&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13363&quot;&gt;&lt;del&gt;LU-13363&lt;/del&gt;&lt;/a&gt; lod: do object allocation in OST pool&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 882ae1e39b68ab0cdee78f7bb4e9152f4778e5b9&lt;/p&gt;</comment>
                            <comment id="336783" author="gerrit" created="Mon, 6 Jun 2022 06:27:16 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/38136/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38136/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13363&quot; title=&quot;unbalanced round-robin for object allocation in OST pool&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13363&quot;&gt;&lt;del&gt;LU-13363&lt;/del&gt;&lt;/a&gt; lod: do object allocation in OST pool&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: e642e75cde0248eee30ca94aaeb81653db7f8d03&lt;/p&gt;</comment>
                            <comment id="336819" author="pjones" created="Mon, 6 Jun 2022 13:29:32 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="10082">LU-9</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="45222">LU-9392</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="57613">LU-13066</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00vlj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>