<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:29:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16678] QOS improvement</title>
                <link>https://jira.whamcloud.com/browse/LU-16678</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;From Andreas&apos; comment in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16501&quot; title=&quot;QOS allocator not balancing space enough&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16501&quot;&gt;&lt;del&gt;LU-16501&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Patch the weight and penalty calculation to reduce/exclude the blocks or inodes, depending on which one is currently &quot;unimportant&quot;. For example, on OSTs there are typically far more free inodes than space, so the free inodes should not affect the result when calculating the weight. Conversely, on the MDTs there is usually more free space than inodes, so the free space should not affect the weight. However, in some situations (e.g. DoM or Changelogs filling MDT space, or very small objects on OSTs) these values may become important and cannot be ignored completely as in my 49890 patch.&lt;/p&gt;

&lt;p&gt;We cannot change the weight calculation to selectively add/remove the inodes/blocks completely, since that will change the &quot;units&quot; they are calculated in, and it may be more or less important for different OSTs depending on their free usage. I was thinking something along the following lines:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;for each statfs update the following metrics can be calculated once per&#160;&lt;tt&gt;OBD_STATFS&lt;/tt&gt;&#160;call:&lt;/li&gt;
	&lt;li&gt;calculate &quot;filesystem bytes per inode&quot; based on &quot;&lt;tt&gt;tot_bpi = bytes_total / inodes_total&lt;/tt&gt;&quot; (this would match the &quot;inode ratio&quot; when an ldiskfs MDT or OST is formatted). I&apos;m not totally convinced that this is needed; it depends on how the algorithm is implemented.&lt;/li&gt;
	&lt;li&gt;calculate &quot;current bytes per inode&quot; based on &quot;&lt;tt&gt;cur_bpi = bytes_used / inodes_used&lt;/tt&gt;&quot; to determine how the filesystem is actually being used. For&#160;&lt;tt&gt;osd-zfs&lt;/tt&gt;&#160;the&lt;/li&gt;
	&lt;li&gt;limit the contribution of the free inodes OR free bytes to the weight/penalty calculation based on how the current average file size (&lt;tt&gt;cur_bpi&lt;/tt&gt;) compares to the filesystem limits (&lt;tt&gt;tot_bpi&lt;/tt&gt;).&lt;/li&gt;
	&lt;li&gt;it may be that the&#160;&lt;tt&gt;cur_bpi&lt;/tt&gt;&#160;has to be adjusted when the filesystem is initially empty (e.g. because the only files in use are for internal config files and maybe the journal), but it may not be important in the long run unless this significantly&#160;&lt;b&gt;reduces&lt;/b&gt;&#160;the relative weight of new/empty OSTs compared to old/full OSTs (where&#160;&lt;tt&gt;cur_bpi&lt;/tt&gt;&#160;could accurately predict the expected object size). As soon as OST objects start being allocated on the OST, the&#160;&lt;tt&gt;cur_bpi&lt;/tt&gt;&#160;value will quickly start to approach the actual usage of the filesystem over the long term.&lt;/li&gt;
&lt;/ul&gt;
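
&lt;p&gt;The per-statfs metrics in the list above, together with the capped &lt;tt&gt;ia&lt;/tt&gt; and &lt;tt&gt;ba&lt;/tt&gt; weights proposed in this description, could be sketched roughly as follows. This is only an illustration in Python, not the Lustre implementation: the &lt;tt&gt;Statfs&lt;/tt&gt; class and the helper names are hypothetical, &lt;tt&gt;bytes_used&lt;/tt&gt; is approximated as total minus available, and falling back to &lt;tt&gt;tot_bpi&lt;/tt&gt; on an empty target is an assumption standing in for the open question about initially empty filesystems.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Statfs:
    """Hypothetical subset of the values from one OBD_STATFS call."""
    bytes_total: int
    bytes_avail: int
    inodes_total: int
    inodes_free: int

    @property
    def bytes_used(self) -> int:
        # Simplification (assumption): reserved blocks are ignored.
        return self.bytes_total - self.bytes_avail

    @property
    def inodes_used(self) -> int:
        return self.inodes_total - self.inodes_free


def tot_bpi(s: Statfs) -> int:
    """Filesystem bytes per inode (the format-time inode ratio)."""
    return s.bytes_total // max(s.inodes_total, 1)


def cur_bpi(s: Statfs) -> int:
    """Average bytes per inode actually in use."""
    if s.inodes_used == 0:
        # Assumption: an empty target has no usage history yet,
        # so fall back to the format-time ratio.
        return max(tot_bpi(s), 1)
    # Never return 0, since the value is used as a divisor below.
    return max(s.bytes_used // s.inodes_used, 1)


def capped_weights(s: Statfs) -> tuple:
    """Return (ia, ba) using the caps and shifts from the example."""
    bpi = cur_bpi(s)
    ia = min(2 * s.bytes_avail // bpi, s.inodes_free) >> 8
    ba = min(2 * s.inodes_free * bpi, s.bytes_avail) >> 16
    return ia, ba


if __name__ == "__main__":
    # Large objects: 50 TiB used by about one million inodes.
    big = Statfs(100 * 2**40, 50 * 2**40, 100 * 2**20, 99 * 2**20)
    print(capped_weights(big))    # (8192, 838860800)

    # Small objects: 1 TiB used by about 64 million inodes.
    small = Statfs(100 * 2**40, 99 * 2**40, 100 * 2**20, 36 * 2**20)
    print(capped_weights(small))  # (147456, 18874368)
```

&lt;p&gt;In the first case &lt;tt&gt;ia&lt;/tt&gt; is capped by the available bytes rather than by the 99 Mi free inodes, while in the second case &lt;tt&gt;ba&lt;/tt&gt; is capped by the free inodes rather than by the 99 TiB of available space.&lt;/p&gt;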


&lt;p&gt;For example, the inode weight could be limited to&#160;&lt;tt&gt;ia = min(2 * bytes_avail / cur_bpi, inodes_free) &amp;gt;&amp;gt; 8&lt;/tt&gt;&#160;and the bytes weight should be limited to&#160;&lt;tt&gt;ba = min(2 * inodes_free * cur_bpi, bytes_avail) &amp;gt;&amp;gt; 16&lt;/tt&gt;&#160;(possibly with other scaling factors depending on OST count/size). These values represent how many inodes or bytes can be expected to be allocated by new objects based on the historical average bytes-per-inode usage of the filesystem. If a target has mostly large objects, then&#160;&lt;tt&gt;cur_bpi&lt;/tt&gt;&#160;would be large, so&#160;&lt;tt&gt;ia&lt;/tt&gt;&#160;would be limited by the&#160;&lt;tt&gt;2 * bytes_avail / cur_bpi&lt;/tt&gt;&#160;part and it doesn&apos;t matter how many&#160;&lt;em&gt;actually free&lt;/em&gt;&#160;inodes there are. Conversely, if&#160;&lt;tt&gt;cur_bpi&lt;/tt&gt;&#160;is small (a value below&#160;&lt;tt&gt;tot_bpi&lt;/tt&gt;&#160;means that the inodes would run out first), then&#160;&lt;tt&gt;2 * bytes_avail / cur_bpi&lt;/tt&gt;&#160;would be large and&#160;&lt;tt&gt;inodes_free&lt;/tt&gt;&#160;would be the limiting factor for allocations. In the middle, if the average object size is close to the mkfs limits, then both the free inodes and bytes would be taken into account.&lt;/p&gt;</description>
                <environment></environment>
        <key id="75323">LU-16678</key>
            <summary>QOS improvement</summary>
                <type id="6" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11315&amp;avatarType=issuetype">Story</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="scherementsev">Sergey Cheremencev</reporter>
                        <labels>
                    </labels>
                <created>Tue, 28 Mar 2023 17:08:55 +0000</created>
                <updated>Tue, 28 Mar 2023 17:08:55 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03hgv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>