<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:58:00 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-13058] Intermediate component removal (PFL/SEL)</title>
                <link>https://jira.whamcloud.com/browse/LU-13058</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This is a simple extension to the SEL functionality for PFL files, originally suggested by Andreas.&lt;br/&gt;
One of the features of SEL is that an intermediate layout component (for example, the SSD component in a PFL layout like DOM-&amp;gt;SSD-&amp;gt;HDD) is not instantiated if its tier is low on space. Instead, the next component (HDD) is extended downwards. This allows us to skip the SSD tier if it&apos;s full.&lt;/p&gt;

&lt;p&gt;This is a nice feature, and there&apos;s no particular reason it has to be limited to SEL layouts. It&apos;s easy to do this for normal layouts, where the SSD component has a normally defined length.&lt;/p&gt;

&lt;p&gt;So, this patch adds that functionality. The canonical case is a DOM-&amp;gt;SSD-&amp;gt;HDD layout where the SSD tier is low on space (or even out of space entirely). Currently, when the first write happens to the SSD component, it&apos;s simply instantiated. If there is absolutely no space, an error results. With this feature, in the low on space condition*, that intermediate component is removed.&lt;/p&gt;

&lt;p&gt;*This is the same low-on-space condition used in SEL: one of the chosen OSTs is below the threshold value for striping. The stripe allocator will only stripe to these OSTs in the absence of a better choice, so this indicates we&apos;re very low on space.&lt;/p&gt;


&lt;p&gt;There is one detail: the SEL code uses the &quot;extension size&quot; as a way to estimate how much space this component might use, so it&apos;s factored into the &quot;low on space&quot; calculation. There is no obvious substitute for this with a regular file, which leads to two options:&lt;/p&gt;


&lt;p&gt;1. Act like the file will consume (effectively) zero space and only act if the OSTs are already low on space&lt;br/&gt;
2. Pick some amount of data to assume it will use - The most logical guess seems to be a multiple of stripe size, but perhaps an absolute value would be better, as stripe sizes can vary widely.&lt;/p&gt;

&lt;p&gt;It&apos;s not clear that option 1 isn&apos;t sufficient, and in either case, this is just an optimization.&lt;/p&gt;

&lt;p&gt;Patch forthcoming.&lt;/p&gt;</description>
                <environment></environment>
        <key id="57575">LU-13058</key>
            <summary>Intermediate component removal (PFL/SEL)</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="paf0186">Patrick Farrell</assignee>
                                    <reporter username="paf0186">Patrick Farrell</reporter>
                        <labels>
                    </labels>
                <created>Sun, 8 Dec 2019 20:19:37 +0000</created>
                <updated>Tue, 30 May 2023 15:18:00 +0000</updated>
                                            <version>Lustre 2.14.0</version>
                    <version>Lustre 2.12.5</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="259426" author="gerrit" created="Sun, 8 Dec 2019 20:24:22 +0000"  >&lt;p&gt;Patrick Farrell (farr0186@gmail.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36953&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36953&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13058&quot; title=&quot;Intermediate component removal (PFL/SEL)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13058&quot;&gt;LU-13058&lt;/a&gt; lod: Intermediate component removal&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ac5638e83a8436b3d46a2cd74634b2a8578dadb3&lt;/p&gt;</comment>
                            <comment id="259434" author="adilger" created="Mon, 9 Dec 2019 02:28:12 +0000"  >&lt;p&gt;It would make sense to use the size of the intermediate component as the threshold for whether there is enough space on the OST(s).&lt;/p&gt;</comment>
                            <comment id="259478" author="paf0186" created="Mon, 9 Dec 2019 17:46:03 +0000"  >&lt;p&gt;Ah, of course, yeah. I&apos;m stuck in the mindset of self-extending layouts, where the component can change later. These are fixed from the beginning, so, yeah, component size kinda makes sense.&lt;/p&gt;

&lt;p&gt;Unfortunately, this will likely break a bunch of tests and may introduce some usability issues for developers...? If the size of your second component is (for example) 1 GiB, but you&apos;re running on the default llmount.sh config, the OSTs will &lt;b&gt;never&lt;/b&gt; show as having enough space for it. So essentially, creating a three-component PFL layout on that test config and trying to instantiate the second component won&apos;t work unless the components are very small.&lt;/p&gt;

&lt;p&gt;I&apos;ll give it a shot and see how many tests it breaks. Let me know if that adjusts your thinking or if you&apos;ve got an idea for coping with that.&lt;/p&gt;</comment>
                            <comment id="259498" author="adilger" created="Mon, 9 Dec 2019 23:14:44 +0000"  >&lt;p&gt;I agree that it is likely that some tests will fail if the default layout is changing.  I think in many cases the failures can be mitigated by small/sane changes to the layout used for a particular test, or by making the test smart enough to handle this.&lt;/p&gt;

&lt;p&gt;I think the first thing to do would be getting regular testing to pass with a default PFL layout, starting with patch &lt;a href=&quot;https://review.whamcloud.com/26576&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26576&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11918&quot; title=&quot;Allow setting default file layout on root directory at mkfs time&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11918&quot;&gt;LU-11918&lt;/a&gt; tests: modify file system layout in testing&lt;/tt&gt;&quot;.  The results for that patch show there are already a number of subtests failing because they have built-in assumptions of &lt;tt&gt;stripe_count=1&lt;/tt&gt; or &lt;tt&gt;stripe_size=1MB&lt;/tt&gt; as the default filesystem layout.&lt;/p&gt;

&lt;p&gt;I&apos;d recommend approaching this in a systematic manner: first change the filesystem default to &lt;tt&gt;stripe_count=3&lt;/tt&gt; or similar and fix subtests to handle the new default and/or explicitly specify the layout that they require for the test, then change the default to &lt;tt&gt;stripe_size=3MB&lt;/tt&gt; or whatever and repeat, then use a PFL layout with &lt;tt&gt;stripe_count=1, stripe_size=1MB&lt;/tt&gt; as the first component, etc.&lt;/p&gt;

&lt;p&gt;Without first addressing the hidden assumptions in the existing tests, I think that this will be a very large patch that conflates existing issues with potential new issues that are added with this additional change.&lt;/p&gt;

&lt;p&gt;That said, I &lt;b&gt;do&lt;/b&gt; like the idea you are proposing here.  In some respects, it would be nice if all components were treated like SEL components by default and users didn&apos;t have to explicitly set extension components, or worry about if some OST is going to run out of space. &lt;/p&gt;</comment>
                            <comment id="259545" author="paf0186" created="Tue, 10 Dec 2019 20:19:09 +0000"  >&lt;p&gt;OK, that makes sense. I can try to take a quick look at some of that at some point - I&apos;m doing this in my spare time, so it&apos;s uncertain how much I&apos;ll dig into the other test stuff.&lt;/p&gt;

&lt;p&gt;&quot;In some respects, it would be nice if all components were treated like SEL components by default and users didn&apos;t have to explicitly set extension components, or worry about if some OST is going to run out of space.&quot;&lt;/p&gt;

&lt;p&gt;I think this is intriguing - It would be doable. Very doable, in fact, though the effects would be wide-ranging. It would be a matter of converting the normal PFL component expression (with setstripe) to have an implicit -z, basically, and then I guess converting regular setstripe -c to make an SEL file rather than a plain file.&lt;/p&gt;

&lt;p&gt;So all first components (DOM excluded, since it has a fixed size) would start out small, and all other components would start out zero length, and all followed by extension space.&lt;/p&gt;

&lt;p&gt;There would be lots of ripple effects, but the only obvious question (to me) is how to set the extension size (the amount of space given out at a time). Perhaps something like 1% or 100 GiB, whichever is larger? I&apos;m not sure - Layout lock changes could be pretty disruptive for a large file being written in parallel. It seems like it would be important to allow &lt;b&gt;not&lt;/b&gt; doing this for that case. (Though I suppose if a file is striped widely as well, the data per stripe might not be much different from a single writer file being written quickly by one client, so the issue might be roughly the same.)&lt;/p&gt;

&lt;p&gt;Hm. The number of ripple effects and the complexity it introduces to regular layouts make me nervous.&lt;/p&gt;

&lt;p&gt;&quot;That said, I &lt;b&gt;do&lt;/b&gt; like the idea you are proposing here.&quot;&lt;br/&gt;
 As I alluded to, you originally suggested it (the &quot;PFL could remove intermediate components&quot; bit) during SEL review. :)&lt;/p&gt;</comment>
                            <comment id="259777" author="adilger" created="Thu, 12 Dec 2019 23:34:12 +0000"  >&lt;p&gt;I also thought of another important use case for this - skipping the PFL/SEL components for OST pools in which the user has no quota.  That depends on the OST pool quota feature (&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11023&quot; title=&quot;OST Pool Quotas&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11023&quot;&gt;&lt;del&gt;LU-11023&lt;/del&gt;&lt;/a&gt;) being available, but it seems like a logical extension of this work.&lt;/p&gt;</comment>
                            <comment id="259806" author="paf0186" created="Fri, 13 Dec 2019 16:17:25 +0000"  >&lt;p&gt;I agree, the only downside is that it seems like it would require a little bit of plumbing - Handling quotas was rejected as part of the SEL work (at least initially) for that reason.&lt;/p&gt;

&lt;p&gt;Although now that I think about it, my position at the time (in the design discussions within Cray) was based on the idea of integrating quota levels into the stripe allocator decisions, which really would be kind of terrible.&lt;/p&gt;

&lt;p&gt;But if we assume that quota pools and OST tiers are arranged sanely (i.e., the pools used for quota match up with the pools used in the layout/tiering), which I think is fair (since things won&apos;t &lt;b&gt;break&lt;/b&gt; if they are not - it will just give suboptimal behavior), then we could just make quota checking part of the &quot;are these selected OSTs OK&quot; step*, since the quota itself is split across the OSTs evenly.&lt;/p&gt;

&lt;p&gt;*i.e., when we check the OSTs selected by the stripe allocator to verify space levels&lt;/p&gt;

&lt;p&gt;It&apos;s not &lt;b&gt;quite&lt;/b&gt; as good as integrating quota into the stripe allocator decisions, but that would be a huge amount of work and I think definitely overkill.&lt;/p&gt;

&lt;p&gt;So, yeah, that would be manageable I think. Just some extra plumbing to check quotas from the LOD context.&lt;/p&gt;

&lt;p&gt;But as you noted, pool quotas are required first.&lt;/p&gt;</comment>
                            <comment id="271198" author="adilger" created="Tue, 26 May 2020 23:53:10 +0000"  >&lt;p&gt;Mike, this is very similar to the DoM component shrinking/removal that you implemented.  Would you be able to finish off Patrick&apos;s patch in time for 2.14?&lt;/p&gt;</comment>
                            <comment id="319272" author="adilger" created="Fri, 26 Nov 2021 23:04:28 +0000"  >&lt;p&gt;This may be &lt;em&gt;mostly&lt;/em&gt; unnecessary in the presence of the pool spill mechanism added in patch &lt;a href=&quot;https://review.whamcloud.com/43989&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/43989&lt;/a&gt; &quot;&lt;tt&gt;LU-14825 lod: pool spilling&lt;/tt&gt;&quot;.  Rather than drop the component with the full pool and move to the next one, pool spill reassigns that component to its specified &lt;tt&gt;spill_target&lt;/tt&gt; pool.&lt;/p&gt;

&lt;p&gt;That allows the admin to &quot;fix&quot; all layouts that are targeting a specific pool, including cases where removing a component wouldn&apos;t help because the &lt;b&gt;next&lt;/b&gt; (...) component is also on the same pool, but with a larger stripe count.  Also, it avoids unnecessarily inflating the stripe count when an early component is dropped.&lt;/p&gt;

&lt;p&gt;The one drawback of pool spill is that it is a single global parameter and does not allow the fine-grained control of the layout that SEL does (i.e. which pool to use for each component), but that is not (IMHO) going to be a common use case, since most users don&apos;t know how to set the layout.&lt;/p&gt;

&lt;p&gt;In summary, I&apos;m not against keeping this open to eventually land this patch, but I don&apos;t think it is as useful/important as it once was.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="48585">LU-10070</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="54758">LU-11918</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="66096">LU-15011</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="52249">LU-11023</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="76267">LU-16857</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="56945">LU-12785</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00qmv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>