<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:16:33 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8324] HSM: prioritize HSM requests</title>
                <link>https://jira.whamcloud.com/browse/LU-8324</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Most of the time (unless the filesystem is full), RESTORE and REMOVE requests should be processed first as they have the highest priority from a user&apos;s point of view; ARCHIVE requests should have a lower priority.&lt;/p&gt;</description>
                <environment></environment>
        <key id="37803">LU-8324</key>
            <summary>HSM: prioritize HSM requests</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="bougetq">Quentin Bouget</assignee>
                                    <reporter username="cealustre">CEA</reporter>
                        <labels>
                    </labels>
                <created>Fri, 24 Jun 2016 12:30:27 +0000</created>
                <updated>Mon, 25 Jan 2021 10:21:08 +0000</updated>
                                                                                <due></due>
                            <votes>2</votes>
                                    <watches>17</watches>
                                                                            <comments>
                            <comment id="156846" author="bougetq" created="Fri, 24 Jun 2016 13:15:35 +0000"  >&lt;p&gt;I am working on adding a dynamic policy to the coordinator that would define which requests are to be sent first to copytools.&lt;/p&gt;

&lt;p&gt;The policy I am currently implementing defines two levels of priority and allows administrators to set which kind of request gets which priority (the default will be low_priority = [ ARCHIVE ], high_priority = [ RESTORE, CANCEL, REMOVE, ... ]). One could also set a ratio that the coordinator tries to follow when it batches requests and sends them to copytools (X% high_priority and (100 - X)% low_priority); this prevents starvation. The ratio is a soft limit (if there are too few requests of one priority level to fill the buffers, the ratio is not applied), which avoids wasting time.&lt;/p&gt;</comment>
                            <comment id="156880" author="pjones" created="Fri, 24 Jun 2016 17:24:44 +0000"  >&lt;p&gt;ok Quentin. Let us know how you progress&lt;/p&gt;</comment>
                            <comment id="159530" author="rread" created="Thu, 21 Jul 2016 19:06:37 +0000"  >&lt;p&gt;I agree we need to prioritize these operations; however, I don&#8217;t believe adding prioritization to the coordinator is the right answer here. We are currently on a path that will turn the coordinator into a general-purpose request queue, and this is not something that belongs in Lustre code and certainly not in the kernel. &lt;/p&gt;

&lt;p&gt;Instead, we should move the HSM request processing out of the kernel and into user space. Although Lustre will still need to keep track of the implicit restore requests triggered by file access, all other operations could be done without using a coordinator. Lustre should provide the mechanisms needed for a correct HSM system, and allow the user-space tools to manage all of the policies around what is copied, when, and with what priority. &lt;br/&gt;
I&#8217;m still thinking about exactly what this should look like, but at a minimum an Archive operation begins with setting the EXISTS flag, and completes with setting the ARCHIVE flag. If the file is modified after EXISTS is set, then the MDT will set the DIRTY flag and reject the ARCHIVE flag when the mover attempts to set it later.&lt;/p&gt;

&lt;p&gt;A Restore operation is primarily a layout swap, though it may need to be a special case to ensure the RELEASED flag is cleared atomically with the swap. &lt;/p&gt;

&lt;p&gt;A Remove operation is done by clearing the EXISTS and ARCHIVE flags.&lt;/p&gt;

&lt;p&gt;The existing coordinator should remain in place for some time to continue to support the current set of tools, but I would like to discourage adding further complexity and to solve issues like this in a different way. &lt;/p&gt;</comment>
                            <comment id="159701" author="hdoreau" created="Mon, 25 Jul 2016 09:08:14 +0000"  >&lt;p&gt;We would happily consider a more resilient and distributed mechanism for the coordinator. Nevertheless, I see it as a non-trivial project that should not block improvements of HSM, if it targets the mid-term future (I have neither seen any design document nor heard any discussion about it).&lt;/p&gt;

&lt;p&gt;The patch has not been pushed yet, but the solution that Quentin proposes is lightweight and elegant, and I believe that it significantly improves the experience of using HSM in production.&lt;br/&gt;
It is more subjective, but I also find that it improves code quality and makes it easier to reason about the logic of the CDT, which would be helpful for future replacement work.&lt;/p&gt;</comment>
                            <comment id="159703" author="gerrit" created="Mon, 25 Jul 2016 09:32:28 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget.ocre@cea.fr) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/21494&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/21494&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize HSM requests&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0655c8faf7cb7dcdb3b19dd761aad6c06fcda159&lt;/p&gt;</comment>
                            <comment id="180074" author="mrb" created="Mon, 9 Jan 2017 16:41:25 +0000"  >&lt;p&gt;Hello,&lt;br/&gt;
I would just like to say that we would be very keen on this kind of feature at Cambridge - I&apos;ve just run into this issue today where a single file restore operation is at the back of the queue behind ~10TB of archive jobs.&lt;/p&gt;

&lt;p&gt;I&apos;d be interested in testing this patch against one of our test filesystems, but I just wanted to add a comment that we would really appreciate having more ability to control the coordinator queue - whether it&apos;s in its current state or some future tool as Robert suggests.&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Matt Raso-Barnett&lt;br/&gt;
University of Cambridge&lt;/p&gt;</comment>
                            <comment id="185548" author="bougetq" created="Mon, 20 Feb 2017 12:45:51 +0000"  >&lt;p&gt;Hello Matt,&lt;/p&gt;

&lt;p&gt;I think the patch is mature enough for you to test it if you are still interested in it.&lt;/p&gt;</comment>
                            <comment id="191662" author="hdoreau" created="Wed, 12 Apr 2017 13:29:53 +0000"  >&lt;p&gt;Any chance for this patch to make it into 2.10? It is a &lt;em&gt;very&lt;/em&gt; useful feature for HSM users and we believe that the patch is mature.&lt;/p&gt;</comment>
                            <comment id="196385" author="spitzcor" created="Thu, 18 May 2017 19:09:21 +0000"  >&lt;p&gt;If not 2.10, it seems that 2.10.1 would be possible.&lt;/p&gt;</comment>
                            <comment id="196784" author="pjones" created="Tue, 23 May 2017 18:54:18 +0000"  >&lt;p&gt;I think that 2.10.1 is the more likely option at this stage. It seems like there will be some discussions about this area at LUG next week.&lt;/p&gt;</comment>
                            <comment id="197885" author="gerrit" created="Fri, 2 Jun 2017 13:05:23 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/27394&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/27394&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize HSM requests&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: f8d7f289866c2219d19a51693ae44cc6c3fdf867&lt;/p&gt;</comment>
                            <comment id="200059" author="gerrit" created="Fri, 23 Jun 2017 11:37:17 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/27800&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/27800&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: ease the development of a different coordinator&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0ee6e5d71c6f549b48be59607d5f55f28e950f47&lt;/p&gt;</comment>
                            <comment id="221602" author="nrutman" created="Fri, 23 Feb 2018 21:13:56 +0000"  >&lt;p&gt;This thread looks kind of dead, but we have a desire to see some prioritization mechanism as well.&lt;br/&gt;
Some options:&lt;br/&gt;
1. FIFO (today)&lt;br/&gt;
2. Restore-first. All restore requests are prioritized over archive requests. (Except in-progress archives.)&lt;br/&gt;
3. Archive-first. All archives are prioritized.&lt;br/&gt;
4. Interleaved. Archive and Restore requests are alternated, as long as some of each are waiting.&lt;br/&gt;
5. Tunable. Adjustable ratio of archive:restore processing. Maybe this covers the above 2-4 as well.&lt;br/&gt;
6. Batched. Archives and Restores are grouped into separate batches, potentially resulting in fewer tape swaps.&lt;br/&gt;
7. Time-boxed. A variant of batched; batch ends after a fixed time period.&lt;br/&gt;
Many other options I&apos;m sure...&lt;/p&gt;

&lt;p&gt;Ultimately I&apos;m in agreement with Robert Read&apos;s comment above that the prioritization should really be done outside of Lustre, but if the patch here implements #5 that might cover enough of the use cases to make most people happy...&lt;/p&gt;</comment>
                            <comment id="224256" author="gerrit" created="Thu, 22 Mar 2018 10:41:23 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/31723&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31723&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize one RESTORE once in a while&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b30b036607a2bc4928e13e06462701bf5ba62d3d&lt;/p&gt;</comment>
                            <comment id="224266" author="bougetq" created="Thu, 22 Mar 2018 14:30:17 +0000"  >&lt;p&gt;The patch above is the shortest/simplest hack I could come up with to help bear with &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; until a more definitive fix is developed (it is more of a band-aid than anything else).&lt;/p&gt;

&lt;p&gt;The idea is to use the times when the coordinator traverses its whole llog to &quot;force-schedule&quot; at least one RESTORE request. In practice, this means that you should see at least one RESTORE request scheduled every &quot;loop_period&quot; (the value in &lt;em&gt;/proc/&amp;lt;fsname&amp;gt;/mdt/&amp;lt;mdt-name&amp;gt;/hsm/loop_period&lt;/em&gt;) seconds.&lt;/p&gt;</comment>
                            <comment id="234006" author="gerrit" created="Wed, 26 Sep 2018 12:07:37 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/33239&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/33239&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize one RESTORE once in a while&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 93364e9f3b0c9694904d2c1e2a687af61a980c1f&lt;/p&gt;</comment>
                            <comment id="234866" author="gerrit" created="Fri, 12 Oct 2018 23:50:17 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/31723/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31723/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize one RESTORE once in a while&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 0dce1ddefc673a3f39b4964d6b669e2a11aaf903&lt;/p&gt;</comment>
                            <comment id="246276" author="gerrit" created="Wed, 24 Apr 2019 07:16:16 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34749&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34749&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8324&quot; title=&quot;HSM: prioritize HSM requests&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8324&quot;&gt;LU-8324&lt;/a&gt; hsm: prioritize one RESTORE once in a while&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3fa2b1682755eeb988d10af53797a8d5e1a3679d&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="62469">LU-14363</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="52043">LU-10968</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="38054">LU-8382</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="29886" name="analyzer-lu-8324.sh" size="3376" author="bougetq" created="Thu, 22 Mar 2018 14:25:28 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzyfp3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>