<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:14:58 UTC 2024

-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15043] OST spill pools should not allow spill pool loops</title>
                <link>https://jira.whamcloud.com/browse/LU-15043</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Using the latest build of Lustre, 2.14.54_92 build # 4421, I created a spill pool loop and encountered some unexpected behavior. &lt;br/&gt;
I created three pools and created a loop of spill pools where pool1.spill= pool2, pool2.spill=pool3 and pool3.spill=pool1. I then created a file on pool1, but the file was created on pool2. The same thing happened when I created a file on pool2 and on pool3, they were created on pool3 and pool1, respectively. &lt;/p&gt;

&lt;p&gt;I think we should not allow spill pool loops to be created. &lt;/p&gt;

&lt;p&gt;Here are more details:&lt;br/&gt;
Created three pools: &lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs pool_list scratch.pool1
Pool: scratch.pool1
scratch-OST0000_UUID
# lfs pool_list scratch.pool2
Pool: scratch.pool2
scratch-OST0001_UUID
# lfs pool_list scratch.pool3
Pool: scratch.pool3
scratch-OST0002_UUID
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Set spill pool and thresholds on both MDSs:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mds1# lctl get_param lod.scratch-MDT*.pool.*.spill*
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0000-mdtlov.pool.pool1.spill_threshold_pct=5
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0000-mdtlov.pool.pool2.spill_threshold_pct=5
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_is_active=1
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_target=pool1
lod.scratch-MDT0000-mdtlov.pool.pool3.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_target=pool2
lod.scratch-MDT0002-mdtlov.pool.pool1.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_target=pool3
lod.scratch-MDT0002-mdtlov.pool.pool2.spill_threshold_pct=5
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_is_active=1
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_target=pool1
lod.scratch-MDT0002-mdtlov.pool.pool3.spill_threshold_pct=5
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We see the following in dmesg on mds1, not on mds2:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 9046.643396] LustreError: 5659:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can&apos;t scratch-QMT0000 scratch-OST0000_UUID pool pool1: rc = -17
[ 9056.957864] LustreError: 5666:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can&apos;t scratch-QMT0000 scratch-OST0001_UUID pool pool2: rc = -17
[ 9065.980468] LustreError: 5674:0:(qmt_pool.c:1390:qmt_pool_add_rem()) add to: can&apos;t scratch-QMT0000 scratch-OST0002_UUID pool pool3: rc = -17
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Create files on specific OST pools:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs setstripe -p pool1 -c -1 /lustre/scratch/file1
# lfs getstripe -p /lustre/scratch/file1
pool2
# lfs setstripe -p pool2 -c -1 /lustre/scratch/file2
# lfs getstripe -p /lustre/scratch/file2
pool3
# lfs setstripe -p pool3 -c -1 /lustre/scratch/file3
# lfs getstripe -p /lustre/scratch/file3
pool1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We see the following on MDS0:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[10198.677195] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for &apos;pool1-&amp;gt;pool2&apos;
[10223.616652] Lustre: 1506:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for &apos;pool2-&amp;gt;pool3&apos;
[10234.693511] Lustre: 1538:0:(lod_pool.c:799:lod_check_and_spill_pool()) scratch-MDT0000-mdtlov: more than 10 levels of pool spill for &apos;pool3-&amp;gt;pool1&apos; 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="66292">LU-15043</key>
            <summary>OST spill pools should not allow spill pool loops</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="jamesanunez">James Nunez</reporter>
                        <labels>
                    </labels>
                <created>Tue, 28 Sep 2021 17:45:46 +0000</created>
                <updated>Mon, 27 Jun 2022 15:44:37 +0000</updated>
                            <resolved>Mon, 27 Jun 2022 15:44:37 +0000</resolved>
                                    <version>Lustre 2.15.0</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="314203" author="adilger" created="Tue, 28 Sep 2021 19:23:25 +0000"  >&lt;p&gt;James, the &quot;&lt;tt&gt;more than 10 levels of pool spill&lt;/tt&gt;&quot; logic in the code is to prevent an infinite loop in the kernel as it follows the circular linked list of spill targets.  It looks like it stops at the 10th level of pool spilling (3 full loops plus 1), which explains the &lt;tt&gt;-p pool1&lt;/tt&gt; creating a file in &lt;tt&gt;pool2&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;Detecting a loop at the time &lt;tt&gt;spill_target&lt;/tt&gt; is set should be relatively simple to implement.  In &lt;tt&gt;lod_spill_target_seq_write()&lt;/tt&gt; the code should first copy the specified target into a temporary buffer, rather than directly into &lt;tt&gt;pool_spill_target&lt;/tt&gt;, and then follow the chain of spill targets until it either hits a pool with no &lt;tt&gt;spill_target&lt;/tt&gt; set (the normal case) or finds a pool whose &lt;tt&gt;pool_spill_target&lt;/tt&gt; is the same as &lt;tt&gt;pool-&amp;gt;pool_name&lt;/tt&gt; (a loop).&lt;/p&gt;</comment>
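The two behaviours discussed in the comment above can be sketched in a few lines of Python. This is only an illustrative model, not the Lustre kernel code: the dictionary of spill targets, the function names, and the fixed cap of 10 hops (inferred from the "more than 10 levels of pool spill" message) are all assumptions, and every pool is treated as if it were already over its spill threshold.

```python
# Hypothetical model: each pool name maps to its spill target (absent if unset).
MAX_SPILL_HOPS = 10  # assumed cap, inferred from the "more than 10 levels" message


def would_create_loop(spill_targets, pool, new_target):
    """Return True if pointing pool's spill target at new_target closes a loop."""
    seen = {pool}
    current = new_target
    while current is not None:
        if current in seen:
            # Either we came back to pool, or we hit an already existing loop.
            return True
        seen.add(current)
        current = spill_targets.get(current)
    # Reached a pool with no spill target set: the normal, loop-free case.
    return False


def follow_spill(spill_targets, pool, max_hops=MAX_SPILL_HOPS):
    """Follow spill targets for at most max_hops hops and return the final pool."""
    current = pool
    for _ in range(max_hops):
        next_pool = spill_targets.get(current)
        if next_pool is None:
            break
        current = next_pool
    return current


if __name__ == "__main__":
    # The loop from the report: pool1 to pool2 to pool3 and back to pool1.
    loop = {"pool1": "pool2", "pool2": "pool3", "pool3": "pool1"}
    for name in ("pool1", "pool2", "pool3"):
        print(name, "ends up on", follow_spill(loop, name))

    # With a check at set time, the last link of the loop would be refused.
    chain = {"pool1": "pool2", "pool2": "pool3"}
    print(would_create_loop(chain, "pool3", "pool1"))
```

Under these assumptions, ten hops around a three-pool loop are three full loops plus one hop, so a file requested on pool1 ends up on pool2, which matches the behaviour in the report. Tracking every visited pool, rather than comparing only against the pool being configured, also keeps the check from spinning forever if some other loop already exists further down the chain.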
                            <comment id="314245" author="gerrit" created="Wed, 29 Sep 2021 06:17:08 +0000"  >&lt;p&gt;&quot;Alex Zhuravlev &amp;lt;bzzz@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45083&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45083&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15043&quot; title=&quot;OST spill pools should not allow spill pool loops&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15043&quot;&gt;&lt;del&gt;LU-15043&lt;/del&gt;&lt;/a&gt; lod: check for spilling loops&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 14d441c9bdb4873e4b0255658873ee963828548e&lt;/p&gt;</comment>
                            <comment id="314334" author="adilger" created="Wed, 29 Sep 2021 20:10:45 +0000"  >&lt;p&gt;The &lt;tt&gt;qmt_pool_add_rem()&lt;/tt&gt; message started appearing on 2021-08-18, and there have been a few thousand hits per day.  Patches landed on that day are listed below (no other patches landed between 2021-08-10 and 2021-08-25):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ git log --oneline --before 2021-08-22  --after 2021-08-16
d8204f903a (tag: v2_14_54, tag: 2.14.54) New tag 2.14.54
5220160648 LU-14093 lutf: fix build with gcc10
a205334da5 LU-14903 doc: update lfs-setdirstripe man page
1313cad7a1 LU-14899 ldiskfs: Add 5.4.136 mainline kernel support
c44afcfb72 LU-12815 socklnd: set conns_per_peer based on link speed
6e30cd0844 LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7]
14b8276e06 LU-14865 utils: llog_reader.c printf type mismatch
aa5d081237 LU-9859 lnet: fold lprocfs_call_handler functionality into lnet_debugfs_*
e423a0bd7a LU-14787 libcfs: Proved an abstraction for AS_EXITING
76c71a167b LU-14775 kernel: kernel update SLES12 SP5 [4.12.14-122.74.1]
67752f6db2 LU-14773 tests: skip check_network() on working node
024f9303bc LU-14668 lnet: Lock primary NID logic
684943e2d0 LU-14668 lnet: peer state to lock primary nid
16321de596 LU-14661 obdclass: Add peer/peer NI when processing llog
ac201366ad LU-14661 lnet: Provide kernel API for adding peers
51350e9b73 LU-14531 osd: serialize access to object vs object destroy
a5cbe7883d LU-12815 socklnd: allow dynamic setting of conns_per_peer
d13d8158e8 LU-14093 mgc: rework mgc_apply_recover_logs() for gcc10
8dd4488a07 LU-6142 tests: remove iam_ut binary
301d76a711 LU-14876 out: don&apos;t connect to busy MDS-MDS export
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The graph shows occurrences by subtest; it looks like this happens in any subtest that adds a pool:&lt;br/&gt;
&lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;a id=&quot;40743_thumb&quot; href=&quot;https://jira.whamcloud.com/secure/attachment/40743/40743_Screen+Shot+2021-09-29+at+14.03.14.png&quot; title=&quot;Screen Shot 2021-09-29 at 14.03.14.png&quot; file-preview-type=&quot;image&quot; file-preview-id=&quot;40743&quot; file-preview-title=&quot;Screen Shot 2021-09-29 at 14.03.14.png&quot;&gt;&lt;img src=&quot;https://jira.whamcloud.com/secure/thumbnail/40743/_thumb_40743.png&quot; style=&quot;border: 0px solid black&quot; role=&quot;presentation&quot;/&gt;&lt;/a&gt;&lt;/span&gt; &lt;/p&gt;</comment>
                            <comment id="338841" author="gerrit" created="Mon, 27 Jun 2022 04:37:56 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/45083/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45083/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15043&quot; title=&quot;OST spill pools should not allow spill pool loops&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15043&quot;&gt;&lt;del&gt;LU-15043&lt;/del&gt;&lt;/a&gt; lod: check for spilling loops&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: c9c842d678e38345c890c1514e9b922fe496dba7&lt;/p&gt;</comment>
                            <comment id="338907" author="pjones" created="Mon, 27 Jun 2022 15:44:37 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="66096">LU-15011</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="66359">LU-15055</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="40743" name="Screen Shot 2021-09-29 at 14.03.14.png" size="54139" author="adilger" created="Wed, 29 Sep 2021 20:04:07 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i025ov:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>