<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:07:07 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7231] ENOSPC on remote MDT might create an inconsistent striped directory</title>
                <link>https://jira.whamcloud.com/browse/LU-7231</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;In the current DNE implementation, when creating a striped directory, the master MDT only packs the remote updates into the RPC during the execution phase; these updates are not executed until top_trans_stop() sends them to the remote MDT. If the remote updates fail (for example, ENOSPC when writing the update log) but the local updates succeed, then we need to roll back the local updates (or do a better job during declaration?); otherwise the namespace might become inconsistent. &lt;/p&gt;

&lt;p&gt;This can be reproduced by the test case in &lt;a href=&quot;http://review.whamcloud.com/16677&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/16677&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="32389">LU-7231</key>
            <summary>ENOSPC on remote MDT might create an inconsistent striped directory</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="di.wang">Di Wang</assignee>
                                    <reporter username="di.wang">Di Wang</reporter>
                        <labels>
                    </labels>
                <created>Wed, 30 Sep 2015 01:59:55 +0000</created>
                <updated>Mon, 26 Oct 2015 18:43:59 +0000</updated>
                            <resolved>Mon, 26 Oct 2015 18:43:45 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="128855" author="bzzz" created="Wed, 30 Sep 2015 04:57:25 +0000"  >&lt;p&gt;instead we should reserve space, IMO. i.e. have grants for metadata.&lt;/p&gt;</comment>
                            <comment id="128856" author="di.wang" created="Wed, 30 Sep 2015 06:31:08 +0000"  >&lt;p&gt;I agree that caching the status of the remote target and having it checked in the declare phase might be the right way to go. But for 2.8, is there a better temporary way to fix it than using &quot;lfs rm_entry&quot; to delete the corrupted striped dir afterwards? (Btw: please check the patch &lt;a href=&quot;http://review.whamcloud.com/16677&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/16677&lt;/a&gt;, thanks!)&lt;/p&gt;</comment>
                            <comment id="128863" author="bzzz" created="Wed, 30 Sep 2015 08:04:55 +0000"  >&lt;p&gt;it&apos;s probably not that trivial to roll back everything. say, a record in changelog. also, having a failure on one target doesn&apos;t mean all the targets failed, right? then we&apos;d need to roll back those too.&lt;/p&gt;</comment>
                            <comment id="128936" author="adilger" created="Wed, 30 Sep 2015 18:33:51 +0000"  >&lt;p&gt;How hard would it be in the error handler (for -ENOSPC, or whatever else) to unlink the local name entry on the master and do a best effort to remove the remote directories?  No huge loss if the remote entries are leaked, but it doesn&apos;t make sense to return success creating a striped directory that isn&apos;t actually usable.&lt;/p&gt;</comment>
                            <comment id="128951" author="di.wang" created="Wed, 30 Sep 2015 20:23:30 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;How hard would it be in the error handler (for -ENOSPC, or whatever else) to unlink the local name entry on the master and do a best effort to remove the remote directories?
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Oh, lfs mkdir will return an error in this case, because the remote transaction will fail, and MDD can still track this error and return it to the client. But it will leave a corrupt striped directory on the server side if we do not do anything there. Right now, the solution is that the user can delete this striped directory manually with lfs rm_entry. See patch &lt;a href=&quot;http://review.whamcloud.com/16677&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/16677&lt;/a&gt;. &lt;/p&gt;</comment>
                            <comment id="128989" author="adilger" created="Thu, 1 Oct 2015 10:29:21 +0000"  >&lt;p&gt;Why not try to unlink the name on the master MDT and the remote slaves if there is an error?  Surely that is better than waiting for the client to do it?  Trying to clean up and failing is no worse than not trying at all and leaving a broken directory behind.&lt;/p&gt;</comment>
                            <comment id="131596" author="adilger" created="Mon, 26 Oct 2015 18:43:45 +0000"  >&lt;p&gt;This will be fixed by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7230&quot; title=&quot;memory leak in sanityn.sh 90 &amp;amp; 91&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7230&quot;&gt;&lt;del&gt;LU-7230&lt;/del&gt;&lt;/a&gt;. &lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="32388">LU-7230</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxp33:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>