<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:49:01 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12025] Adding OST may cause EIO - delay activation of new OSTs on existing filesystem</title>
                <link>https://jira.whamcloud.com/browse/LU-12025</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;During OST addition:&lt;br/&gt;
1. The MDT receives the refreshed configuration first.&lt;br/&gt;
2. The MDT then gets a create request from a client, may allocate the object on the new OST, and replies to the client.&lt;br/&gt;
3. If the client has not refreshed its configuration yet and then does I/O using the EA, it may get EIO because it does not know about the new OST.&lt;/p&gt;

&lt;p&gt;So we probably need to add a version to the config log to prevent this.&lt;/p&gt;

&lt;p&gt;This happens quite often in cloud environments, though I am not sure whether a duplicate ticket already exists.&lt;/p&gt;




</description>
                <environment></environment>
        <key id="55005">LU-12025</key>
            <summary>Adding OST may cause EIO - delay activation of new OSTs on existing filesystem</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="adilger">Andreas Dilger</assignee>
                                    <reporter username="di.wang">Di Wang</reporter>
                        <labels>
                    </labels>
                <created>Wed, 27 Feb 2019 03:34:02 +0000</created>
                <updated>Sun, 19 Nov 2023 00:24:42 +0000</updated>
                            <resolved>Fri, 8 Nov 2019 07:56:00 +0000</resolved>
                                                    <fixVersion>Lustre 2.13.0</fixVersion>
                    <fixVersion>Lustre 2.12.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="242913" author="adilger" created="Wed, 27 Feb 2019 05:20:34 +0000"  >&lt;p&gt;I saw another related comment recently about the desire to be able to configure new OSTs on the filesystem, but not have them immediately active on the MDS until the administrator wants them enabled.&lt;/p&gt;

&lt;p&gt;I suspect there are a number of simple mechanisms that might allow this to happen:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;have a time delay between when the OST is first added to the filesystem before it can be used for object allocation, maybe 30-60s IFF the OSTs are added and there are clients already mounted and the MDS is not &lt;em&gt;just&lt;/em&gt; newly formatted.  I suspect we already have workarounds for this issue in &lt;tt&gt;test-framework.sh&lt;/tt&gt; (e.g. sleep for some time until the OST is visible on the client) because we are constantly formatting and adding new OSTs to the filesystem, but possibly we are mounting the client after the OSTs and don&apos;t see it.  There are probably a &lt;em&gt;few&lt;/em&gt; places where an OST is added to a running filesystem during a test, but maybe we don&apos;t try to write to it immediately.&lt;/li&gt;
	&lt;li&gt;pass a flag from the OST at the first connect time to the MDS that indicates the OST should be inactive until the admin enables it manually.  This may be desirable for a number of reasons.  This could be done by setting &lt;tt&gt;osp.&amp;lt;fsname&amp;gt;-OSTnnnn.max_create_count=0&lt;/tt&gt; on the MDS for the new OSTnnnn the first time it is connected.&lt;/li&gt;
	&lt;li&gt;mark the OST unavailable locally (e.g. &lt;tt&gt;OS_STATE_ENOINO&lt;/tt&gt; or something set via &lt;tt&gt;mkfs.lustre&lt;/tt&gt;) &lt;em&gt;before&lt;/em&gt; it mounts and tries to connect to the MDS the first time so there is no window on the MDS when there is a problem, and it keeps the implementation more localized instead of depending on the MDS as well&lt;/li&gt;
&lt;/ul&gt;
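
&lt;p&gt;A minimal sketch of how the first two mechanisms in the list above (timed activation and a manual hold) could be gated on the MDS side. The structure, the function name, and the 30s constant are assumptions made for illustration only; they are not taken from the Lustre source:&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-OST record kept by the MDS; not a real Lustre structure. */
struct ost_target {
	uint64_t added_at;    /* seconds: when the OST was first added */
	bool     manual_hold; /* true: stay inactive until the admin enables it */
};

/* Assumed grace period; the list above suggests something like 30-60s. */
#define ACTIVATION_DELAY_SEC 30

/* Return true if the OST may be used for new object allocation at 'now'. */
static bool ost_usable_for_create(const struct ost_target *tgt, uint64_t now)
{
	assert(tgt != NULL);
	if (tgt->manual_hold)
		return false; /* manual mode: wait for the administrator */
	if (now < tgt->added_at)
		return false; /* clock went backwards: stay conservative */
	return now - tgt->added_at >= ACTIVATION_DELAY_SEC;
}
```
]]>
&lt;p&gt;In this sketch the allocator would simply skip any OST for which &lt;tt&gt;ost_usable_for_create()&lt;/tt&gt; returns false, so both modes share one check.&lt;/p&gt;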


&lt;p&gt;Allowing both modes (automatic activation after timeout, wait for manual activation) may be useful under different circumstances.&lt;/p&gt;</comment>
                            <comment id="242965" author="di.wang" created="Wed, 27 Feb 2019 17:48:07 +0000"  >&lt;p&gt;My initial thought is that the MGS maintains a version number, which is bumped when a new target is added (or removed?); all other nodes then get the version number when they fetch the config log from the MGS. When a client sends a request to a server and their version numbers do not match, either the client or the server needs to refresh its configuration from the MGS.&lt;/p&gt;
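
&lt;p&gt;As a rough sketch of that comparison (the enum and function names here are hypothetical and are not part of the Lustre protocol or source):&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of comparing config versions; not a Lustre type. */
enum cfg_action {
	CFG_PROCEED,        /* versions match: handle the request */
	CFG_CLIENT_REFRESH, /* client is behind: refetch config from the MGS */
	CFG_SERVER_REFRESH, /* server is behind: refetch config from the MGS */
};

/* The MGS would bump the version whenever a target is added. */
static uint64_t cfg_bump_version(uint64_t ver)
{
	assert(ver != UINT64_MAX); /* sketch only: no wrap-around handling */
	return ver + 1;
}

/* Decide who, if anyone, has to refresh before the request is processed. */
static enum cfg_action cfg_check_version(uint64_t client_ver,
					 uint64_t server_ver)
{
	if (client_ver < server_ver)
		return CFG_CLIENT_REFRESH;
	if (client_ver > server_ver)
		return CFG_SERVER_REFRESH;
	return CFG_PROCEED;
}
```
]]>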

&lt;p&gt;Going further, each server target could also maintain its own version number (set when it is added to the MGS), so that the target can process a request as long as the request&apos;s version number is newer than the target&apos;s version.&lt;/p&gt;</comment>
                            <comment id="242972" author="adilger" created="Wed, 27 Feb 2019 19:59:03 +0000"  >&lt;p&gt;It isn&apos;t clear that having a per-target version number would help.  The target being added (OST) can provide a version number, but the client shouldn&apos;t have to pass its &quot;connect/config version&quot; for every OST it knows about to the MDS for every file it is creating.&lt;/p&gt;

&lt;p&gt;Sending a single &quot;config record version&quot; (really just the last MGS config record number that the client processed) from the client to the MDS with each create would be more useful, since this would (indirectly) tell the MDS which OSTs the client is connected to and it could skip ones that were added after that version.  Something like storing a &quot;minimum config record number&quot; on each target in LOD which is the config llog record in which the OST was added, and the client request would include their &quot;current config record number&quot; along with each request.   A check like:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (req-&amp;gt;current_config_rec &amp;lt; lod-&amp;gt;target[ost_idx]-&amp;gt;tgt_min_config_rec)
                &lt;span class=&quot;code-keyword&quot;&gt;continue&lt;/span&gt;;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;during create would be enough to skip the OST for that create.&lt;/p&gt;
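
&lt;p&gt;Expanded into a self-contained sketch, the check above might sit in an allocation loop like the following. &lt;tt&gt;lod_tgt_sketch&lt;/tt&gt; and &lt;tt&gt;lod_pick_known_osts()&lt;/tt&gt; are hypothetical stand-ins for illustration, not the real LOD structures or functions:&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a LOD target entry. */
struct lod_tgt_sketch {
	uint64_t tgt_min_config_rec; /* config llog record that added this OST */
};

/*
 * Collect the indices of OSTs that the client already knows about.
 * Returns the number of indices written to 'out' (at most 'out_cap').
 */
static size_t lod_pick_known_osts(const struct lod_tgt_sketch *tgts,
				  size_t ntgts, uint64_t current_config_rec,
				  size_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(tgts != NULL && out != NULL);
	for (size_t i = 0; i < ntgts && n < out_cap; i++) {
		/* OST was added after the client's last processed record */
		if (current_config_rec < tgts[i].tgt_min_config_rec)
			continue;
		out[n++] = i;
	}
	return n;
}
```
]]>
&lt;p&gt;With targets added at records 10, 20 and 35, a client that has processed up to record 20 would be offered only the first two OSTs.&lt;/p&gt;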

&lt;p&gt;&lt;b&gt;HOWEVER&lt;/b&gt; this has some significant drawbacks:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;it needs a protocol change so that clients will always send this field with each create request and the MDS will check it, to fix a problem that happens very rarely for most users&lt;/li&gt;
	&lt;li&gt;the config llog record numbers &lt;em&gt;may&lt;/em&gt; not be easily accessible in the right parts of the code (haven&apos;t looked at that yet)&lt;/li&gt;
	&lt;li&gt;it would only fix the problem of &lt;b&gt;one&lt;/b&gt; client creating and using a file &lt;b&gt;itself&lt;/b&gt;, but would not fix the problem of a &lt;b&gt;different&lt;/b&gt; client trying to access that file before it had processed the config llog updates (which may be delayed tens of seconds if there are many clients)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So, my suggestion for a simpler solution is just to use a timeout (maybe variable), based on how quickly config llog records are processed by clients, before an OST (or MDT, for DNE) can be used for new file allocations.  There is already a small delay before the OST could be used, because the MDS needs to precreate objects there, but that is only a fraction of a second in most cases.  Instead of storing the &quot;config version&quot; in the LOD target, store the &quot;config time&quot; for the target, and skip it for new allocations for e.g. 10s after it connects.  This should handle the case where the MDS itself was just mounted and all OSTs are pre-existing (e.g. only delay usage if the &lt;tt&gt;lov_objids&lt;/tt&gt; entry was just added).&lt;/p&gt;</comment>
                            <comment id="242975" author="di.wang" created="Wed, 27 Feb 2019 20:19:54 +0000"  >&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;it would only fix the problem of one client creating and using a file itself, but would not fix the problem of a different client trying to access that file before it had processed the config llog updates (which may be delayed tens of seconds if there are many clients)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Oh, I meant that the stale client should not be allowed to access the &quot;newer&quot; server until it refreshes its own config from the MGS, but this would indeed need a lot of changes and is probably not worth it for this issue, as you said.  Adding a timeout is probably good enough here. Thanks.&lt;/p&gt;
</comment>
                            <comment id="242982" author="adilger" created="Wed, 27 Feb 2019 23:00:19 +0000"  >&lt;p&gt;In relation to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11963&quot; title=&quot;Add nonrotational flag to obd_statfs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11963&quot;&gt;&lt;del&gt;LU-11963&lt;/del&gt;&lt;/a&gt;, one easy way to allow a new OST to be added to the filesystem that would still prevent the MDS from using it immediately would be to send the &quot;&lt;tt&gt;OS_STATE_NOPRECREATE&lt;/tt&gt;&quot; flag from the OST in the &lt;tt&gt;OST_STATFS&lt;/tt&gt; RPC reply.  The MDS already checks for this (it sets it internally), similar to the &quot;&lt;tt&gt;OS_STATE_DEGRADED&lt;/tt&gt;&quot; flag, but it is an absolute rather than a hint.&lt;/p&gt;

&lt;p&gt;The code in &lt;tt&gt;osp_pre_update_status()&lt;/tt&gt; needs to be cleaned up to avoid &lt;b&gt;clearing&lt;/b&gt; the &lt;tt&gt;OS_STATE_NOPRECREATE&lt;/tt&gt; flag (and probably &lt;tt&gt;OS_STATE_ENOSPC&lt;/tt&gt; and &lt;tt&gt;OS_STATE_ENOINO&lt;/tt&gt;, so the OST could send these itself as well).  The &lt;tt&gt;obd_statfs&lt;/tt&gt; structure is refreshed every few seconds, so the MDS shouldn&apos;t need to clear the flags it sets itself.&lt;/p&gt;
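
&lt;p&gt;A rough sketch of the cleanup described above: the MDS starts from the flags the OST sent and only adds its own, never clearing them. The &lt;tt&gt;SK_STATE_*&lt;/tt&gt; names, their bit values, and both functions are placeholders chosen for illustration; they are not the real &lt;tt&gt;OS_STATE_*&lt;/tt&gt; definitions or the actual OSP code:&lt;/p&gt;
<![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration; not the real OS_STATE_* values. */
#define SK_STATE_DEGRADED    0x1u /* hint only: avoid if possible */
#define SK_STATE_NOPRECREATE 0x2u /* absolute: never precreate here */
#define SK_STATE_ENOSPC      0x4u /* out of space */
#define SK_STATE_ENOINO      0x8u /* out of inodes */

/* Any of these flags blocks object precreation outright. */
#define SK_STATE_BLOCKING \
	(SK_STATE_NOPRECREATE | SK_STATE_ENOSPC | SK_STATE_ENOINO)

/* Merge the OST-reported state with the MDS's own space conclusions. */
static uint32_t sk_update_state(uint32_t ost_state, bool mds_low_space,
				bool mds_low_inodes)
{
	uint32_t state = ost_state; /* keep every flag the OST sent */

	if (mds_low_space)
		state |= SK_STATE_ENOSPC;
	if (mds_low_inodes)
		state |= SK_STATE_ENOINO;

	/* Postcondition: nothing the OST reported was cleared. */
	assert((state & ost_state) == ost_state);
	return state;
}

/* The degraded flag is only a hint, so it does not block precreation. */
static bool sk_can_precreate(uint32_t state)
{
	return (state & SK_STATE_BLOCKING) == 0;
}
```
]]>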

&lt;p&gt;Also, it appears there is a bit of a race in &lt;tt&gt;osp_statfs_interpret()&lt;/tt&gt; calling &lt;tt&gt;osp_pre_update_status()&lt;/tt&gt;:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
        d-&amp;gt;opd_statfs = *msfs;

        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (d-&amp;gt;opd_pre)
                osp_pre_update_status(d, rc);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;since there is a window between when the new &lt;tt&gt;&amp;#42;msfs&lt;/tt&gt; is stored in &lt;tt&gt;opd_statfs&lt;/tt&gt; (which is what &lt;tt&gt;osp_statfs()&lt;/tt&gt; returns) and when &lt;tt&gt;osp_pre_update_status()&lt;/tt&gt; might set the various &lt;tt&gt;OS_STATE_&amp;#42;&lt;/tt&gt; flags that the create threads check.  That may allow file creation on an OST that should otherwise be unavailable (out of space, disabled, etc.).  Instead, the new &lt;tt&gt;&amp;#42;msfs&lt;/tt&gt; should be passed as an argument to &lt;tt&gt;osp_pre_update_status()&lt;/tt&gt; and updated &lt;em&gt;before&lt;/em&gt; it is stored into &lt;tt&gt;opd_statfs&lt;/tt&gt;.&lt;/p&gt;</comment>
                            <comment id="243099" author="pfarrell" created="Thu, 28 Feb 2019 22:31:34 +0000"  >&lt;p&gt;I think the request from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12036&quot; title=&quot;Add option to create new OSTs inactive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12036&quot;&gt;&lt;del&gt;LU-12036&lt;/del&gt;&lt;/a&gt;&#160;could/should probably be integrated here.&lt;/p&gt;</comment>
                            <comment id="248184" author="gerrit" created="Sat, 1 Jun 2019 05:05:21 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/35029&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35029&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12025&quot; title=&quot;Adding OST may cause EIO - delay activation of new OSTs on existing filesystem&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12025&quot;&gt;&lt;del&gt;LU-12025&lt;/del&gt;&lt;/a&gt; osp: allow OS_STATE_* flags from OSTs&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 1227336419caed88199f8d21076fb9358070f004&lt;/p&gt;</comment>
                            <comment id="256886" author="gerrit" created="Tue, 22 Oct 2019 23:57:28 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/35029/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/35029/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12025&quot; title=&quot;Adding OST may cause EIO - delay activation of new OSTs on existing filesystem&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12025&quot;&gt;&lt;del&gt;LU-12025&lt;/del&gt;&lt;/a&gt; osp: allow OS_STATE_* flags from OSTs&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 9b0ebf78f7919a144673edadc4a95bad84fae2d3&lt;/p&gt;</comment>
                            <comment id="257983" author="adilger" created="Fri, 8 Nov 2019 07:56:00 +0000"  >&lt;p&gt;This allows the OST to completely disable itself from precreation without having to hack around in the state.  Using &quot;degraded&quot; only partially disables the OST (it can still be used in an emergency), and faking &quot;out of space&quot; is a hack.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12036&quot; title=&quot;Add option to create new OSTs inactive&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12036&quot;&gt;&lt;del&gt;LU-12036&lt;/del&gt;&lt;/a&gt; we should allow mounting the OST with the &quot;&lt;tt&gt;OS_STATE_NOPRECREATE&lt;/tt&gt;&quot; flag set, so that it can be mounted but it will not be used.&lt;/p&gt;</comment>
                            <comment id="258838" author="gerrit" created="Tue, 26 Nov 2019 15:46:55 +0000"  >&lt;p&gt;Minh Diep (mdiep@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36872&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36872&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12025&quot; title=&quot;Adding OST may cause EIO - delay activation of new OSTs on existing filesystem&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12025&quot;&gt;&lt;del&gt;LU-12025&lt;/del&gt;&lt;/a&gt; osp: allow OS_STATE_* flags from OSTs&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8f2b85288f973681bd8bd6d8a04f421f57a78a04&lt;/p&gt;</comment>
                            <comment id="259764" author="gerrit" created="Thu, 12 Dec 2019 23:05:45 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/36872/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36872/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12025&quot; title=&quot;Adding OST may cause EIO - delay activation of new OSTs on existing filesystem&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12025&quot;&gt;&lt;del&gt;LU-12025&lt;/del&gt;&lt;/a&gt; osp: allow OS_STATE_* flags from OSTs&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: b0194200146a54ee45df208da88dcc6b916fb51f&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="55032">LU-12036</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="79002">LU-17299</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="57439">LU-12998</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00cdr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>