<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:18:58 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15514] Do not wait for clients to start recovery if there are no clients.</title>
                <link>https://jira.whamcloud.com/browse/LU-15514</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;With the idle-disconnect code, a situation can arise where the entire cluster is idle for some time and, when all the servers restart, recovery on the OSTs does not start because there are no client connections. The MDT connections to the OSTs are rejected because they are considered to be new connections.&lt;/p&gt;

&lt;p&gt;We need to either accept new MDT connections, similar to what we do when the MDT and OST are colocated on the same node, or start the recovery timer on the first such connection and then proceed with eviction when the timeout expires, allowing them to rejoin as the new clients they are.&lt;/p&gt;

&lt;p&gt;Failing to do this would delay the entire cluster: when the idle-disconnected clients become active again, they would need to wait for recovery to finish first, even if the server restart happened long ago.&lt;/p&gt;</description>
                <environment></environment>
        <key id="68467">LU-15514</key>
            <summary>Do not wait for clients to start recovery if there are no clients.</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Wed, 2 Feb 2022 22:22:16 +0000</created>
                <updated>Thu, 3 Feb 2022 03:03:09 +0000</updated>
                                            <version>Lustre 2.15.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="325039" author="adilger" created="Wed, 2 Feb 2022 23:50:11 +0000"  >&lt;p&gt;IIRC, the MDT-&amp;gt;OST connections &lt;em&gt;used&lt;/em&gt; to use a fixed UUID like &quot;&lt;tt&gt;$fsname-MDT0000_UUID&lt;/tt&gt;&quot; or something, so that they could always connect to the OSTs, even during recovery.  It&apos;s possible that this was changed at one point because of LWP/OUT connections, or something, but this should be investigated.  The MDT should not be blocked from connecting during recovery.&lt;/p&gt;

&lt;p&gt;Whether the MDT connections should trigger the recovery timer is a separate issue.  I think they should &lt;b&gt;not&lt;/b&gt;; otherwise, if the storage cluster is disconnected from the compute nodes because of a brief switch problem, all clients would be evicted.  However, as Oleg noted, if there are no other connections besides the MDT(s), then recovery could finish immediately.&lt;/p&gt;</comment>
                            <comment id="325063" author="green" created="Thu, 3 Feb 2022 03:03:09 +0000"  >&lt;p&gt;Yes, an MDT connection alone should not trigger the start of recovery, unless MDT connections are the only type of records in last_rcvd.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02h4f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>