<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:32:59 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17142] MGC long time connection</title>
                <link>https://jira.whamcloud.com/browse/LU-17142</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Initially, node n03 hosted MDT0 (combined with the MGS) and MDT1. Then MDT1 was failed over and MDT0 was failed back; MDT1 was started first.&lt;br/&gt;
After the MGS started, the MGC did not connect to it even though both were on the same node. As a result, the config lock could not be obtained and MDT0 failed to start.&lt;br/&gt;
Here are the connection attempts to the MGS:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000100:02000400:1.0:1684425810.091309:0:67373:0:(import.c:1234:ptlrpc_connect_interpret()) Evicted from MGS (at 90@kfi) after server handle changed from 0xf8988cf2594b4a99 to 0xad30caba1bb5141f
00000100:00080000:17.0:1684425957.393672:0:988963:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 0@lo last attempt 0
00000100:00080000:17.0:1684425957.393676:0:988963:0:(import.c:615:import_select_connection()) MGC90@kfi: import 00000000e1884310 using connection MGC90@kfi_0/0@lo
00000100:00080000:15.0:1684426028.139221:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on MGC90@kfi_0 failed (-110)
00000100:00080000:12.0:1684426028.139231:0:968287:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 0@lo last attempt 1984805
00000100:00080000:12.0:1684426028.139233:0:968287:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 57@kfi last attempt 0
00000100:00080000:12.0:1684426028.139245:0:968287:0:(import.c:606:import_select_connection()) MGC90@kfi: Connection changing to MGS (at 57@kfi)
00000100:00080000:12.0:1684426028.139246:0:968287:0:(import.c:615:import_select_connection()) MGC90@kfi: import 00000000e1884310 using connection MGC90@kfi_1/57@kfi
00000100:00080000:0.0:1684426099.819204:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on 57@kfi failed (-110)
00000100:00080000:2.0:1684426099.819218:0:991028:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 0@lo last attempt 1984805
00000100:00080000:2.0:1684426099.819220:0:991028:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 57@kfi last attempt 1984876
00000100:00080000:2.0:1684426099.819221:0:991028:0:(import.c:581:import_select_connection()) MGC90@kfi: tried all connections, increasing latency to 66s
00000100:00080000:2.0:1684426099.819238:0:991028:0:(import.c:606:import_select_connection()) MGC90@kfi: Connection changing to MGS (at 0@lo)
00000100:00080000:2.0:1684426099.819240:0:991028:0:(import.c:615:import_select_connection()) MGC90@kfi: import 00000000e1884310 using connection MGC90@kfi_0/0@lo
20000000:00000040:11.0:1684426115.484210:0:995298:0:(mgs_handler.c:1397:mgs_init0()) MGS MGS started
00000100:00080000:4.0:1684426170.475202:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on 90@kfi failed (-110)
00000100:00080000:5.0:1684426170.475211:0:971094:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 0@lo last attempt 1984948
00000100:00080000:5.0:1684426170.475213:0:971094:0:(import.c:534:import_select_connection()) MGC90@kfi: connect to NID 57@kfi last attempt 1984876
00010000:00010000:11.0:1684426186.859281:0:995298:0:(ldlm_request.c:1045:ldlm_cli_enqueue()) ### client-side enqueue START, flags 0x1000000000000 ns: MGC90@kfi lock: 0000000054f13717/0xad30caba1bb518e1 lrc: 3/1,0 mode: --/CR res: [0x32316f6d6c6a6b:0x0:0x0].0x0 rrc: 2 type: PLN flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 995298 timeout: 0 lvb_type: 0
00010000:00000040:11.0:1684426186.859283:0:995298:0:(ldlm_resource.c:1648:ldlm_resource_putref()) putref res: 000000000b5a37f1 count: 1
00010000:00010000:11.0:1684426186.859285:0:995298:0:(ldlm_request.c:1132:ldlm_cli_enqueue()) ### sending request ns: MGC90@kfi lock: 0000000054f13717/0xad30caba1bb518e1 lrc: 3/1,0 mode: --/CR res: [0x32316f6d6c6a6b:0x0:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0x0 expref: -99 pid: 995298 timeout: 0 lvb_type: 0
00010000:00000040:11.0:1684426186.859287:0:995298:0:(ldlm_resource.c:1648:ldlm_resource_putref()) putref res: 000000000b5a37f1 count: 1
00000100:00000040:11.0:1684426186.859291:0:995298:0:(lustre_net.h:2404:ptlrpc_rqphase_move()) @@@ move request phase from New to Rpc&#160; req@00000000c8cf8fee x1764170257385152/t0(0) o101-&amp;gt;MGC90@kfi@57@kfi:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl New:QU/0/ffffffff rc 0/-1 job:&apos;mount.lustre.0&apos;
00000100:00080000:11.0:1684426186.859294:0:995298:0:(client.c:1665:ptlrpc_send_new_req()) @@@ req waiting for recovery: (FULL != CONNECTING)&#160; req@00000000c8cf8fee x1764170257385152/t0(0) o101-&amp;gt;MGC90@kfi@57@kfi:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:WQU/0/ffffffff rc 0/-1 job:&apos;mount.lustre.0&apos;

00000100:00080000:11.0:1684426193.003178:0:995298:0:(client.c:1260:ptlrpc_import_delay_req()) @@@ send limit expired&#160; req@00000000c8cf8fee x1764170257385152/t0(0) o101-&amp;gt;MGC90@kfi@57@kfi:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:WQU/0/ffffffff rc 0/-1 job:&apos;mount.lustre.0&apos;
10000000:01000000:11.0:1684426193.003262:0:995298:0:(mgc_request.c:2136:mgc_process_log()) MGC90@kfi: configuration from log &apos;kjlmo12-MDT0000&apos; failed (-5).
00000020:02020000:11.0:1684426193.003265:0:995298:0:(obd_mount.c:109:lustre_process_log()) 15c-8: MGC90@kfi: Confguration from log kjlmo12-MDT0000 failed from MGS -5. Communication error between node &amp;amp; MGS, a bad configuration, or other errors. See syslog for more info
00000020:00020000:11.0:1684426193.022368:0:995298:0:(obd_mount_server.c:1425:server_start_targets()) failed to start server kjlmo12-MDT0000: -5
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
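&lt;p&gt;A minimal sketch, not part of the original report, for reading the timing out of the log above. It assumes the timestamp is the fourth colon-separated field of each debug log line, as in the lines shown; the names LOG and event_times are hypothetical helpers introduced only for this illustration.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import re

# A few lines copied verbatim from the debug log in this ticket.
LOG = """\
00000100:00080000:15.0:1684426028.139221:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on MGC90@kfi_0 failed (-110)
00000100:00080000:0.0:1684426099.819204:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on 57@kfi failed (-110)
20000000:00000040:11.0:1684426115.484210:0:995298:0:(mgs_handler.c:1397:mgs_init0()) MGS MGS started
00000100:00080000:4.0:1684426170.475202:0:67373:0:(import.c:1435:ptlrpc_connect_interpret()) recovery of MGS on 90@kfi failed (-110)
"""


def event_times(log, pattern):
    """Return timestamps (in seconds) of lines whose message matches pattern.

    Assumption: the timestamp is the fourth colon-separated field.
    """
    times = []
    for line in log.splitlines():
        # Split off the first four fields; the rest is the message.
        fields = line.split(":", 4)
        if len(fields) == 5 and re.search(pattern, fields[4]):
            times.append(float(fields[3]))
    return times


failures = event_times(LOG, r"recovery of MGS on \S+ failed \(-110\)")
# Time between consecutive failed connect attempts.
gaps = [round(later - earlier, 1) for earlier, later in zip(failures, failures[1:])]
mgs_start = event_times(LOG, r"MGS MGS started")[0]
# Time from MGS start until the in-flight connect attempt finally failed.
wait_after_start = round(failures[-1] - mgs_start, 1)

print("seconds between failed connects:", gaps)
print("seconds from MGS start to the next connect failure:", wait_after_start)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;For the lines above, the failed connects are roughly 70 s apart, and the attempt that was already in flight when the MGS started only fails about 55 s later.&lt;/p&gt;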
&lt;p&gt;MGC reconnection through the pinger waits for all outstanding requests to time out before making a new attempt. In some situations this prevents the MGC from connecting to an MGS on the local node, so MDT0000 fails to start.&lt;/p&gt;</description>
                <environment>2 MDTs in failover pair</environment>
        <key id="78083">LU-17142</key>
            <summary>MGC long time connection</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="aboyko">Alexander Boyko</assignee>
                                    <reporter username="aboyko">Alexander Boyko</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Mon, 25 Sep 2023 12:36:39 +0000</created>
                <updated>Wed, 10 Jan 2024 16:47:12 +0000</updated>
                            <resolved>Sat, 18 Nov 2023 21:57:19 +0000</resolved>
                                    <version>Upstream</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="387082" author="gerrit" created="Mon, 25 Sep 2023 12:56:00 +0000"  >&lt;p&gt;&quot;Alexander Boyko &amp;lt;alexander.boyko@hpe.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52498&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52498&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17142&quot; title=&quot;MGC long time connection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17142&quot;&gt;&lt;del&gt;LU-17142&lt;/del&gt;&lt;/a&gt; mgc: reconnection without pinger&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5fce9cfd00a4527675b10e91bbae17fd35638355&lt;/p&gt;</comment>
                            <comment id="387404" author="eaujames" created="Wed, 27 Sep 2023 14:55:37 +0000"  >&lt;p&gt;Hi Alexander,&lt;/p&gt;

&lt;p&gt;Can this be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16204&quot; title=&quot;Connections from MGC to a Combined MGS/MDT on failover node not working &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16204&quot;&gt;LU-16204&lt;/a&gt;?&lt;/p&gt;</comment>
                            <comment id="387542" author="aboyko" created="Thu, 28 Sep 2023 09:53:59 +0000"  >&lt;p&gt;&amp;gt;Can this be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16204&quot; title=&quot;Connections from MGC to a Combined MGS/MDT on failover node not working &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16204&quot;&gt;LU-16204&lt;/a&gt;?&lt;br/&gt;
Not exactly. The description shows a problem connecting to the local MGS when the MGC and MGS are started on the same node.&lt;/p&gt;</comment>
                            <comment id="393496" author="gerrit" created="Sat, 18 Nov 2023 21:41:57 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/52498/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/52498/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17142&quot; title=&quot;MGC long time connection&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17142&quot;&gt;&lt;del&gt;LU-17142&lt;/del&gt;&lt;/a&gt; mgc: reconnection without pinger&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 867ba433e3a0fce4a1b2f8d37a91d550ada41a26&lt;/p&gt;</comment>
                            <comment id="393526" author="pjones" created="Sat, 18 Nov 2023 21:57:19 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="79927">LU-17412</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="78103">LU-17147</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03wnj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>