<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:14:33 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8089] MGT/MDT mount fails on secondary HA node</title>
                <link>https://jira.whamcloud.com/browse/LU-8089</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The error happens during soak testing of build &apos;20160427&apos; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160427&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160427&lt;/a&gt;). DNE is enabled. OSTs had been formatted with zfs, MDT&apos;s using ldiskfs as storage backend. There&apos;s 1 MDT per MDS and 4 OSTs per OSS. The OSS and MDT nodes are configured in HA active-active failover configuration. &lt;/p&gt;

&lt;p&gt;The configuration, especially the mapping of node to role can be found here: &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration&lt;/a&gt;.&lt;br/&gt;
For this ticket, MDS nodes &lt;tt&gt;lola-&lt;span class=&quot;error&quot;&gt;&amp;#91;8,9&amp;#93;&lt;/span&gt;&lt;/tt&gt;, which form an HA cluster, are of interest.&lt;/p&gt;

&lt;p&gt;After a hard failover of the primary node (lola-8) had been initiated using &lt;tt&gt;pm -c&lt;/tt&gt;, mounting the MGT/MDT (one device) on the secondary node (lola-9) failed with the error message:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Apr 29 14:00:36 lola-9 kernel: LDISKFS-fs warning (device dm-9): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, pleas
e wait.
...
...
Apr 29 14:19:06 lola-9 kernel: Lustre: Skipped 2 previous similar messages
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(genops.c:334:class_newdev()) Device MGC192.168.1.109@o2ib10 already exists at 2, won&apos;t add
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(obd_config.c:370:class_attach()) Cannot create device MGC192.168.1.109@o2ib10 of type mgc : -17
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(obd_mount.c:198:lustre_start_simple()) MGC192.168.1.109@o2ib10 attach error -17
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(obd_mount_server.c:1512:server_put_super()) no obd soaked-MDT0000
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(obd_mount_server.c:140:server_deregister_mount()) soaked-MDT0000 not registered
Apr 29 14:19:06 lola-9 kernel: Lustre: 7204:0:(service.c:2096:ptlrpc_server_handle_request()) Skipped 9 previous similar messages
Apr 29 14:19:06 lola-9 kernel: Lustre: server umount soaked-MDT0000 complete
Apr 29 14:19:06 lola-9 kernel: LustreError: 32539:0:(obd_mount.c:1450:lustre_fill_super()) Unable to mount  (-17)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The exact sequence of events is as follows:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-04-29 13:53:59,419:fsmgmt.fsmgmt:INFO     triggering fault mds_failover  for lola-8&lt;/li&gt;
	&lt;li&gt;2016-04-29 14:00:25,257:fsmgmt.fsmgmt:INFO     lola-8 is up!!!&lt;/li&gt;
	&lt;li&gt;2016-04-29 14:00:36,270:fsmgmt.fsmgmt:INFO     Mounting soaked-MDT0000 on lola-9 ...&lt;/li&gt;
	&lt;li&gt;2016-04-29 14:19:06   mount failed with message above&lt;br/&gt;
&lt;b&gt;Attached files:&lt;/b&gt;&lt;br/&gt;
lola-9 messages, console files, debug log files lustre-log.1461964594.5065 and lustre-log.1461964707.7204&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The effect can be reproduced manually.&lt;br/&gt;
The device could &lt;em&gt;not&lt;/em&gt; be mounted until mdt-1 (the primary resource of lola-9) had been unmounted and the Lustre modules had been reloaded.&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;Trymanually&quot;&gt;&lt;/a&gt;Try manually&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/disk/by-id/dm-name-360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0/
mount.lustre: mount /dev/mapper/360080e50002ff4f00000026952013088p1 at /mnt/soaked-mdt0 failed: File exists
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/disk/by-id/dm-name-360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0/
mount.lustre: mount /dev/mapper/360080e50002ff4f00000026952013088p1 at /mnt/soaked-mdt0 failed: File exists
[root@lola-9 ~]# date
Mon May  2 00:39:25 PDT 2016
[root@lola-9 ~]# date
Mon May  2 00:39:52 PDT 2016
[root@lola-9 ~]# lctl debug_kernel /tmp/lustre-log.2016-05-02-0040
Debug log: 2321 lines, 2321 kept, 0 dropped, 0 bad.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;b&gt;---&amp;gt; attached file&lt;/b&gt; lustre-log.2016-05-02-0040&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;umountreloadLustremodulesonlola9&quot;&gt;&lt;/a&gt;umount + reload Lustre modules on lola-9&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;##### Umount and remove lustre mods
[root@lola-9 ~]# umount  /mnt/soaked-mdt1
[root@lola-9 ~]# lustre_rmmod
[root@lola-9 ~]# echo $?
0
[root@lola-9 ~]# lustre_rmmod
[root@lola-9 ~]# lctl dl
[root@lola-9 ~]# echo $?
2
[root@lola-9 ~]# date
Mon May  2 00:41:10 PDT 2016
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/disk/by-id/dm-name-360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026d52013098p1 /mnt/soaked-mdt1
[root@lola-9 ~]# ^Ctl debug_kernel 
[root@lola-9 ~]# date
Mon May  2 00:43:52 PDT 2016
[root@lola-9 ~]# lctl debug_kernel /tmp/lustre-log.2016-05-02.0044
Debug log: 177060 lines, 177060 kept, 0 dropped, 0 bad.
[root@lola-9 ~]# mount
/dev/sdk1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
10.4.0.1:/export/scratch on /scratch type nfs (rw,addr=10.4.0.1)
10.4.0.1:/home on /home type nfs (rw,addr=10.4.0.1)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/mapper/360080e50002ff4f00000026952013088p1 on /mnt/soaked-mdt0 type lustre (rw,user_xattr)
/dev/mapper/360080e50002ff4f00000026d52013098p1 on /mnt/soaked-mdt1 type lustre (rw,user_xattr)

### one more time
[root@lola-9 ~]# umount /mnt/soaked-mdt1
[root@lola-9 ~]# umount /mnt/soaked-mdt0
[root@lola-9 ~]# lustre_rmmod
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0
[root@lola-9 ~]# echo $?
0
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026d52013098p1 /mnt/soaked-mdt1
[root@lola-9 ~]# echo $?
0
[root@lola-9 ~]# mount
/dev/sdk1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
10.4.0.1:/export/scratch on /scratch type nfs (rw,addr=10.4.0.1)
10.4.0.1:/home on /home type nfs (rw,addr=10.4.0.1)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/mapper/360080e50002ff4f00000026952013088p1 on /mnt/soaked-mdt0 type lustre (rw,user_xattr)
/dev/mapper/360080e50002ff4f00000026d52013098p1 on /mnt/soaked-mdt1 type lustre (rw,user_xattr)
[root@lola-9 ~]# date
Mon May  2 00:49:25 PDT 2016
[root@lola-9 ~]# lctl debug_kernel /tmp/lustre-log.2016-05-02-0049
Debug log: 78514 lines, 78514 kept, 0 dropped, 0 bad.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;b&gt;--&amp;gt; attached&lt;/b&gt; lustre-log.2016-05-02.0044, lustre-log.2016-05-02-0049&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;mountmdt0onlola8anddofailoveragain&quot;&gt;&lt;/a&gt;mount mdt-0 on lola-8 and do failover again&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# Do failover back to lola-8
[root@lola-8 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0
mount.lustre: increased /sys/block/dm-8/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/dm-4/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/sdh/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/sdc/queue/max_sectors_kb from 1024 to 16383
[root@lola-8 ~]# echo $?
0
[root@lola-8 ~]# lctl dl
  0 UP osd-ldiskfs soaked-MDT0000-osd soaked-MDT0000-osd_UUID 35
  1 UP mgs MGS MGS 7
  2 UP mgc MGC192.168.1.108@o2ib10 31b8074d-c86c-18f5-40f6-bbc2e7cf6f72 5
  3 UP mds MDS MDS_uuid 3
  4 UP lod soaked-MDT0000-mdtlov soaked-MDT0000-mdtlov_UUID 4
  5 UP mdt soaked-MDT0000 soaked-MDT0000_UUID 45
  6 UP mdd soaked-MDD0000 soaked-MDD0000_UUID 4
  7 UP qmt soaked-QMT0000 soaked-QMT0000_UUID 4
  8 UP osp soaked-MDT0001-osp-MDT0000 soaked-MDT0000-mdtlov_UUID 5
  9 UP osp soaked-MDT0002-osp-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 10 UP osp soaked-MDT0003-osp-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 11 UP osp soaked-OST0000-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 12 UP osp soaked-OST0001-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 13 UP osp soaked-OST0002-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 14 UP osp soaked-OST0003-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 15 UP osp soaked-OST0004-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 16 UP osp soaked-OST0005-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 17 UP osp soaked-OST0006-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 18 UP osp soaked-OST0007-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 19 UP osp soaked-OST0008-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 20 UP osp soaked-OST0009-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 21 UP osp soaked-OST000a-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 22 UP osp soaked-OST000b-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 23 UP osp soaked-OST000c-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 24 UP osp soaked-OST000d-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 25 UP osp soaked-OST000e-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 26 UP osp soaked-OST000f-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 27 UP osp soaked-OST0010-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 28 UP osp soaked-OST0011-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 29 UP osp soaked-OST0012-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 30 UP osp soaked-OST0013-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 31 UP osp soaked-OST0014-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 32 UP osp soaked-OST0015-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 33 UP osp soaked-OST0016-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 34 UP osp soaked-OST0017-osc-MDT0000 soaked-MDT0000-mdtlov_UUID 5
 35 UP lwp soaked-MDT0000-lwp-MDT0000 soaked-MDT0000-lwp-MDT0000_UUID 5


---&amp;gt;  now do manual hard failover again
[root@lola-9 ~]#  mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0
mount.lustre: mount /dev/mapper/360080e50002ff4f00000026952013088p1 at /mnt/soaked-mdt0 failed: File exists
[root@lola-9 ~]# date
Mon May  2 01:05:13 PDT 2016
[root@lola-9 ~]# lctl debug_kernel /tmp/lustre-log.2016-05-02-0105
Debug log: 560 lines, 560 kept, 0 dropped, 0 bad.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;b&gt;--&amp;gt; attached file&lt;/b&gt; lustre-log.2016-05-02-0105&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;umountmdt1andrestartLustreagain&quot;&gt;&lt;/a&gt;umount mdt-1 and restart Lustre again&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# ---&amp;gt; now restart lustre again on secondary node:
[root@lola-9 ~]# umount /mnt/soaked-mdt1
[root@lola-9 ~]# lustre_rmmod
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026d52013098p1 /mnt/soaked-mdt1
[root@lola-9 ~]# echo $?
0
[root@lola-9 ~]# mount -t lustre -o rw,user_xattr /dev/mapper/360080e50002ff4f00000026952013088p1 /mnt/soaked-mdt0
[root@lola-9 ~]# echo $?
0
[root@lola-9 ~]# mount
/dev/sdk1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
10.4.0.1:/export/scratch on /scratch type nfs (rw,addr=10.4.0.1)
10.4.0.1:/home on /home type nfs (rw,addr=10.4.0.1)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/mapper/360080e50002ff4f00000026d52013098p1 on /mnt/soaked-mdt1 type lustre (rw,user_xattr)
/dev/mapper/360080e50002ff4f00000026952013088p1 on /mnt/soaked-mdt0 type lustre (rw,user_xattr)
[root@lola-9 ~]# date
Mon May  2 01:09:20 PDT 2016
[root@lola-9 ~]# lctl debug_kernel /tmp/lustre-log.2016-05-02-0109
Debug log: 119979 lines, 119979 kept, 0 dropped, 0 bad.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;b&gt;--&amp;gt; attached file&lt;/b&gt; lustre-log.2016-05-02-0109&lt;/p&gt;</description>
                <environment>lola&lt;br/&gt;
build: master commit 71d2ea0fde17ecde0bf237f486d4bafb5d54fe3f + patches</environment>
        <key id="36484">LU-8089</key>
            <summary>MGT/MDT mount fails on secondary HA node</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="heckes">Frank Heckes</reporter>
                        <labels>
                            <label>soak</label>
                    </labels>
                <created>Mon, 2 May 2016 09:12:28 +0000</created>
                <updated>Thu, 14 Jun 2018 21:41:17 +0000</updated>
                            <resolved>Tue, 25 Oct 2016 04:10:56 +0000</resolved>
                                    <version>Lustre 2.9.0</version>
                                    <fixVersion>Lustre 2.9.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>11</watches>
                                                                            <comments>
                            <comment id="150677" author="heckes" created="Mon, 2 May 2016 09:22:44 +0000"  >&lt;p&gt;This seems to affect the MGT only. Failover of the other MDSes (MDTs) runs smoothly.&lt;/p&gt;</comment>
                            <comment id="150721" author="di.wang" created="Mon, 2 May 2016 17:17:13 +0000"  >&lt;p&gt;In lustre-log.2016-05-02-0105, restarting the new MGC on lola-9 fails because an MGC with the same name already exists. But the original MGC should be MGC192.168.1.108@o2ib10, because the original MGS is 108. Frank, could you please list the devices on lola-9 before doing the failover on lola-8. Thanks.  (Or is it a failover MGC?)&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:00000080:16.0:1462176165.502813:0:219876:0:(obd_config.c:362:class_attach()) attach type mgc name: MGC192.168.1.109@o2ib10 uuid: 523ef026-efaa-85c3-2ecd-2c5b9b575258
00000020:00020000:16.0:1462176165.502820:0:219876:0:(genops.c:334:class_newdev()) Device MGC192.168.1.109@o2ib10 already exists at 2, won&apos;t add
00000020:00020000:16.0:1462176165.514006:0:219876:0:(obd_config.c:370:class_attach()) Cannot create device MGC192.168.1.109@o2ib10 of type mgc : -17
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="150736" author="jgmitter" created="Mon, 2 May 2016 17:53:19 +0000"  >&lt;p&gt;Hi Mike,&lt;/p&gt;

&lt;p&gt;We are moving patch &lt;a href=&quot;http://review.whamcloud.com/#/c/13726/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/13726/&lt;/a&gt; to this ticket since it was added to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4214&quot; title=&quot;Hyperion - OST never recovers on failover node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4214&quot;&gt;&lt;del&gt;LU-4214&lt;/del&gt;&lt;/a&gt; after that ticket was already closed, and it is directly related to the issue seen here.  Can you please have a look?&lt;/p&gt;

&lt;p&gt;Thanks.&lt;br/&gt;
Joe&lt;/p&gt;</comment>
                            <comment id="150737" author="adilger" created="Mon, 2 May 2016 17:56:14 +0000"  >&lt;p&gt;Moved patch here from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4214&quot; title=&quot;Hyperion - OST never recovers on failover node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4214&quot;&gt;&lt;del&gt;LU-4214&lt;/del&gt;&lt;/a&gt; since that ticket was closed but the patch was never landed to master:&lt;/p&gt;

&lt;p&gt;Mike Pershin (mike.pershin@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13726&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13726&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8089&quot; title=&quot;MGT/MDT mount fails on secondary HA node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8089&quot;&gt;&lt;del&gt;LU-8089&lt;/del&gt;&lt;/a&gt; lwp: fix LWP client connect logic&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8abaa93afd61f6c28e15e035a1a06ecf7f6d748e&lt;/p&gt;</comment>
                            <comment id="159304" author="heckes" created="Wed, 20 Jul 2016 13:06:08 +0000"  >&lt;p&gt;The error didn&apos;t occur during the soak test of build &lt;a href=&quot;https://build.hpdd.intel.com/job/lustre-master/3406&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://build.hpdd.intel.com/job/lustre-master/3406&lt;/a&gt; (see &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713&lt;/a&gt;) in a test session that is still ongoing and has already lasted 7 days.&lt;/p&gt;</comment>
                            <comment id="159495" author="heckes" created="Thu, 21 Jul 2016 15:33:22 +0000"  >&lt;p&gt;During the night the error (or at least the symptom) happened again for the build mentioned in the previous comment (lustre-master #3406).&lt;br/&gt;
Timing of events:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-07-20 08:08:17  - failover started for MDS node lola-8 (MDT0000 == mgs)&lt;/li&gt;
	&lt;li&gt;2016-07-20 08:14:57  - MDT0000 mount command started on secondary node (lola-9)&lt;/li&gt;
	&lt;li&gt;2016-07-21 01:46       - mount command had not finished on lola-9; node was rebooted after creating a stack trace via sysrq-trigger&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The messages inside the debug log files look different from those of the previous event, so this may be a new bug.&lt;/p&gt;

&lt;p&gt;Attached files:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;messages, console, debug log files of MDS &lt;tt&gt;lola-9&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="170866" author="gerrit" created="Tue, 25 Oct 2016 02:23:04 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/13726/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13726/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8089&quot; title=&quot;MGT/MDT mount fails on secondary HA node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8089&quot;&gt;&lt;del&gt;LU-8089&lt;/del&gt;&lt;/a&gt; lwp: change lwp export only at first connect&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 00ffad7af3bc7295797efd21eb16e7deaa715f45&lt;/p&gt;</comment>
                            <comment id="170895" author="pjones" created="Tue, 25 Oct 2016 04:10:56 +0000"  >&lt;p&gt;Landed for 2.9&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="21872">LU-4214</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="21349" name="console-lola-9.log-20160501.bz2" size="184863" author="heckes" created="Mon, 2 May 2016 09:36:19 +0000"/>
                            <attachment id="22309" name="console-lola-9.log.20170720.bz2" size="302590" author="heckes" created="Thu, 21 Jul 2016 15:41:35 +0000"/>
                            <attachment id="21348" name="console-lola-9.log.bz2" size="22213" author="heckes" created="Mon, 2 May 2016 09:36:18 +0000"/>
                            <attachment id="22307" name="lustre-log-lola-9-20160720_0842-mount-mgs-hangs.bz2" size="36104" author="heckes" created="Thu, 21 Jul 2016 15:41:35 +0000"/>
                            <attachment id="22310" name="lustre-log-lola-9-20160721_0146-mount-mgs-hangs.bz2" size="983352" author="heckes" created="Thu, 21 Jul 2016 15:41:35 +0000"/>
                            <attachment id="21357" name="lustre-log.1461964594.5065.bz2" size="342392" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21358" name="lustre-log.1461964707.7204.bz2" size="95864" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21353" name="lustre-log.2016-05-02-0040.bz2" size="20614" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21354" name="lustre-log.2016-05-02-0049.bz2" size="637510" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21355" name="lustre-log.2016-05-02-0105.bz2" size="8420" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21356" name="lustre-log.2016-05-02-0109.bz2" size="740496" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21352" name="lustre-log.2016-05-02.0044.bz2" size="1352237" author="heckes" created="Mon, 2 May 2016 09:39:24 +0000"/>
                            <attachment id="21351" name="messages-lola-9.log-20160501.bz2" size="644039" author="heckes" created="Mon, 2 May 2016 09:36:19 +0000"/>
                            <attachment id="22308" name="messages-lola-9.log.20170720.bz2" size="283863" author="heckes" created="Thu, 21 Jul 2016 15:41:35 +0000"/>
                            <attachment id="21350" name="messages-lola-9.log.bz2" size="241050" author="heckes" created="Mon, 2 May 2016 09:36:19 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzy9v3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>