<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:12:14 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-970] Invalid Import messages</title>
                <link>https://jira.whamcloud.com/browse/LU-970</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We receive many messages like:&lt;/p&gt;

&lt;p&gt;Jan  8 04:21:55 osiride-lp-030 kernel: LustreError: 11463:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@ffff810a722ccc00 x1388786345868037/t0 o101-&amp;gt;MGS@MGC10.121.13.31@tcp_0:26/25 lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0&lt;br/&gt;
Jan  8 04:21:55 osiride-lp-030 kernel: LustreError: 11463:0:(client.c:858:ptlrpc_import_delay_req()) Skipped 179 previous similar messages&lt;br/&gt;
Jan  8 04:22:38 osiride-lp-030 kernel: Lustre: 6743:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1388786345868061 sent from MGC10.121.13.31@tcp to NID 0@lo 5s ago has timed out (5s prior to deadline).&lt;br/&gt;
Jan  8 04:22:38 osiride-lp-030 kernel:   req@ffff810256529800 x1388786345868061/t0 o250-&amp;gt;MGS@MGC10.121.13.31@tcp_0:26/25 lens 368/584 e 0 to 1 dl 1325992958 ref 1 fl Rpc:N/0/0 rc 0/0&lt;br/&gt;
Jan  8 04:22:38 osiride-lp-030 kernel: Lustre: 6743:0:(client.c:1487:ptlrpc_expire_one_request()) Skipped 100 previous similar messages&lt;/p&gt;


&lt;p&gt;I have attached the &quot;messages&quot; of the MDS/MGS server.&lt;/p&gt;

&lt;p&gt;Can you explain the meaning of these messages and how we could fix them?&lt;/p&gt;
</description>
                <environment>Lustre version: 1.8.5.54-20110316022453-PRISTINE-2.6.18-194.17.1.el5_lustre.20110315140510&lt;br/&gt;
lctl version: 1.8.5.54-20110316022453-PRISTINE-2.6.18-194.17.1.el5_lustre.20110315140510&lt;br/&gt;
Red Hat Enterprise Linux Server release 5.4 (Tikanga)&lt;br/&gt;
auth type over ldap and kerberos&lt;br/&gt;
quota enabled only for group on lustre fs&lt;br/&gt;
</environment>
        <key id="12835">LU-970</key>
            <summary>Invalid Import messages</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="lustre.support">Supporto Lustre Jnet2000</reporter>
                        <labels>
                            <label>log</label>
                            <label>server</label>
                    </labels>
                <created>Mon, 9 Jan 2012 05:23:05 +0000</created>
                <updated>Thu, 2 Feb 2012 13:18:06 +0000</updated>
                            <resolved>Thu, 2 Feb 2012 13:18:06 +0000</resolved>
                                    <version>Lustre 1.8.x (1.8.0 - 1.8.5)</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
<comment id="26189" author="johann" created="Mon, 9 Jan 2012 10:06:22 +0000"  >&lt;p&gt;This means that the MDT somehow cannot reach the MGS, which is supposed to run locally.&lt;br/&gt;
Could you please run the following commands on this server and attach the output to this ticket?&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;lctl dl&lt;/li&gt;
	&lt;li&gt;lctl get_param mgc.*.import&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Also, you mentioned that you are running &quot;1.8.5.54-20110316022453&quot;. Do I understand correctly that you are running a beta version of Oracle&apos;s 1.8.6, which isn&apos;t intended to be used in production? If so, I would really advise upgrading to 1.8.7-wc1. &lt;/p&gt;</comment>
                            <comment id="26195" author="pjones" created="Mon, 9 Jan 2012 10:15:38 +0000"  >&lt;p&gt;Bobi&lt;/p&gt;

&lt;p&gt;Could you please take care of this ticket?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="26204" author="lustre.support" created="Mon, 9 Jan 2012 11:17:08 +0000"  >&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 wisi281&amp;#93;&lt;/span&gt;# lctl dl&lt;br/&gt;
  0 UP mgc MGC10.121.13.31@tcp 326e50f4-053e-14d7-29f8-10a8ae98140d 5&lt;br/&gt;
  1 UP ost OSS OSS_uuid 3&lt;br/&gt;
  2 UP obdfilter home-OST0003 home-OST0003_UUID 43&lt;br/&gt;
  3 UP mgs MGS MGS 43&lt;br/&gt;
  4 UP mgc MGC10.121.13.62@tcp 8ac4b17d-d00e-1d89-9281-3d1615a38949 5&lt;br/&gt;
  5 UP mdt MDS MDS_uuid 3&lt;br/&gt;
  6 UP lov home-mdtlov home-mdtlov_UUID 4&lt;br/&gt;
  7 UP mds home-MDT0000 home-MDT0000_UUID 41&lt;br/&gt;
  8 UP osc home-OST0000-osc home-mdtlov_UUID 5&lt;br/&gt;
  9 UP osc home-OST0001-osc home-mdtlov_UUID 5&lt;br/&gt;
 10 UP osc home-OST0002-osc home-mdtlov_UUID 5&lt;br/&gt;
 11 UP osc home-OST0003-osc home-mdtlov_UUID 5&lt;br/&gt;
 12 UP osc home-OST0004-osc home-mdtlov_UUID 5&lt;br/&gt;
 13 UP osc home-OST0005-osc home-mdtlov_UUID 5&lt;br/&gt;
 14 UP osc home-OST0006-osc home-mdtlov_UUID 5&lt;br/&gt;
 15 UP osc home-OST0007-osc home-mdtlov_UUID 5&lt;br/&gt;
 16 UP osc home-OST0008-osc home-mdtlov_UUID 5&lt;br/&gt;
 17 UP osc home-OST0009-osc home-mdtlov_UUID 5&lt;br/&gt;
 18 UP osc home-OST000a-osc home-mdtlov_UUID 5&lt;br/&gt;
 19 UP osc home-OST000b-osc home-mdtlov_UUID 5&lt;br/&gt;
 20 UP obdfilter home-OST0000 home-OST0000_UUID 43&lt;br/&gt;
 21 UP obdfilter home-OST0001 home-OST0001_UUID 43&lt;br/&gt;
 22 UP obdfilter home-OST0002 home-OST0002_UUID 43&lt;br/&gt;
 23 UP obdfilter home-OST0005 home-OST0005_UUID 43&lt;br/&gt;
 24 UP obdfilter home-OST000a home-OST000a_UUID 43&lt;br/&gt;
 25 UP obdfilter home-OST0008 home-OST0008_UUID 43&lt;br/&gt;
 26 UP obdfilter home-OST0006 home-OST0006_UUID 43&lt;br/&gt;
 27 UP obdfilter home-OST000b home-OST000b_UUID 43&lt;br/&gt;
 28 UP obdfilter home-OST0009 home-OST0009_UUID 43&lt;br/&gt;
 29 UP obdfilter home-OST0004 home-OST0004_UUID 43&lt;br/&gt;
 30 UP obdfilter home-OST0007 home-OST0007_UUID 43&lt;/p&gt;</comment>
                            <comment id="26205" author="lustre.support" created="Mon, 9 Jan 2012 11:17:47 +0000"  >&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 wisi281&amp;#93;&lt;/span&gt;# lctl get_param mgc.*.import&lt;br/&gt;
mgc.MGC10.121.13.31@tcp.import=&lt;br/&gt;
import:&lt;br/&gt;
    name: MGC10.121.13.31@tcp&lt;br/&gt;
    target: MGS&lt;br/&gt;
    state: CONNECTING&lt;br/&gt;
    connect_flags: &lt;span class=&quot;error&quot;&gt;&amp;#91;version, adaptive_timeouts, fid_is_enabled&amp;#93;&lt;/span&gt;&lt;br/&gt;
    import_flags: [ no_recov, invalid, replayable, pingable, recon_bk,&lt;br/&gt;
last_recon]&lt;br/&gt;
    connection:&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       current_connection: 10.121.13.31@tcp&lt;br/&gt;
       connection_attempts: 239371&lt;br/&gt;
       generation: 478755&lt;br/&gt;
       in-progress_invalidations: 0&lt;br/&gt;
    rpcs:&lt;br/&gt;
       inflight: 1&lt;br/&gt;
       unregistering: 0&lt;br/&gt;
       timeouts: 239370&lt;br/&gt;
       avg_waittime: 0 &amp;lt;NULL&amp;gt;&lt;br/&gt;
    service_estimates:&lt;br/&gt;
       services: 1 sec&lt;br/&gt;
       network: 1 sec&lt;br/&gt;
    transactions:&lt;br/&gt;
       last_replay: 0&lt;br/&gt;
       peer_committed: 0&lt;br/&gt;
       last_checked: 0&lt;br/&gt;
mgc.MGC10.121.13.62@tcp.import=&lt;br/&gt;
import:&lt;br/&gt;
    name: MGC10.121.13.62@tcp&lt;br/&gt;
    target: MGS&lt;br/&gt;
    state: FULL&lt;br/&gt;
    connect_flags: &lt;span class=&quot;error&quot;&gt;&amp;#91;version, adaptive_timeouts&amp;#93;&lt;/span&gt;&lt;br/&gt;
    import_flags: &lt;span class=&quot;error&quot;&gt;&amp;#91;pingable, recon_bk&amp;#93;&lt;/span&gt;&lt;br/&gt;
    connection:&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;0@lo&amp;#93;&lt;/span&gt;&lt;br/&gt;
       current_connection: 0@lo&lt;br/&gt;
       connection_attempts: 1&lt;br/&gt;
       generation: 1&lt;br/&gt;
       in-progress_invalidations: 0&lt;br/&gt;
    rpcs:&lt;br/&gt;
       inflight: 0&lt;br/&gt;
       unregistering: 0&lt;br/&gt;
       timeouts: 0&lt;br/&gt;
       avg_waittime: 0 &amp;lt;NULL&amp;gt;&lt;br/&gt;
    service_estimates:&lt;br/&gt;
       services: 1 sec&lt;br/&gt;
       network: 1 sec&lt;br/&gt;
    transactions:&lt;br/&gt;
       last_replay: 0&lt;br/&gt;
       peer_committed: 0&lt;br/&gt;
       last_checked: 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 wisi281&amp;#93;&lt;/span&gt;#&lt;/p&gt;
</comment>
                            <comment id="26231" author="lustre.support" created="Tue, 10 Jan 2012 06:45:52 +0000"  >&lt;p&gt;Hi,&lt;br/&gt;
other information on our setup. We have two lustre servers: &lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;osiride-lp-030 -&amp;gt;10.121.13.31&lt;/li&gt;
	&lt;li&gt;osiride-lp-031 -&amp;gt;10.121.13.62&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The first server hosts these services:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;MGS&lt;/li&gt;
	&lt;li&gt;MDS&lt;/li&gt;
	&lt;li&gt;OST00&lt;/li&gt;
	&lt;li&gt;OST01&lt;/li&gt;
	&lt;li&gt;OST02&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The second server hosts services OST03 to OST0b.&lt;/p&gt;

&lt;p&gt;We have a dedicated 10GbE network using Broadcom Corporation NetXtreme II BCM57711E 10-Gigabit PCIe adapters.&lt;/p&gt;

&lt;p&gt;We have a Red Hat Cluster Suite cluster to provide High Availability.&lt;/p&gt;

&lt;p&gt;The output of &quot;lctl dl&quot; and &quot;lctl get_param mgc.*.import&quot; was taken after a failover, when all the Lustre services were hosted on the osiride-lp-031 server. We see the same messages on osiride-lp-031. I have attached the &quot;messages&quot; of osiride-lp-031 from before and after the failover.&lt;/p&gt;

&lt;p&gt;Thanks in advance&lt;/p&gt;

</comment>
                            <comment id="26232" author="lustre.support" created="Tue, 10 Jan 2012 06:46:52 +0000"  >&lt;p&gt;messages of osiride-lp-031&lt;/p&gt;</comment>
                            <comment id="26233" author="lustre.support" created="Tue, 10 Jan 2012 06:53:06 +0000"  >&lt;p&gt;We are planning to upgrade to the latest GA version of Lustre at the end of January.&lt;/p&gt;</comment>
                            <comment id="26241" author="lustre.support" created="Tue, 10 Jan 2012 09:38:45 +0000"  >&lt;p&gt;Hi,&lt;br/&gt;
Is it normal to have two mgc entries on the same server for one exported Lustre file system?&lt;/p&gt;

&lt;p&gt;&amp;gt;&amp;gt; 0 UP mgc MGC10.121.13.31@tcp 326e50f4-053e-14d7-29f8-10a8ae98140d 5&lt;br/&gt;
&amp;gt;&amp;gt; 4 UP mgc MGC10.121.13.62@tcp 8ac4b17d-d00e-1d89-9281-3d1615a38949 5&lt;/p&gt;

&lt;p&gt;Thanks in advance for your support&lt;/p&gt;</comment>
                            <comment id="26248" author="johann" created="Tue, 10 Jan 2012 10:18:30 +0000"  >&lt;p&gt;This indeed looks weird. Could you please run the following commands?&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&quot;tunefs.lustre --print $dev&quot; against all OSTs &amp;amp; MDT devices&lt;/li&gt;
	&lt;li&gt;&quot;mount&quot;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="26250" author="bobijam" created="Tue, 10 Jan 2012 10:45:39 +0000"  >&lt;p&gt;Could be the MDT device or some OST devices being mkfs.lustre-ed with wrong &quot;--mgsnode&quot; argument.&lt;/p&gt;</comment>
                            <comment id="26307" author="lustre.support" created="Tue, 10 Jan 2012 17:56:39 +0000"  >&lt;p&gt;Hi Zhenyu,&lt;br/&gt;
I&apos;m 100% sure that there is no mkfs.lustre mistake.&lt;/p&gt;

&lt;p&gt;Hi Johann,&lt;br/&gt;
this is the output of &quot;mount&quot; on osiride-lp-031:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 ~&amp;#93;&lt;/span&gt;# mount&lt;br/&gt;
/dev/mapper/vg_lp-lv_root on / type ext3 (rw)&lt;br/&gt;
proc on /proc type proc (rw)&lt;br/&gt;
sysfs on /sys type sysfs (rw)&lt;br/&gt;
devpts on /dev/pts type devpts (rw,gid=5,mode=620)&lt;br/&gt;
/dev/mapper/vg_lp-lv_tmp on /tmp type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_var on /var type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_vartmp on /var/tmp type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_varwww on /var/www type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_home on /home type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_varlibxen on /var/lib/xen type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_varlog on /var/log type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_varlibmysql on /var/lib/mysql type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_tmp_work on /tmp/work type ext3 (rw)&lt;br/&gt;
/dev/mapper/vg_lp-lv_opt on /opt type ext3 (rw)&lt;br/&gt;
/dev/cciss/c0d0p1 on /boot type ext3 (rw)&lt;br/&gt;
tmpfs on /dev/shm type tmpfs (rw)&lt;br/&gt;
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)&lt;br/&gt;
none on /sys/kernel/config type configfs (rw)&lt;br/&gt;
/dev/mpath/mgsp1 on /lustre/mgs type lustre (rw)&lt;br/&gt;
/dev/mpath/mdtp1 on /lustre/mdt type lustre (rw,acl)&lt;br/&gt;
/dev/mpath/ost00p1 on /lustre/ost00 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost01p1 on /lustre/ost01 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost02p1 on /lustre/ost02 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost03p1 on /lustre/ost03 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost05p1 on /lustre/ost05 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost10p1 on /lustre/ost10 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost08p1 on /lustre/ost08 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost06p1 on /lustre/ost06 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost11p1 on /lustre/ost11 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost09p1 on /lustre/ost09 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost04p1 on /lustre/ost04 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost07p1 on /lustre/ost07 type lustre (rw)&lt;/p&gt;


&lt;p&gt;I&apos;m not able to capture the output of tunefs.lustre because of this problem:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 ~&amp;#93;&lt;/span&gt;# tunefs.lustre --print /dev/mpath/ost07p1&lt;br/&gt;
checking for existing Lustre data: not found&lt;/p&gt;

&lt;p&gt;tunefs.lustre FATAL: Device /dev/mpath/ost07p1 has not been formatted with&lt;br/&gt;
mkfs.lustre&lt;br/&gt;
tunefs.lustre: exiting with 19 (No such device)&lt;/p&gt;


&lt;p&gt;I tried on /dev/dm-31, which is the real block device, but I receive the same error. &lt;/p&gt;</comment>
<comment id="26308" author="lustre.support" created="Tue, 10 Jan 2012 18:07:08 +0000"  >&lt;p&gt;Do you think that having these two entries in the device list table is the cause of the errors that I receive in the &quot;messages&quot;?&lt;br/&gt;
Is it possible that something went wrong with the failover?&lt;br/&gt;
If I restart the Lustre platform, would that fix this problem?&lt;/p&gt;

&lt;p&gt;Thanks in advance &lt;/p&gt;</comment>
                            <comment id="26321" author="bobijam" created="Tue, 10 Jan 2012 21:51:54 +0000"  >&lt;p&gt;Please umount /dev/mpath/ost07p1 and mount it as &apos;ldiskfs&apos; type, and upload its &quot;CONFIGS/mountdata&quot; file here.&lt;/p&gt;

&lt;p&gt;And check whether the &quot;Invalid Import&quot; messages persist while ost07 is &quot;offline&quot; from the filesystem.&lt;/p&gt;</comment>
<comment id="26323" author="lustre.support" created="Tue, 10 Jan 2012 22:48:53 +0000"  >&lt;p&gt;Sorry Zhenyu, but I receive the tunefs.lustre error on all of Lustre&apos;s block devices:&lt;/p&gt;

&lt;p&gt;/dev/mpath/mgsp1 on /lustre/mgs type lustre (rw)&lt;br/&gt;
/dev/mpath/mdtp1 on /lustre/mdt type lustre (rw,acl)&lt;br/&gt;
/dev/mpath/ost00p1 on /lustre/ost00 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost01p1 on /lustre/ost01 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost02p1 on /lustre/ost02 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost03p1 on /lustre/ost03 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost05p1 on /lustre/ost05 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost10p1 on /lustre/ost10 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost08p1 on /lustre/ost08 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost06p1 on /lustre/ost06 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost11p1 on /lustre/ost11 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost09p1 on /lustre/ost09 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost04p1 on /lustre/ost04 type lustre (rw)&lt;br/&gt;
/dev/mpath/ost07p1 on /lustre/ost07 type lustre (rw)&lt;/p&gt;

&lt;p&gt;Should I use tunefs.lustre on the real SCSI disk devices and not on the multipathed block devices?&lt;/p&gt;</comment>
<comment id="26330" author="bobijam" created="Wed, 11 Jan 2012 01:40:45 +0000"  >&lt;p&gt;Then try to run tunefs.lustre on the real SCSI disk device.&lt;/p&gt;

&lt;p&gt;Or use &lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;debugfs -R &lt;span class=&quot;code-quote&quot;&gt;&quot;dump CONFIGS/mountdata /tmp/mountdata&quot;&lt;/span&gt; /dev/mpath/ost07p1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;to dump the file and upload here.&lt;/p&gt;</comment>
<comment id="26345" author="lustre.support" created="Wed, 11 Jan 2012 06:53:15 +0000"  >&lt;p&gt;oops &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/wink.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;debugfs -R &quot;dump CONFIGS/mountdata /tmp/mountdata-ost07&quot; /dev/mpath/ost07p1&lt;/p&gt;

&lt;p&gt;debugfs 1.41.10.sun2 (24-Feb-2010)&lt;br/&gt;
/dev/mpath/ost07p1: MMP: device currently active while opening filesystem&lt;br/&gt;
dump: Filesystem not open&lt;/p&gt;</comment>
                            <comment id="26362" author="johann" created="Wed, 11 Jan 2012 11:13:02 +0000"  >&lt;p&gt;This problem (i.e. debugfs cannot open the filesystem due to MMP) has been fixed in recent e2fsprogs. Could you please update the e2fsprogs package and rerun the tunefs.lustre command?&lt;br/&gt;
The latest one is e2fsprogs-1.41.90.wc3 and can be downloaded here: &lt;a href=&quot;http://downloads.whamcloud.com/public/e2fsprogs/latest&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://downloads.whamcloud.com/public/e2fsprogs/latest&lt;/a&gt;&lt;br/&gt;
This package can be updated while lustre is running.&lt;/p&gt;</comment>
                            <comment id="26476" author="lustre.support" created="Fri, 13 Jan 2012 05:24:18 +0000"  >&lt;p&gt;Thanks Johann,&lt;br/&gt;
we are waiting for authorization from the end user to upgrade e2fsprogs.&lt;/p&gt;</comment>
<comment id="26479" author="lustre.support" created="Fri, 13 Jan 2012 09:07:07 +0000"  >&lt;p&gt;The end user agreed to upgrade e2fsprogs, but we should start with the test environment. I&apos;m planning to give you the configuration information on 17th January.&lt;/p&gt;

&lt;p&gt;See you soon!!!&lt;/p&gt;</comment>
<comment id="26705" author="lustre.support" created="Tue, 17 Jan 2012 08:47:54 +0000"  >&lt;p&gt;The end user asked us to wait another 2 days to upgrade the e2fsprogs tools in production. &lt;/p&gt;

&lt;p&gt;Thanks in advance&lt;/p&gt;</comment>
                            <comment id="26790" author="lustre.support" created="Wed, 18 Jan 2012 08:28:37 +0000"  >&lt;p&gt;dumps&lt;/p&gt;</comment>
<comment id="26791" author="lustre.support" created="Wed, 18 Jan 2012 08:29:56 +0000"  >&lt;p&gt;Ok, we upgraded e2fsprogs and made the dumps of the configuration. I have attached them.&lt;/p&gt;

&lt;p&gt;thanks in advance&lt;/p&gt;</comment>
                            <comment id="26799" author="bobijam" created="Wed, 18 Jan 2012 10:45:51 +0000"  >&lt;ol&gt;
	&lt;li&gt;strings mgs&lt;br/&gt;
lustre&lt;br/&gt;
acl,iopen_nopriv,user_xattr,errors=remount-ro&lt;br/&gt;
 failover.node=10.121.13.62@tcp&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;#strings mdt &lt;br/&gt;
home&lt;br/&gt;
home-MDT0000&lt;br/&gt;
...&lt;/p&gt;

&lt;p&gt;#strings ost*&lt;br/&gt;
home&lt;br/&gt;
home-OST000x  ==&amp;gt; x from 0 to b&lt;br/&gt;
...&lt;/p&gt;

&lt;p&gt;You&apos;ve formatted your devices with an inconsistent fsname. This could have happened if you formatted the mgs device without specifying the &quot;&amp;#45;&amp;#45;fsname&quot; argument (for which &quot;lustre&quot; is the default value), while you formatted the other devices with &quot;&amp;#45;&amp;#45;fsname=home&quot;.&lt;/p&gt;
</comment>
<comment id="26802" author="lustre.support" created="Wed, 18 Jan 2012 11:22:11 +0000"  >&lt;p&gt;Could this problem be the cause of the Lustre errors that we see in the messages?&lt;br/&gt;
How can we fix this problem?&lt;/p&gt;</comment>
<comment id="26804" author="bobijam" created="Wed, 18 Jan 2012 11:30:47 +0000"  >&lt;p&gt;Yes, this could cause the error messages.&lt;/p&gt;

&lt;p&gt;You need to run &quot;tunefs.lustre &amp;#45;&amp;#45;mgs &amp;#45;&amp;#45;fsname=home &amp;lt;other options&amp;gt; &amp;lt;mgs device&amp;gt;&quot; and remount it, or even all other devices as well. &lt;/p&gt;</comment>
<comment id="26805" author="johann" created="Wed, 18 Jan 2012 11:37:24 +0000"  >&lt;p&gt;Please don&apos;t run this command for now, until we can look at the output of &quot;tunefs.lustre --print&quot;.&lt;br/&gt;
Any chance you could attach the output of this command against all Lustre targets (MGS/MDT/OST devices)?&lt;/p&gt;

&lt;p&gt;Thanks in advance. &lt;/p&gt;</comment>
                            <comment id="26871" author="lustre.support" created="Thu, 19 Jan 2012 08:46:15 +0000"  >&lt;p&gt;the tunefs output&lt;/p&gt;</comment>
                            <comment id="26874" author="johann" created="Thu, 19 Jan 2012 09:17:39 +0000"  >&lt;p&gt;I am afraid that you forgot to specify the failover mgsnode when formatting the OSTs &amp;amp; MDT:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;   Read previous values:
Target:     home-OST0000
Index:      0
Lustre FS:  home
Mount type: ldiskfs
Flags:      0x2
              (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.121.13.31@tcp failover.node=10.121.13.62@tcp ost.quota_type=ug
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The OSTs and MDT should be given the full list of NIDs where the MGS can run. In your case, this is both 10.121.13.31@tcp and 10.121.13.62@tcp. This explains why the targets cannot reach the MGS when the latter runs on 10.121.13.62@tcp. That&apos;s the root cause of the error messages you see.&lt;/p&gt;

&lt;p&gt;To fix this, you would have to do the following procedure for each OST and the MDT:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;stop the target (via unmount or the HA agent)&lt;/li&gt;
	&lt;li&gt;run &quot;tunefs.lustre --mgsnode=10.121.13.62@tcp ${path/to/device}&quot;&lt;/li&gt;
	&lt;li&gt;restart the target (with mount or HA agent)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I also noticed that some OSTs have &quot;failover.node=10.121.13.62@tcp&quot; while some others have &quot;10.121.13.31@tcp&quot;.&lt;br/&gt;
To make sure that there is no problem with the OST/MDT failover configuration, could you please run the following command on one lustre client? &quot;lctl get_param {mdc,osc}.*.import&quot;&lt;/p&gt;</comment>
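The per-target procedure johann outlines above (stop, tunefs.lustre --mgsnode, restart) could be sketched as a shell loop. This is only an illustrative sketch, not a tested or endorsed procedure: the device paths and mount-point naming are taken from the &quot;mount&quot; output earlier in this ticket, and in this cluster the Red Hat Cluster Suite HA agent would normally handle the stop/start steps.

```shell
# Sketch of johann's fix: add the failover MGS NID to every MDT/OST target.
# Paths assumed from this ticket's "mount" output; adjust for your cluster.
# Run on the server currently hosting each target, one target at a time.
for dev in /dev/mpath/mdtp1 /dev/mpath/ost0[0-9]p1 /dev/mpath/ost1[01]p1; do
    mnt="/lustre/$(basename "$dev" p1)"   # e.g. /dev/mpath/ost07p1 -> /lustre/ost07
    umount "$mnt"                                      # stop the target (or use the HA agent)
    tunefs.lustre --mgsnode=10.121.13.62@tcp "$dev"    # append the second MGS NID
    mount -t lustre "$dev" "$mnt"                      # restart the target
done
```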
<comment id="26877" author="lustre.support" created="Thu, 19 Jan 2012 10:36:00 +0000"  >&lt;p&gt;the lctl get_param {mdc,osc}.*.import output&lt;/p&gt;</comment>
                            <comment id="27079" author="johann" created="Fri, 20 Jan 2012 03:33:19 +0000"  >&lt;ol&gt;
	&lt;li&gt;grep failover_nids lustre.info&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;br/&gt;
       failover_nids: &lt;span class=&quot;error&quot;&gt;&amp;#91;10.121.13.62@tcp, 10.121.13.31@tcp&amp;#93;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The MDT/OST failover config looks fine, so you just have to fix the mgsnode issue as mentioned above.&lt;/p&gt;</comment>
                            <comment id="27083" author="lustre.support" created="Fri, 20 Jan 2012 03:43:57 +0000"  >&lt;p&gt;Johann, Should I fix the MGS configuration too?&lt;/p&gt;</comment>
<comment id="27085" author="bobijam" created="Fri, 20 Jan 2012 04:03:01 +0000"  >&lt;p&gt;No, there is no need to set an fs name on a separate MGT, which can handle multiple filesystems at once. My fault for mentioning the incorrect info in my comment on 18/Jan/12 10:45 AM.&lt;/p&gt;</comment>
                            <comment id="27086" author="lustre.support" created="Fri, 20 Jan 2012 05:54:38 +0000"  >&lt;p&gt;Hi Johann and Zhenyu, the normal configuration is:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;MGT, MDT, OST0000, OST0001, OST0002 are owned by 10.121.13.31@tcp and the failover node is 10.121.13.62@tcp&lt;/li&gt;
	&lt;li&gt;OST0003 -&amp;gt; OST000b are owned by 10.121.13.62@tcp and the failover node is 10.121.13.31@tcp&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;How should I change the configuration according to this setup to avoid the Lustre errors?&lt;/p&gt;

&lt;p&gt;When we took the tunefs output and the dumpfs output, we were in a failed-over situation, because all the targets were mounted on 10.121.13.62@tcp.&lt;/p&gt;

&lt;p&gt;We had the Lustre errors before and after the shutdown of the 10.121.13.31@tcp node, as you can see in the messages.&lt;/p&gt;


&lt;p&gt;Thanks in advance&lt;/p&gt;</comment>
                            <comment id="27087" author="johann" created="Fri, 20 Jan 2012 07:18:33 +0000"  >&lt;p&gt;&amp;gt; How should I change the configuration according this setup and to avoid the Lustre errors?&lt;/p&gt;

&lt;p&gt;There is no need to change the configuration. Please just follow the procedure I detailed in my comment on 19/Jan/12 9:17 AM, and the error messages will be gone.&lt;/p&gt;</comment>
<comment id="27096" author="lustre.support" created="Fri, 20 Jan 2012 10:16:15 +0000"  >&lt;p&gt;So when we start the node 10.121.13.31@tcp and rebalance the services, will the Lustre errors be gone? But why did we see the Lustre errors before the failover of the 10.121.13.31@tcp node?&lt;/p&gt;

&lt;p&gt;thanks in advance&lt;/p&gt;</comment>
<comment id="27103" author="johann" created="Fri, 20 Jan 2012 10:42:07 +0000"  >&lt;p&gt;I&apos;m afraid that we do not have enough logs from this incident to find out why the MGS wasn&apos;t responsive at the time.&lt;br/&gt;
I would suggest fixing the mgsnode configuration error; then we can revisit this problem if it happens again.&lt;/p&gt;</comment>
                            <comment id="27551" author="lustre.support" created="Sat, 28 Jan 2012 04:38:42 +0000"  >&lt;p&gt;Ok,&lt;br/&gt;
we have rebalanced the services on osiride-lp030 and osiride-lp031 and restarted all the clients. There are no more Lustre errors, and this is the output of the lctl dl command on both servers.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-030 ~&amp;#93;&lt;/span&gt;# lctl dl&lt;br/&gt;
  0 UP mgs MGS MGS 45&lt;br/&gt;
  1 UP mgc MGC10.121.13.31@tcp 5c2ce5e0-645a-2b58-6c0d-c5a9a11671f5 5&lt;br/&gt;
  2 UP ost OSS OSS_uuid 3&lt;br/&gt;
  3 UP obdfilter home-OST0001 home-OST0001_UUID 43&lt;br/&gt;
  4 UP obdfilter home-OST0002 home-OST0002_UUID 43&lt;br/&gt;
  5 UP obdfilter home-OST0000 home-OST0000_UUID 43&lt;br/&gt;
  6 UP mdt MDS MDS_uuid 3&lt;br/&gt;
  7 UP lov home-mdtlov home-mdtlov_UUID 4&lt;br/&gt;
  8 UP mds home-MDT0000 home-MDT0000_UUID 41&lt;br/&gt;
  9 UP osc home-OST0000-osc home-mdtlov_UUID 5&lt;br/&gt;
 10 UP osc home-OST0001-osc home-mdtlov_UUID 5&lt;br/&gt;
 11 UP osc home-OST0002-osc home-mdtlov_UUID 5&lt;br/&gt;
 12 UP osc home-OST0003-osc home-mdtlov_UUID 5&lt;br/&gt;
 13 UP osc home-OST0004-osc home-mdtlov_UUID 5&lt;br/&gt;
 14 UP osc home-OST0005-osc home-mdtlov_UUID 5&lt;br/&gt;
 15 UP osc home-OST0006-osc home-mdtlov_UUID 5&lt;br/&gt;
 16 UP osc home-OST0007-osc home-mdtlov_UUID 5&lt;br/&gt;
 17 UP osc home-OST0008-osc home-mdtlov_UUID 5&lt;br/&gt;
 18 UP osc home-OST0009-osc home-mdtlov_UUID 5&lt;br/&gt;
 19 UP osc home-OST000a-osc home-mdtlov_UUID 5&lt;br/&gt;
 20 UP osc home-OST000b-osc home-mdtlov_UUID 5&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;root@osiride-lp-031 ~&amp;#93;&lt;/span&gt;# lctl dl&lt;br/&gt;
  0 UP mgc MGC10.121.13.31@tcp e4919e7b-230b-9ce3-910d-3ec6e1bed6fc 5&lt;br/&gt;
  1 UP ost OSS OSS_uuid 3&lt;br/&gt;
  2 UP obdfilter home-OST0006 home-OST0006_UUID 43&lt;br/&gt;
  3 UP obdfilter home-OST0004 home-OST0004_UUID 43&lt;br/&gt;
  4 UP obdfilter home-OST0007 home-OST0007_UUID 43&lt;br/&gt;
  5 UP obdfilter home-OST0003 home-OST0003_UUID 43&lt;br/&gt;
  6 UP obdfilter home-OST0009 home-OST0009_UUID 43&lt;br/&gt;
  7 UP obdfilter home-OST0008 home-OST0008_UUID 43&lt;br/&gt;
  8 UP obdfilter home-OST0005 home-OST0005_UUID 43&lt;br/&gt;
  9 UP obdfilter home-OST000b home-OST000b_UUID 43&lt;br/&gt;
 10 UP obdfilter home-OST000a home-OST000a_UUID 43&lt;/p&gt;

&lt;p&gt;Could you please close the issue? Thanks in advance.&lt;/p&gt;</comment>
                            <comment id="27572" author="johann" created="Mon, 30 Jan 2012 02:44:01 +0000"  >&lt;p&gt;Cool. To be clear, you&apos;ve also fixed the MGS configuration with tunefs.lustre as explained in my comment on 19/Jan/12 9:17 AM, right?&lt;/p&gt;</comment>
                            <comment id="27583" author="lustre.support" created="Mon, 30 Jan 2012 09:27:31 +0000"  >&lt;p&gt;No, I have not. We are planning to upgrade to the latest stable version of Lustre; I will change the configuration during the upgrade.&lt;/p&gt;</comment>
                            <comment id="27775" author="lustre.support" created="Thu, 2 Feb 2012 13:14:18 +0000"  >&lt;p&gt;Please close this issue.&lt;/p&gt;</comment>
                            <comment id="27777" author="pjones" created="Thu, 2 Feb 2012 13:18:06 +0000"  >&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="10743" name="dump-18-01.tgz" size="1095" author="lustre.support" created="Wed, 18 Jan 2012 08:28:37 +0000"/>
                            <attachment id="10747" name="lustre.info" size="13890" author="lustre.support" created="Thu, 19 Jan 2012 10:36:00 +0000"/>
                            <attachment id="10725" name="messages-lp-030.bz2" size="12395" author="lustre.support" created="Mon, 9 Jan 2012 05:23:05 +0000"/>
                            <attachment id="10726" name="messages-lp-031.bz2" size="59192" author="lustre.support" created="Tue, 10 Jan 2012 06:46:52 +0000"/>
                            <attachment id="10746" name="tunefs.tgz" size="1034" author="lustre.support" created="Thu, 19 Jan 2012 08:46:15 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10040" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic</customfieldname>
                        <customfieldvalues>
                            <label>log</label>
                            <label>server</label>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvhk7:</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6494</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>