<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:12:00 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
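As a concrete illustration against this issue (the issue-xml view path shown is the standard JIRA pattern, but may vary by version):
https://jira.whamcloud.com/si/jira.issueviews:issue-xml/LU-7797/LU-7797.xml?field=key&field=summary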
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7797] Can&apos;t mount zpools after OSS restart</title>
                <link>https://jira.whamcloud.com/browse/LU-7797</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The error occurred during soak testing of build &apos;20160218&apos; (see: &lt;a href=&quot;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160218&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160218&lt;/a&gt;). DNE is enabled.&lt;br/&gt;
MDTs were formatted using &lt;em&gt;ldiskfs&lt;/em&gt;, OSTs using &lt;em&gt;zfs&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Sequence of events:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;2016-02-18 18:24:30,824:fsmgmt.fsmgmt:INFO     executing cmd pm -h powerman -c lola-5   (restart of OSS)&lt;/li&gt;
	&lt;li&gt;Boot process hung with several errors (see line &lt;em&gt;25105&lt;/em&gt; in console-lola-5.log, after timestamp &apos;Feb 18, 18:20:01&apos;)
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;  25105 WARNING: Pool &apos;soaked-ost11&apos; has encountered an uncorrectable I/O failure and has been suspended.
  25106 
  25107 INFO: task zpool:5003 blocked for more than 120 seconds.
  25108       Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.gf9ca359.x86_64 #1
  25109 &quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
  25110 zpool         D 0000000000000011     0  5003   4993 0x00000000
  25111  ffff880830f7bbe8 0000000000000086 0000000000000000 ffffffff81064a6e
  25112  ffff880830f7bba8 0000000000000019 0000000d6e7b4a08 0000000000000001
  25113  ffff880830f7bb68 00000000fffc4649 ffff8808317c5068 ffff880830f7bfd8
  25114 Call Trace:
  25115  [&amp;lt;ffffffff81064a6e&amp;gt;] ? try_to_wake_up+0x24e/0x3e0
  25116  [&amp;lt;ffffffffa02e178d&amp;gt;] cv_wait_common+0x11d/0x130 [spl]
  25117  [&amp;lt;ffffffff8109ec20&amp;gt;] ? autoremove_wake_function+0x0/0x40
  25118  [&amp;lt;ffffffffa02e17f5&amp;gt;] __cv_wait+0x15/0x20 [spl]
  25119  [&amp;lt;ffffffffa039884b&amp;gt;] txg_wait_synced+0x8b/0xd0 [zfs]
  25120  [&amp;lt;ffffffffa039038c&amp;gt;] spa_config_update+0xcc/0x120 [zfs]
  25121  [&amp;lt;ffffffffa038de8a&amp;gt;] spa_import+0x56a/0x730 [zfs]
  25122  [&amp;lt;ffffffffa02fe454&amp;gt;] ? nvlist_lookup_common+0x84/0xd0 [znvpair]
  25123  [&amp;lt;ffffffffa03c0134&amp;gt;] zfs_ioc_pool_import+0xe4/0x120 [zfs]
  25124  [&amp;lt;ffffffffa03c2955&amp;gt;] zfsdev_ioctl+0x495/0x4d0 [zfs]
  25125  [&amp;lt;ffffffff811a3ff2&amp;gt;] vfs_ioctl+0x22/0xa0
  25126  [&amp;lt;ffffffff811a4194&amp;gt;] do_vfs_ioctl+0x84/0x580
  25127  [&amp;lt;ffffffff81190101&amp;gt;] ? __fput+0x1a1/0x210
  25128  [&amp;lt;ffffffff811a4711&amp;gt;] sys_ioctl+0x81/0xa0
  25129  [&amp;lt;ffffffff8100b0d2&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/li&gt;
	&lt;li&gt;After power-cycling the node, the zpool &lt;tt&gt;soaked-ost3&lt;/tt&gt; fails to mount with the following error:
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 11505:0:(llog_obd.c:209:llog_setup()) MGC192.168.1.108@o2ib10: ctxt 0 lop_setup=ffffffffa06da310 failed: rc = -5
LustreError: 11505:0:(obd_mount_server.c:308:server_mgc_set_fs()) can&apos;t set_fs -5
LustreError: 11505:0:(obd_mount_server.c:1798:server_fill_super()) Unable to start targets: -5
LustreError: 11505:0:(obd_mount_server.c:1512:server_put_super()) no obd soaked-OST0003
LustreError: 11505:0:(obd_mount_server.c:140:server_deregister_mount()) soaked-OST0003 not registered
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The MGS is available and the IB fabric is operational.&lt;/p&gt;&lt;/li&gt;
	&lt;li&gt;Trying to mount zpool &lt;tt&gt;soaked-ost7&lt;/tt&gt; leads to a kernel panic:
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 11938:0:(obd_mount_server.c:140:server_deregister_mount()) soaked-OST0007 not registered
VERIFY3(0 == dmu_buf_hold_array(os, object, offset, size, 0, ((char *)__func__), &amp;amp;numbufs, &amp;amp;dbp)) failed (0 == 5)
PANIC at dmu.c:819:dmu_write()
Showing stack for process 9182
Pid: 9182, comm: txg_sync Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.gf9ca359.x86_64 #1
Call Trace:
 [&amp;lt;ffffffffa02df7cd&amp;gt;] ? spl_dumpstack+0x3d/0x40 [spl]
 [&amp;lt;ffffffffa02df9c2&amp;gt;] ? spl_panic+0xc2/0xe0 [spl]
 [&amp;lt;ffffffffa0349c51&amp;gt;] ? dmu_buf_hold_array_by_dnode+0x231/0x560 [zfs]
 [&amp;lt;ffffffffa035a8b4&amp;gt;] ? dnode_rele_and_unlock+0x64/0xb0 [zfs]
 [&amp;lt;ffffffffa035a943&amp;gt;] ? dnode_rele+0x43/0x50 [zfs]
 [&amp;lt;ffffffffa034a79b&amp;gt;] ? dmu_write+0x19b/0x1a0 [zfs]
 [&amp;lt;ffffffffa0342af2&amp;gt;] ? dmu_buf_will_dirty+0xb2/0x100 [zfs]
 [&amp;lt;ffffffffa0397421&amp;gt;] ? space_map_write+0x361/0x5f0 [zfs]
 [&amp;lt;ffffffffa037b01b&amp;gt;] ? metaslab_sync+0x11b/0x760 [zfs]
 [&amp;lt;ffffffffa0373cf4&amp;gt;] ? dsl_scan_sync+0x54/0xb80 [zfs]
 [&amp;lt;ffffffff8152b83e&amp;gt;] ? mutex_lock+0x1e/0x50
 [&amp;lt;ffffffffa039be3f&amp;gt;] ? vdev_sync+0x6f/0x140 [zfs]
 [&amp;lt;ffffffffa03839bb&amp;gt;] ? spa_sync+0x4bb/0xb90 [zfs]
 [&amp;lt;ffffffff81057849&amp;gt;] ? __wake_up_common+0x59/0x90
 [&amp;lt;ffffffff8105bd83&amp;gt;] ? __wake_up+0x53/0x70
 [&amp;lt;ffffffff81014a29&amp;gt;] ? read_tsc+0x9/0x20
 [&amp;lt;ffffffffa0399079&amp;gt;] ? txg_sync_thread+0x389/0x5f0 [zfs]
 [&amp;lt;ffffffffa0398cf0&amp;gt;] ? txg_sync_thread+0x0/0x5f0 [zfs]
 [&amp;lt;ffffffffa0398cf0&amp;gt;] ? txg_sync_thread+0x0/0x5f0 [zfs]
 [&amp;lt;ffffffffa02dcfb8&amp;gt;] ? thread_generic_wrapper+0x68/0x80 [spl]
 [&amp;lt;ffffffffa02dcf50&amp;gt;] ? thread_generic_wrapper+0x0/0x80 [spl]
 [&amp;lt;ffffffff8109e78e&amp;gt;] ? kthread+0x9e/0xc0
 [&amp;lt;ffffffff8100c28a&amp;gt;] ? child_rip+0xa/0x20
 [&amp;lt;ffffffff8109e6f0&amp;gt;] ? kthread+0x0/0xc0
 [&amp;lt;ffffffff8100c280&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Both OSTs were mounted and operational before, and both errors can be reproduced consistently.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Attached are the messages and console log files of &lt;tt&gt;lola-5&lt;/tt&gt;.&lt;/p&gt;</description>
                <environment>lola&lt;br/&gt;
build: 2.8.50-6-gf9ca359; commit f9ca359284357d145819beb08b316e932f7a3060</environment>
        <key id="34805">LU-7797</key>
            <summary>Can&apos;t mount zpools after OSS restart</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="heckes">Frank Heckes</reporter>
                        <labels>
                            <label>soak</label>
                    </labels>
                <created>Fri, 19 Feb 2016 15:40:26 +0000</created>
                <updated>Wed, 24 Feb 2016 20:45:31 +0000</updated>
                            <resolved>Wed, 24 Feb 2016 20:45:31 +0000</resolved>
                <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                    <comments>
                            <comment id="142970" author="heckes" created="Fri, 19 Feb 2016 15:51:13 +0000"  >&lt;p&gt;The states of the zpool are as follows:&lt;/p&gt;
&lt;h3&gt;&lt;a name=&quot;soakedost3&quot;&gt;&lt;/a&gt;soaked-ost3&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@lola-5 ~]# zpool status -v soaked-ost3
  pool: soaked-ost3
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        soaked-ost3             ONLINE       0     0     0
          raidz2-0              ONLINE       0     0     0
            lola-5_ost3_disk_0  ONLINE       0     0     0
            lola-5_ost3_disk_1  ONLINE       0     0     0
            lola-5_ost3_disk_2  ONLINE       0     0     0
            lola-5_ost3_disk_3  ONLINE       0     0     0
            lola-5_ost3_disk_4  ONLINE       0     0     0
            lola-5_ost3_disk_5  ONLINE       0     0     0
            lola-5_ost3_disk_6  ONLINE       0     0     0
            lola-5_ost3_disk_7  ONLINE       0     0     0
            lola-5_ost3_disk_8  ONLINE       0     0     0
            lola-5_ost3_disk_9  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        soaked-ost3/ost3:/oi.10
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;h3&gt;&lt;a name=&quot;soakedost7&quot;&gt;&lt;/a&gt;soaked-ost7&lt;/h3&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@lola-5 ~]# zpool status -v soaked-ost7
  pool: soaked-ost7
 state: ONLINE
  scan: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        soaked-ost7             ONLINE       0     0     0
          raidz2-0              ONLINE       0     0     0
            lola-5_ost7_disk_0  ONLINE       0     0     0
            lola-5_ost7_disk_1  ONLINE       0     0     0
            lola-5_ost7_disk_2  ONLINE       0     0     0
            lola-5_ost7_disk_3  ONLINE       0     0     0
            lola-5_ost7_disk_4  ONLINE       0     0     0
            lola-5_ost7_disk_5  ONLINE       0     0     0
            lola-5_ost7_disk_6  ONLINE       0     0     0
            lola-5_ost7_disk_7  ONLINE       0     0     0
            lola-5_ost7_disk_8  ONLINE       0     0     0
            lola-5_ost7_disk_9  ONLINE       0     0     0

errors: No known data errors
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="142972" author="heckes" created="Fri, 19 Feb 2016 15:55:49 +0000"  >&lt;p&gt;one remark: The called out zpool &apos;soaked-ost11&apos; (&apos;...has unrecoverable errros..&apos;) can be mounted and is operational after Lustre recovery completes.&lt;/p&gt;</comment>
                            <comment id="143007" author="pjones" created="Fri, 19 Feb 2016 18:41:42 +0000"  >&lt;p&gt;Alex&lt;/p&gt;

&lt;p&gt;Could you please look into how this could have occurred?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="143015" author="adilger" created="Fri, 19 Feb 2016 19:11:55 +0000"  >&lt;p&gt;Unfortunately, there is no OI scrub functionality for ZFS today, so it isn&apos;t possible to just delete the corrupted OI file and have OI Scrub rebuild it.  ZFS is not supposed to be corrupted during usage, and none of the APIs that Lustre is using to modify the filesystem should allow the pool to be corrupt.&lt;/p&gt;

&lt;p&gt;However, in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7798&quot; title=&quot;ll_prep_inode()) ASSERTION( fid_is_sane(&amp;amp;md.body-&amp;gt;mbo_fid1) ) failed:&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7798&quot;&gt;&lt;del&gt;LU-7798&lt;/del&gt;&lt;/a&gt; Frank wrote:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I forgot to mention that the OSS nodes were extended to operate in an active-active failover configuration&lt;br/&gt;
for disk resources by Feb 17th, 2016. So the failover partner node lola-4 can see all of node lola-5&apos;s disks and also has its ZFS pools imported.&lt;br/&gt;
There&apos;s no start-up (boot) wrapper script that prevents the (primary) zpools of the other node from being imported.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;ZFS is definitely vulnerable to corruption if it is mounted on multiple nodes at the same time, and there is currently no MMP feature, as there is for ldiskfs, that actively prevents it from being accessed by two nodes.  This kind of corruption would show up first for files that are being modified frequently (e.g. the OI file seen here) because each node will be assigning different blocks and updating the tree differently. There needs to be strict Linux HA control of the zpools so that they are not imported on the backup node unless they are failed over, and when failover happens there needs to be STONITH to turn off the primary node before the pool is imported on the backup to avoid concurrent access.  It is worthwhile to contact Gabriele or Zhiqi to see if there is a best-practices guide to installing ZFS in an HA failover configuration.&lt;/p&gt;
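&lt;p&gt;As a rough sketch of that arrangement (hypothetical resource, host, and credential names; assumes a Pacemaker cluster managed with pcs, the &lt;tt&gt;fence_ipmilan&lt;/tt&gt; fence agent, and the &lt;tt&gt;ZFS&lt;/tt&gt; resource agent from the resource-agents package):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# Fencing first: the cluster must be able to power off a node (STONITH)
# before its zpool may be imported anywhere else
pcs stonith create fence-lola-5 fence_ipmilan \
    ipaddr=lola-5-ipmi login=admin passwd=XXXXX pcmk_host_list=lola-5

# Let the cluster, not the boot scripts, own the zpool import
pcs resource create ost3-pool ocf:heartbeat:ZFS pool=soaked-ost3 \
    op start timeout=90s op stop timeout=90s
pcs constraint location ost3-pool prefers lola-5=100
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;With the pools under cluster control, a pool moves to the partner node only after the failed node has been fenced, which is exactly the concurrent-access protection that is missing here.&lt;/p&gt;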

&lt;p&gt;Also, see patch &lt;a href=&quot;http://review.whamcloud.com/16611&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/16611&lt;/a&gt; &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7134&quot; title=&quot;Ensure ZFS hostid protection if servicenode/failover options given to mkfs.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7134&quot;&gt;&lt;del&gt;LU-7134&lt;/del&gt;&lt;/a&gt; utils: Ensure hostid set for ZFS during mkfs&quot;, which requires that /etc/hostid be set and would at least provide basic protection from pool import if the pool is in use on another node.&lt;/p&gt;
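&lt;p&gt;For reference, a minimal check on each OSS might look like this (a sketch; &lt;tt&gt;genhostid&lt;/tt&gt; is assumed to be available, as on RHEL):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# Each server needs a unique, persistent hostid for that protection
# to be effective
hostid                            # print the current 32-bit hostid
[ -s /etc/hostid ] || genhostid   # create /etc/hostid if it is missing
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;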

&lt;p&gt;You &lt;em&gt;may&lt;/em&gt; be able to recover the corrupted OST zpool, if it hasn&apos;t been mounted for a long time, by reverting to an older uberblock or snapshot that does not have the corruption in it.  See for example &lt;a href=&quot;http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script&lt;/a&gt; or &lt;a href=&quot;https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSMetadataRecovery&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSMetadataRecovery&lt;/a&gt; for details.  If that doesn&apos;t work, it will be necessary to reformat the filesystem.  However, at a minimum /etc/hostid should be set (and unique!) to prevent gratuitous imports, and failover between OSSes should be disabled until proper HA configuration is done.&lt;/p&gt;
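&lt;p&gt;A sketch of such a rewind attempt (&lt;tt&gt;soaked-ost3&lt;/tt&gt; used for illustration; assumes a ZFS-on-Linux release whose &lt;tt&gt;zpool import&lt;/tt&gt; supports the -F recovery mode and read-only imports):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# Dry run: report whether discarding the last few transactions would
# make the pool importable again (-n makes no on-disk changes)
zpool import -F -n soaked-ost3

# If it looks recoverable, import read-only first to inspect the data
zpool import -o readonly=on -F soaked-ost3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>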
                            <comment id="143150" author="heckes" created="Mon, 22 Feb 2016 09:53:16 +0000"  >&lt;p&gt;Andreas: Many thanks for the pointers. I&apos;ll try to fix the OSTs and enhance the soak framework to follow the HA best practices for ZFS.&lt;/p&gt;</comment>
                            <comment id="143496" author="heckes" created="Wed, 24 Feb 2016 09:38:52 +0000"  >&lt;p&gt;I think this ticket can be closed. The error was caused by the node set-up not reflecting the zfs constraints (aka importing the same zpool (OST) simultaneously on two nodes (OSSes)).&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="34806">LU-7798</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="33790">LU-7585</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="32043">LU-7134</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="20453" name="console-lola-5.log.bz2" size="251893" author="heckes" created="Fri, 19 Feb 2016 15:44:33 +0000"/>
                            <attachment id="20454" name="messages-lola-5.log.bz2" size="205272" author="heckes" created="Fri, 19 Feb 2016 15:44:33 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzy1w7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>