<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:34:21 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3488] zfs mdt backup/recovery via snapshot</title>
                <link>https://jira.whamcloud.com/browse/LU-3488</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I am testing recovering a failed metadata server via zfs snapshot. &lt;/p&gt;

&lt;p&gt;Everything seems to go as expected, but when trying to mount the filesystem, whether via the /etc/init.d/lustre start script or manually, I get an error about the backing filesystem:&lt;/p&gt;

&lt;p&gt;[root@lustre2-8-25 ~]# mount -vt lustre lustre-meta/meta /mnt/lustre/local/cove-MDT0000&lt;br/&gt;
arg[0] = /sbin/mount.lustre&lt;br/&gt;
arg[1] = -v&lt;br/&gt;
arg[2] = -o&lt;br/&gt;
arg[3] = rw&lt;br/&gt;
arg[4] = lustre-meta/meta&lt;br/&gt;
arg[5] = /mnt/lustre/local/cove-MDT0000&lt;br/&gt;
source = lustre-meta/meta (lustre-meta/meta), target = /mnt/lustre/local/cove-MDT0000&lt;br/&gt;
options = rw&lt;br/&gt;
checking for existing Lustre data: not found&lt;br/&gt;
mount.lustre: lustre-meta/meta has not been formatted with mkfs.lustre or the backend filesystem type is not supported by this tool&lt;/p&gt;

&lt;p&gt;I can manually mount it directly (POSIX, I guess) to the filesystem with &apos;zfs mount&apos; - and things look about right. &lt;/p&gt;

&lt;p&gt;I have tried pawing through the init scripts but haven&apos;t found anything.&lt;/p&gt;

&lt;p&gt;This could perhaps just be a documentation note and I&apos;m missing something easy, which is why I&apos;m filing this as a &apos;story&apos;.  I see there are some issues open for ZFS Lustre documentation; I just don&apos;t know what to do, and information is scarce.&lt;/p&gt;

&lt;p&gt;Or, if this isn&apos;t supported, there&apos;s always &apos;dd&apos;? &lt;/p&gt;

&lt;p&gt;Scott&lt;/p&gt;</description>
                <environment>RHEL6.2</environment>
        <key id="19502">LU-3488</key>
            <summary>zfs mdt backup/recovery via snapshot</summary>
                <type id="6" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11315&amp;avatarType=issuetype">Story</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="sknolin">Scott Nolin</reporter>
                        <labels>
                    </labels>
                <created>Thu, 20 Jun 2013 20:42:58 +0000</created>
                <updated>Mon, 24 Jun 2013 15:16:38 +0000</updated>
                            <resolved>Mon, 24 Jun 2013 15:16:38 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="60949" author="adilger" created="Thu, 20 Jun 2013 22:28:40 +0000"  >&lt;p&gt;I don&apos;t think &quot;dd&quot; backups would work for ZFS.  The recommended backup method for ZFS (which I now realize is probably not documented in the Lustre manual, and I filed &lt;a href=&quot;https://jira.whamcloud.com/browse/LUDOC-161&quot; title=&quot;document backup/restore process for ZFS backing filesystems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LUDOC-161&quot;&gt;&lt;del&gt;LUDOC-161&lt;/del&gt;&lt;/a&gt; for this) is to use &quot;&lt;tt&gt;zfs dump&lt;/tt&gt;&quot; and &quot;&lt;tt&gt;zfs restore&lt;/tt&gt;&quot;.&lt;/p&gt;

&lt;p&gt;That said, I think it should be possible to mount a ZFS snapshot, assuming that the primary version of the filesystem is not mounted.&lt;/p&gt;

&lt;p&gt;It might be useful to run &lt;tt&gt;mount.lustre&lt;/tt&gt; directly under gdb (or another debugger) or via strace to see where the error is coming from.  It appears that this is failing in &lt;tt&gt;osd_is_lustre-&amp;gt;zfs_is_lustre&lt;/tt&gt;, so either it cannot open the dataset, or the dataset parameters were not copied with the snapshot?  It may be that this relates to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2190&quot; title=&quot;failure on conf-sanity.sh test_49: Different LDLM_TIMEOUT:6 20 20&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2190&quot;&gt;&lt;del&gt;LU-2190&lt;/del&gt;&lt;/a&gt; &lt;a href=&quot;http://review.whamcloud.com/5220&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/5220&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="60953" author="sknolin" created="Fri, 21 Jun 2013 00:00:39 +0000"  >&lt;p&gt;Is &quot;zfs dump&quot; some extra lustre specific utility? I&apos;ve never encountered that one elsewhere with standard zfs, and the zfs command and man page on my lustre 2.4 install don&apos;t indicate it&apos;s an option.&lt;/p&gt;


&lt;p&gt;Scott&lt;/p&gt;</comment>
                            <comment id="61008" author="prakash" created="Fri, 21 Jun 2013 16:34:05 +0000"  >&lt;p&gt;Andreas, perhaps you meant &lt;tt&gt;zfs send&lt;/tt&gt; and &lt;tt&gt;zfs receive&lt;/tt&gt;?&lt;/p&gt;

&lt;p&gt;Scott, can you elaborate a bit more on what you mean by &quot;I am testing recovering a failed metadata server via zfs snapshot&quot;? I&apos;ve run into the issue where the &quot;osd is zfs&quot; check fails before, but I can&apos;t recall offhand what the problem was or how I fixed it. I think I opened a ticket on that, give me a minute to search for it.&lt;/p&gt;

&lt;p&gt;EDIT: This appears to be the ticket I was thinking of: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2333&quot; title=&quot;Cannot mount a MGS device backed by ZFS if &amp;quot;--fsname&amp;quot; was not passed to mkfs.lustre&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2333&quot;&gt;&lt;del&gt;LU-2333&lt;/del&gt;&lt;/a&gt;. Make sure the properties are correct on the dataset you are trying to mount. What does &quot;&lt;tt&gt;sudo zfs get all lustre-meta/meta | grep lustre:&lt;/tt&gt;&quot; say?&lt;/p&gt;</comment>
                            <comment id="61030" author="sknolin" created="Fri, 21 Jun 2013 18:44:54 +0000"  >&lt;p&gt;Prakash,&lt;/p&gt;

&lt;p&gt;I took a snapshot of my zfs-backed mgs/mdt (combined device), and used send/receive to save it on another server.  I then corrupted my mdt filesystem, rebuilt the filesystem, and attempted to use send/receive with the snapshot to recover it. &quot;service lustre start&quot; fails the mount; I then tried the mount command by hand, as listed above. I&apos;ll include my notes from that below.&lt;/p&gt;

&lt;p&gt;Anyway, looking at the properties with &quot;zfs get all&quot; as you suggested does show that the &quot;lustre:&quot; properties are missing. I can try setting those with &quot;zfs set&quot; - the only one I&apos;m not sure of is lustre:flags. My OSTs all have &quot;34&quot;.&lt;/p&gt;

&lt;p&gt;Here are my notes from my test process:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;- Make snapshot

zfs snapshot -r lustre-meta@meta-backup

- send and store entire volume and filesystem (-R) on another zfs server (oss in this case)

zfs send -R lustre-meta@meta-backup | ssh lustre2-8-11 zfs receive lustre-ost0/lustre-meta

- Rebuilt the MDS filesystem. I went to the oss holding the snapshot, and put it in a gzip file, so I could just work with it locally on the mds.

(on oss)
zfs send lustre-ost0/lustre-meta/meta@meta-backup | gzip &amp;gt; /root/meta-snap.gz
 
So to recover (on the mds)
 
gunzip -c /root/meta-snap.gz | zfs receive lustre-meta/meta-new@recover

I noticed it builds a mount in /lustre-meta/meta-new that we don&apos;t want

umount /lustre-meta/meta-new
rm -rf /lustre-meta/meta-new

zfs rename lustre-meta/meta lustre-meta/meta-old
zfs rename lustre-meta/meta-new lustre-meta/meta
zfs destroy lustre-meta/meta-old

service lustre stop
service lustre start
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="61032" author="sknolin" created="Fri, 21 Jun 2013 19:01:50 +0000"  >&lt;p&gt;I tried setting the lustre: properties with &apos;zfs set&apos; and it didn&apos;t seem to help. As a WAG I tried flags of &quot;34&quot; and &quot;100&quot; (based on what I saw elsewhere).&lt;/p&gt;

&lt;p&gt;So the zfs get results now show:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lustre-meta/meta  lustre:fsname         cove                   local
lustre-meta/meta  lustre:mgsnode        172.16.24.12@o2ib      local
lustre-meta/meta  lustre:flags          34                     local
lustre-meta/meta  lustre:version        1                      local
lustre-meta/meta  lustre:index          0                      local
lustre-meta/meta  lustre:svname         cove-MDT0000           local
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It&apos;s not important that I recover this filesystem; I just want to work out a procedure. So I will gladly test any ideas or confirm what should work (perhaps to help with the documentation effort).&lt;/p&gt;

&lt;p&gt;Scott&lt;/p&gt;</comment>
                            <comment id="61035" author="behlendorf" created="Fri, 21 Jun 2013 20:17:10 +0000"  >&lt;p&gt;Scott,&lt;/p&gt;

&lt;p&gt;You can preserve the properties by adding the -p option to &apos;zfs send&apos;.  Alternatively, you can use the -R option for zfs send, which will create a replication stream and preserve everything.  See the &apos;zfs send&apos; section of the zfs(8) man page.&lt;/p&gt;</comment>
                            <comment id="61036" author="prakash" created="Fri, 21 Jun 2013 20:30:55 +0000"  >&lt;p&gt;Try the &quot;-p&quot; option when sending the backup to your gzip archive. I tried this locally in a VM, and it worked for me.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;zfs send -p lustre-ost0/lustre-meta/meta@meta-backup | gzip &amp;gt; /root/meta-snap.gz
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This will preserve the original dataset properties, so you don&apos;t have to set them manually later on. Side note: you also need to do this in the original send of the dataset you back up, but &quot;-R&quot; automatically does this for you.&lt;/p&gt;</comment>
                            <comment id="61041" author="sknolin" created="Fri, 21 Jun 2013 21:21:02 +0000"  >&lt;p&gt;Thank you Brian and Prakash.&lt;/p&gt;

&lt;p&gt;I used &quot;-R&quot; in my initial zfs send, but failed to use that or the -p flag in my send to gzip. &lt;/p&gt;

&lt;p&gt;I simply resent (ssh to the meta this time) with the proper &apos;-p&apos; flag, and now all appears well; lustre is recovering.&lt;/p&gt;

&lt;p&gt;I couldn&apos;t get it to mount by setting properties - the &quot;lustre:flags&quot; on the now-mounted filesystem is &quot;37&quot;. I wonder if that was the issue; I have no insight into the flags property.&lt;/p&gt;

&lt;p&gt;So this is resolved.&lt;/p&gt;

&lt;p&gt;For what it&apos;s worth, I&apos;ll write up my procedure and add it to a comment on &lt;a href=&quot;https://jira.whamcloud.com/browse/LUDOC-161&quot; title=&quot;document backup/restore process for ZFS backing filesystems&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LUDOC-161&quot;&gt;&lt;del&gt;LUDOC-161&lt;/del&gt;&lt;/a&gt; in case that helps.&lt;/p&gt;

&lt;p&gt;Thanks again,&lt;br/&gt;
Scott&lt;/p&gt;</comment>
                            <comment id="61062" author="adilger" created="Sat, 22 Jun 2013 04:58:11 +0000"  >&lt;p&gt;My bad, yes I meant &quot;zfs send&quot; and &quot;zfs receive&quot;.  Glad you figured out the details with sending the pool parameters. &lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>zfs</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvtpj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8770</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>