<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:58:18 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6218] osd-zfs: increase redundancy for OST meta data</title>
                <link>https://jira.whamcloud.com/browse/LU-6218</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;A site had two last_rcvd files corrupted on two OSTs. They were able to truncate the files and the OSTs mounted OK. But I wonder whether we could increase data redundancy for meta data such as the last_rcvd file, to make it harder to corrupt in the first place (or more accurately to make it easier for scrub to repair it should it ever get corrupted).&lt;/p&gt;

&lt;p&gt;The OIs already get two copies of their data blocks as they are ZAPs. But other meta data like last_rcvd gets only one copy of the data. The copies property can only be applied at per-filesystem granularity. We could put those files under a separate dataset, e.g. lustre-ost1/ost1/META, and set copies=2 for it. But it&apos;d complicate the code as there&apos;d then be two datasets per OST.&lt;/p&gt;</description>
                <environment></environment>
        <key id="28583">LU-6218</key>
            <summary>osd-zfs: increase redundancy for OST meta data</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="isaac">Isaac Huang</assignee>
                                    <reporter username="isaac">Isaac Huang</reporter>
                        <labels>
                            <label>zfs</label>
                    </labels>
                <created>Fri, 6 Feb 2015 06:32:35 +0000</created>
                <updated>Sun, 31 Jul 2016 13:52:48 +0000</updated>
                            <resolved>Thu, 20 Aug 2015 13:18:03 +0000</resolved>
                                                    <fixVersion>Lustre 2.8.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="106209" author="adilger" created="Sun, 8 Feb 2015 09:38:39 +0000"  >&lt;p&gt;Since the OST has direct access to the DMU code, is it possible to somehow flag the dnode at create or write time to generate a ditto block copy only on that inode? I&apos;ve long thought that this would be useful for last_rcvd, OI files, config files, etc. Just writing two copies of these files at the OSD level wouldn&apos;t actually solve this problem, because ZFS wouldn&apos;t know they are copies and couldn&apos;t automatically repair one on error.&lt;/p&gt;</comment>
                            <comment id="106407" author="isaac" created="Tue, 10 Feb 2015 05:12:21 +0000"  >&lt;p&gt;I just looked at DMU code and it seemed like all we&apos;d need to do is to create these objects with type &lt;em&gt;DMU_OTN_UINT8_METADATA&lt;/em&gt;, instead of &lt;em&gt;DMU_OT_PLAIN_FILE_CONTENTS&lt;/em&gt; (see &lt;em&gt;dmu_write_policy()&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;At osd-zfs, can I identify such objects by using conditional &lt;em&gt;&quot;fid_seq(fid) == FID_SEQ_LOCAL_FILE&quot;&lt;/em&gt;?&lt;/p&gt;</comment>
                            <comment id="106608" author="adilger" created="Wed, 11 Feb 2015 10:08:19 +0000"  >&lt;p&gt;Would this make these objects inaccessible if mounted directly via ZPL?  That would make last_rcvd, LAST_ID, CATALOGS, other llog files, etc. inaccessible for local mounts, which would break one of the important ZFS compatibility features that we&apos;ve kept to be able to mount the filesystem.  I didn&apos;t see any obvious checks for this in the zpl_read() or zpl_write() code paths, but I do see such a check in &lt;tt&gt;zfs_unlinked_drain()&lt;/tt&gt;:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;                ASSERT((doi.doi_type == DMU_OT_PLAIN_FILE_CONTENTS) ||
                    (doi.doi_type == DMU_OT_DIRECTORY_CONTENTS));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;but I don&apos;t think we will encounter this in practice since Lustre rarely deletes internal file objects except llog files.&lt;/p&gt;

&lt;p&gt;The DMU_OTN_UINT8_METADATA type definitely shouldn&apos;t be used for FID_SEQ_LOCAL ZAPs, at most it should only be used for regular files.&lt;/p&gt;</comment>
                            <comment id="106739" author="isaac" created="Thu, 12 Feb 2015 01:25:21 +0000"  >&lt;p&gt;Looks like we&apos;d hit that assertion only if:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;A DMU_OTN_UINT8_METADATA object is removed, either by ZPL or osd-zfs (with the upcoming &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5242&quot; title=&quot;Test hang sanity test_132, test_133: umount ost&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5242&quot;&gt;&lt;del&gt;LU-5242&lt;/del&gt;&lt;/a&gt; fix - and I can just directly free such objects in that patch to eliminate this possibility).&lt;/li&gt;
	&lt;li&gt;Before it&apos;s actually freed, the system crashes (or ZPL is forcibly unmounted), so the object stays in the ZPL delete queue.&lt;/li&gt;
	&lt;li&gt;The dataset is mounted by ZPL (not read-only)&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;I&apos;ll experiment a bit with a simple patch.&lt;/p&gt;</comment>
                            <comment id="106746" author="gerrit" created="Thu, 12 Feb 2015 01:46:55 +0000"  >&lt;p&gt;Isaac Huang (he.huang@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13741&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13741&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6218&quot; title=&quot;osd-zfs: increase redundancy for OST meta data&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6218&quot;&gt;&lt;del&gt;LU-6218&lt;/del&gt;&lt;/a&gt; osd-zfs: more ditto copies&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 267c12243093e1fd2c92f222a6bac0167986483b&lt;/p&gt;</comment>
                            <comment id="106760" author="isaac" created="Thu, 12 Feb 2015 04:40:04 +0000"  >&lt;p&gt;Everything looked fine. The files showed up in the ZPL name space and I was able to r/w them. And &lt;em&gt;zdb&lt;/em&gt; showed 2 copies of data blocks:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@eagle-44vm1 ost1]# ls -li last_rcvd 
143 -rw-r--r-- 1 root root 8448 Dec 31  1969 last_rcvd
[root@eagle-44vm1 ost1]# zdb -e -dddddd lustre-ost1/ost1 143
......
    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       143    1    16K   128K  9.00K   128K  100.00  uint8 (K=inherit) (Z=inherit)
Indirect blocks:
               0 L0 0:25f93600:1200 0:3005dcc00:1200 20000L/1200P F=1 B=646/646
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then I removed last_rcvd, umount, and mount again - didn&apos;t hit the assertion in zfs_unlinked_drain(). So it was removed from the delete queue and freed before umount. I also tested &lt;em&gt;zfs send/recv&lt;/em&gt; and it worked fine.&lt;/p&gt;</comment>
                            <comment id="106782" author="adilger" created="Thu, 12 Feb 2015 08:03:46 +0000"  >&lt;p&gt;Great.&lt;/p&gt;</comment>
                            <comment id="108804" author="isaac" created="Wed, 4 Mar 2015 21:40:39 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=adilger&quot; class=&quot;user-hover&quot; rel=&quot;adilger&quot;&gt;adilger&lt;/a&gt; Do you happen to know what size fs_log_size() in test-framework.sh returns? I&apos;m wondering whether I should double the size returned for osd-zfs, but I couldn&apos;t figure out what size fs_log_size() was actually returning.&lt;/p&gt;</comment>
                            <comment id="110613" author="gerrit" created="Wed, 25 Mar 2015 14:46:34 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/13741/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13741/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6218&quot; title=&quot;osd-zfs: increase redundancy for OST meta data&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6218&quot;&gt;&lt;del&gt;LU-6218&lt;/del&gt;&lt;/a&gt; osd-zfs: increase redundancy for meta data&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: d9e86108724c06e3e6d25081caaf5803abf4416c&lt;/p&gt;</comment>
                            <comment id="110909" author="morrone" created="Fri, 27 Mar 2015 20:15:55 +0000"  >&lt;p&gt;Are there any performance implications from this change?  Performance is already a problem on MDTs.  This redundancy applies there as well, yes?  Is the impact reasonable enough to make this the default there?&lt;/p&gt;</comment>
                            <comment id="120547" author="pjones" created="Tue, 7 Jul 2015 11:37:35 +0000"  >&lt;p&gt;Isaac&lt;/p&gt;

&lt;p&gt;Are you able to answer Chris&apos;s question about performance?&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="120759" author="isaac" created="Wed, 8 Jul 2015 20:32:54 +0000"  >&lt;p&gt;The patch added one additional copy for data blocks of a small number of small files, e.g. last_rcvd. The added overhead is trivial compared to the OIs which already get an additional copy.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx5t3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>17393</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>