<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:10:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-763] DIO write does not force sync journal commit on OST</title>
                <link>https://jira.whamcloud.com/browse/LU-763</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When sgp_dd writes to a Lustre file with async journal commits enabled, the JBD history shows that journal entries are committed only at the timeout interval, not synchronously on each write. This will cause data loss if the OST crashes after the data is written but before the journal is flushed.&lt;/p&gt;</description>
                <environment></environment>
        <key id="12128">LU-763</key>
            <summary>DIO write does not force sync journal commit on OST</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="eeb">Eric Barton</reporter>
                        <labels>
                    </labels>
                <created>Fri, 14 Oct 2011 10:53:49 +0000</created>
                <updated>Mon, 29 May 2017 02:44:54 +0000</updated>
                            <resolved>Mon, 29 May 2017 02:44:54 +0000</resolved>
                                    <version>Lustre 1.8.6</version>
                                                        <due></due>
                            <votes>1</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="21315" author="pjones" created="Fri, 14 Oct 2011 11:06:52 +0000"  >&lt;p&gt;Oleg&lt;/p&gt;

&lt;p&gt;can you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="21317" author="adilger" created="Fri, 14 Oct 2011 11:14:24 +0000"  >&lt;p&gt;Eric, the clients will do bulk RPC recovery if the io does not commit, because the transno for the RPC is not reported as committed yet. &lt;/p&gt;

&lt;p&gt;This is needed for ZFS also, which would suffer terribly if it had to do sync writes w/o the ZIL. &lt;/p&gt;</comment>
                            <comment id="21321" author="johann" created="Fri, 14 Oct 2011 12:20:03 +0000"  >&lt;p&gt;hm, weird, OBD_BRW_ASYNC is not set for dios, so we should really be triggering a journal flush &amp;amp; wait.&lt;/p&gt;</comment>
                            <comment id="21323" author="eeb" created="Fri, 14 Oct 2011 12:39:10 +0000"  >&lt;p&gt;Sorry, my bad: I forgot to state the whole point of this bug, which is that sgp_dd had &quot;dio=1&quot; set, i.e. the client was doing O_DIRECT writes, which should force sync journal writes since the client cannot replay such writes.&lt;/p&gt;</comment>
                            <comment id="21330" author="green" created="Fri, 14 Oct 2011 20:26:43 +0000"  >&lt;p&gt;I just performed a number of tests and I don&apos;t think I can reproduce this at all. I tried on 1.8.6-wc and 1.8.7-rc1.&lt;/p&gt;

&lt;p&gt;Here is my testcase:&lt;br/&gt;
mount lustre, echo 0 to /proc/fs/lustre/obdfilter/*/sync_journal&lt;br/&gt;
dd if=/dev/zero of=/mnt/lustre/file bs=1024k count=100 oflag=direct # this is directio write&lt;br/&gt;
dd if=/dev/zero of=/mnt/lustre/file1 bs=1024k count=100 # this is normal write&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Both writes happened to go to the same OST. Checking the journal history, we see a bunch of small transactions for the directio write, followed by just a single transaction for the non-directio write.&lt;br/&gt;
R/C  tid   wait  run   lock  flush log   hndls  block inlog ctime write drop  close&lt;br/&gt;
R    21    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    22    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    23    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    24    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    25    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    26    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    27    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    28    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    29    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    30    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    31    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    32    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    33    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    34    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    35    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    36    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    37    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    38    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    39    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    40    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    41    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    42    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    43    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    44    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    45    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    46    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    47    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    48    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    49    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    50    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    51    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    52    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    53    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    54    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    55    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    56    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    57    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    58    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    59    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    60    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    61    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    62    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    63    0     0     0     0     100   1      4     5    &lt;br/&gt;
R    64    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    65    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    66    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    67    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    68    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    69    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    70    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    71    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    72    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    73    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    74    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    75    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    76    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    77    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    78    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    79    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    80    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    81    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    82    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    83    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    84    0     0     0     0     10    1      4     5    &lt;br/&gt;
R    85    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    86    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    87    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    88    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    89    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    90    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    91    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    92    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    93    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    94    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    95    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    96    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    97    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    98    0     0     0     0     0     1      4     5    &lt;br/&gt;
R    99    0     10    0     0     0     1      4     5    &lt;br/&gt;
R    100   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    101   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    102   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    103   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    104   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    105   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    106   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    107   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    108   0     0     0     0     10    1      4     5    &lt;br/&gt;
R    109   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    110   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    111   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    112   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    113   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    114   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    115   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    116   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    117   0     0     0     0     0     1      4     5    &lt;br/&gt;
R    118   0     10    0     0     0     1      4     5    &lt;br/&gt;
R    119   0     5010  0     0     10    117    6     7    &lt;/li&gt;
&lt;/ol&gt;


</comment>
                            <comment id="21335" author="eeb" created="Sun, 16 Oct 2011 07:59:57 +0000"  >&lt;p&gt;Shame there aren&apos;t timestamps in the transaction log to prove that the non-dio I/O all happened after the last small transaction.&lt;/p&gt;

&lt;p&gt;The program doing the writing when we observed DIO writes not doing sync journal commits was sgp_dd running as follows...&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;sgp_dd time=1 bs=512 bpt=2048 thr=16 dio=1 of=/mnt/lustre/ost0/tf2.out if=/dev/zero count=8192000 &amp;amp;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;...and the file was in a directory where the default layout was 1 stripe on ost0.  I can&apos;t imagine why any of that would make a difference, but I don&apos;t think we can rule it out.&lt;/p&gt;

&lt;p&gt;For the full evidence, look in the commitT-OUT5sec.tar.bz2 attached to NTAP-3.  The transaction history is printed in writeJBDstats.before.out and writeJBDstats.after.out, and you can compare the two to see what occurred during the test run.  See runWrite8OSTs.ksh for what got run.&lt;/p&gt;

&lt;p&gt;&amp;lt;Note - edited this comment because commitT-OUT10sec.tar.bz2 didn&apos;t set dio=1 in the sgp_dd commands run&amp;gt;&lt;/p&gt;</comment>
                            <comment id="21338" author="green" created="Sun, 16 Oct 2011 13:40:17 +0000"  >&lt;p&gt;I just checked sgp_dd man page here: &lt;a href=&quot;http://linux.die.net/man/8/sgp_dd&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://linux.die.net/man/8/sgp_dd&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The man page tells me that sgp_dd defaults to non-direct I/O; if you want direct I/O you must pass the dio=1 flag, which seems to be absent from the command line provided.&lt;/p&gt;

&lt;p&gt;Regarding the timestamps in the transaction history, you can fully trust me that I did check the history in between the dd commands (and I did a sync beforehand too). At the completion of the directio dd there were only short transactions in the log; the long one was added after the normal I/O dd was run.&lt;/p&gt;</comment>
                            <comment id="197338" author="adilger" created="Mon, 29 May 2017 02:44:54 +0000"  >&lt;p&gt;Close old ticket.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw2cn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10463</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>