<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:02:10 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6664] (ost_handler.c:1765:ost_blocking_ast()) Error -2 syncing data on lock cancel</title>
                <link>https://jira.whamcloud.com/browse/LU-6664</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;On all of our filesystems, the following error message is extremely common:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 8746:0:(ost_handler.c:1776:ost_blocking_ast()) Error -2 syncing data on lock cancel
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There is nothing else in the logs that gives any hint as to why this message is appearing.&lt;/p&gt;

&lt;p&gt;Our filesystems all use osd-zfs, and we are currently running Lustre 2.5.3-5chaos (see github.com/chaos/lustre).&lt;/p&gt;

&lt;p&gt;If this is a symptom of a bug, then please fix it.  If this is not a symptom of a bug, then please stop scaring our system administrators with this message.&lt;/p&gt;</description>
                <environment></environment>
        <key id="30429">LU-6664</key>
            <summary>(ost_handler.c:1765:ost_blocking_ast()) Error -2 syncing data on lock cancel</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Fri, 29 May 2015 19:20:16 +0000</created>
                <updated>Wed, 27 Jul 2016 18:54:19 +0000</updated>
                            <resolved>Mon, 31 Aug 2015 21:14:46 +0000</resolved>
                                    <version>Lustre 2.5.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="116914" author="pjones" created="Fri, 29 May 2015 20:42:26 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this issue?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="116922" author="green" created="Fri, 29 May 2015 21:10:03 +0000"  >&lt;p&gt;Bobi: This message means we got ENOENT while supposedly trying to flush data on lock cancel. But technically, if we have a lock, the object should be there (the only exception I can think of is the actual object destroy happening under the lock, so at the end of destroy the lock is still there and the object is not, but then there should be nothing to flush).&lt;br/&gt;
So can you please examine the server-side lock-cancel code to see if there are any possible races that could lead to this message.&lt;/p&gt;</comment>
                            <comment id="117065" author="bobijam" created="Mon, 1 Jun 2015 16:35:03 +0000"  >&lt;p&gt;Chris,&lt;/p&gt;

&lt;p&gt;Did your system just undergo recovery before these messages appeared?&lt;/p&gt;</comment>
                            <comment id="117089" author="morrone" created="Mon, 1 Jun 2015 18:34:09 +0000"  >&lt;p&gt;No, it has not just undergone recovery.  There are no other messages in the logs surrounding these ost_blocking_ast() messages.&lt;/p&gt;</comment>
                            <comment id="117665" author="bobijam" created="Sat, 6 Jun 2015 04:17:45 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/15167&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15167&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;commit message&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LU-6664 ofd: LDLM lock should cover object destroy 

The exclusive PW lock protecting OST object destroy should be 
released after object destroy procedure. 

Quench error messages of object unavailability when trying to cancel 
a LDLM lock. 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="117868" author="green" created="Tue, 9 Jun 2015 04:16:07 +0000"  >&lt;p&gt;While reviewing this patch (the comments are in the patch), I remembered that LLNL had suspicions of double-referenced objects in the past (&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5648&quot; title=&quot;corrupt files contain extra data&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5648&quot;&gt;&lt;del&gt;LU-5648&lt;/del&gt;&lt;/a&gt;), where the same object was potentially referenced twice (that was never confirmed, though).&lt;br/&gt;
Having objects owned by two files would most likely lead to this message too.&lt;/p&gt;

&lt;p&gt;I wonder if lfsck in 2.5.4 is already in a good enough shape to be able to detect that.&lt;/p&gt;</comment>
                            <comment id="120485" author="adilger" created="Mon, 6 Jul 2015 20:55:09 +0000"  >&lt;p&gt;The LFSCK in 2.5.x does not check MDT&amp;lt;-&amp;gt;OST consistency.  That feature (&quot;lctl lfsck_start -t layout&quot;) wasn&apos;t added until 2.6.0.&lt;/p&gt;</comment>
                            <comment id="124166" author="adilger" created="Fri, 14 Aug 2015 17:08:52 +0000"  >&lt;p&gt;Bobijam, can you please also make a version of your patch for master?&lt;/p&gt;</comment>
                            <comment id="124224" author="gerrit" created="Sat, 15 Aug 2015 02:32:31 +0000"  >&lt;p&gt;Bobi Jam (bobijam@hotmail.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/15997&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/15997&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6664&quot; title=&quot;(ost_handler.c:1765:ost_blocking_ast()) Error -2 syncing data on lock cancel&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6664&quot;&gt;&lt;del&gt;LU-6664&lt;/del&gt;&lt;/a&gt; ofd: LDLM lock should cover object destroy&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4575d04887bfd2a78a5a0340841d2da6ef23c165&lt;/p&gt;</comment>
                            <comment id="124815" author="adilger" created="Fri, 21 Aug 2015 18:00:40 +0000"  >&lt;p&gt;Oleg and I looked into this issue more closely, and the current patch doesn&apos;t really solve the problem, since the race is when the two destroy threads are getting and dropping the DLM lock, and not when the actual destroy is happening.  In master, the equivalent function &lt;tt&gt;tgt_blocking_ast()&lt;/tt&gt; already has a check for &lt;tt&gt;dt_object_exists()&lt;/tt&gt; and skips the call into &lt;tt&gt;ofd_sync()&lt;/tt&gt; that generates this message completely.&lt;/p&gt;

&lt;p&gt;I think the right fix (for 2.5.x only) is to just skip this message for &lt;tt&gt;rc == -ENOENT&lt;/tt&gt; as is already done in master.&lt;/p&gt;</comment>
                            <comment id="124904" author="ahkumar" created="Mon, 24 Aug 2015 16:37:33 +0000"  >&lt;p&gt;I too see tons of these error messages. Any help in resolving them would be very helpful. I can provide debug logs if required. The message appears consistently on most of the OSSs.&lt;/p&gt;

&lt;p&gt;LustreError: 7577:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel&lt;br/&gt;
LustreError: 25058:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel&lt;br/&gt;
LustreError: 1634:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel&lt;br/&gt;
LustreError: 1634:0:(ost_handler.c:1764:ost_blocking_ast()) Skipped 1 previous similar message&lt;br/&gt;
LustreError: 25058:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel&lt;br/&gt;
LustreError: 25058:0:(ost_handler.c:1764:ost_blocking_ast()) Skipped 2 previous similar messages&lt;br/&gt;
LustreError: 33552:0:(ost_handler.c:1764:ost_blocking_ast()) Error -2 syncing data on lock cancel&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Amit&lt;/p&gt;</comment>
                            <comment id="125801" author="pjones" created="Mon, 31 Aug 2015 21:14:46 +0000"  >&lt;p&gt;As per LLNL ok to close&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="31485">LU-7007</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="27313">LU-5805</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="32670">LU-7308</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxein:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>