<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:16:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1466] Hyperion DAT - IOR ssf - client eviction</title>
                <link>https://jira.whamcloud.com/browse/LU-1466</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Running IOR, single-shared file, a single client is always evicted for a blocking callback by OSS. OSS and client debug logs attached.&lt;/p&gt;</description>
                <environment></environment>
        <key id="14694">LU-1466</key>
            <summary>Hyperion DAT - IOR ssf - client eviction</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="green">Oleg Drokin</assignee>
                                    <reporter username="cliffw">Cliff White</reporter>
                        <labels>
                    </labels>
                <created>Fri, 1 Jun 2012 15:41:31 +0000</created>
                <updated>Fri, 22 Jun 2012 10:04:13 +0000</updated>
                            <resolved>Fri, 22 Jun 2012 10:04:13 +0000</resolved>
                                    <version>Lustre 2.1.2</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="39844" author="cliffw" created="Fri, 1 Jun 2012 15:44:33 +0000"  >&lt;p&gt;Debug logs are on FTP, uploads&lt;br/&gt;
 dat.fail.client617.gz&lt;br/&gt;
 dat.fail.oss32.log.gz&lt;/p&gt;</comment>
                            <comment id="39847" author="cliffw" created="Fri, 1 Jun 2012 16:19:32 +0000"  >&lt;p&gt;Repeated tests, repeated error, full debug and msgs uploaded to uploads&lt;br/&gt;
oss.dit29.debug.log.gz  oss.dit29.msg.gz  hy34.debug.log.gz  hy34.msg.gz&lt;/p&gt;</comment>
                            <comment id="39861" author="pjones" created="Sat, 2 Jun 2012 02:25:07 +0000"  >&lt;p&gt;Oleg&lt;/p&gt;

&lt;p&gt;What do you advise here?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="39866" author="cliffw" created="Sat, 2 Jun 2012 17:20:22 +0000"  >&lt;p&gt;I have repeated the test with the 2.2.54 tag, same errors.&lt;/p&gt;</comment>
                            <comment id="39955" author="green" created="Mon, 4 Jun 2012 17:33:44 +0000"  >&lt;p&gt;The second set of test logs contains the culprit.&lt;br/&gt;
The client got the cancel request:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00010000:1.0:1338581263.050526:0:22810:0:(ldlm_lockd.c:1503:ldlm_handle_bl_callback()) ### client blocking AST callback handler ns: lustre-OST002a-osc-ffff8101a4928800 lock: ffff8101c7918b40/0xc2c2e1a2df0fb82d lrc: 3/0,0 mode: PW/PW res: 36588/0 rrc: 6 type: EXT [4706009088-&amp;gt;4739563519] (req 4706009088-&amp;gt;4707057663) flags: 0x10100000 remote: 0x1272af3896302f40 expref: -99 pid: 25237 timeout 0
00010000:00010000:1.0:1338581263.050531:0:22810:0:(ldlm_lockd.c:1516:ldlm_handle_bl_callback()) Lock ffff8101c7918b40 already unused, calling callback (ffffffff88837f80)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Now I assume the bl callback just blocked on the cl_lock_mutex_get(env, lock); in osc_dlm_blocking_ast0() as there is basically nothing else to block on.&lt;/p&gt;

&lt;p&gt;Then there&apos;s no activity with this lock until finally the lock is cancelled:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00010000:1.0:1338581374.583277:0:22810:0:(ldlm_request.c:1030:ldlm_cli_cancel_local()) ### client-side cancel ns: lustre-OST002a-osc-ffff8101a4928800 lock: ffff8101c7918b40/0xc2c2e1a2df0fb82d lrc: 4/0,0 mode: PW/PW res: 36588/0 rrc: 6 type: EXT [4706009088-&amp;gt;4739563519] (req 4706009088-&amp;gt;4707057663) flags: 0x10102010 remote: 0x1272af3896302f40 expref: -99 pid: 25237 timeout 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This happens only after the client has already been evicted, in fact right after the eviction and the cancellation of a bunch of RPCs. None of those RPCs seems too old, though; all of them appeared well after we got the BL AST.&lt;br/&gt;
But something was certainly holding the lock mutex, and I am not sure what it might have been.&lt;/p&gt;

&lt;p&gt;This seems to be at least marginally related to LU-1274, which dealt with a similar issue in the glimpse callback. At the time it was decided that the server was progressing slowly through IO, but in fact it might be the client that hogs the lock after all.&lt;/p&gt;</comment>
                            <comment id="39966" author="jay" created="Mon, 4 Jun 2012 20:45:39 +0000"  >&lt;p&gt;The cancel process is not blocked at acquiring a lock mutex because I saw this line in the same log:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:00010000:1.0:1338581263.050536:0:22810:0:(cl_lock.c:143:cl_lock_trace0()) cancel lock: ffff810137d5f4b0@(1 ffff810228bc4080 1 5 0 0 0 0)(ffff81016091ea30/1/1) at cl_lock_cancel():1830
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This means it has already grabbed the lock mutex to call cl_lock_cancel().&lt;/p&gt;

&lt;p&gt;However, I don&apos;t know why it can&apos;t get through the canceling process. Maybe it was blocked on a page, given that no RPC was sent at all. Can you please show me the backtrace with higher-level debug information?&lt;/p&gt;</comment>
                            <comment id="40020" author="cliffw" created="Tue, 5 Jun 2012 11:08:11 +0000"  >&lt;p&gt;That is likely to be difficult in the short term; it may be possible after the next test cycle on Hyperion.&lt;/p&gt;</comment>
                            <comment id="41032" author="cliffw" created="Fri, 22 Jun 2012 10:04:13 +0000"  >&lt;p&gt;Current testing with 2.1.2 fails to reproduce this issue. &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/652c72e0-b9b5-11e1-9392-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/652c72e0-b9b5-11e1-9392-52540035b04c&lt;/a&gt; Closing.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvgwn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6388</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>