<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:51:55 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-5489] ll_ost thread stuck at lu_object_find_at</title>
                <link>https://jira.whamcloud.com/browse/LU-5489</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The attached file (service164.gz) has a complete trace of all threads.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;LNet: Service thread pid 8805 was inactive &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; debugging purposes:
Pid: 8805, comm: ll_ost03_040

Call Trace:
 [&amp;lt;ffffffffa04cf6fe&amp;gt;] cfs_waitq_wait+0xe/0x10 [libcfs]
 [&amp;lt;ffffffffa062c6b3&amp;gt;] lu_object_find_at+0xb3/0x360 [obdclass]
 [&amp;lt;ffffffff81063be0&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0e74cb9&amp;gt;] ? ofd_key_init+0x59/0x1a0 [ofd]
 [&amp;lt;ffffffffa062c976&amp;gt;] lu_object_find+0x16/0x20 [obdclass]
 [&amp;lt;ffffffffa0e886c5&amp;gt;] ofd_object_find+0x35/0xf0 [ofd]
 [&amp;lt;ffffffffa062d57e&amp;gt;] ? lu_env_init+0x1e/0x30 [obdclass]
 [&amp;lt;ffffffffa0e98649&amp;gt;] ofd_lvbo_update+0x6d9/0xea8 [ofd]
 [&amp;lt;ffffffffa0e7df77&amp;gt;] ofd_setattr+0x7e7/0xb80 [ofd]
 [&amp;lt;ffffffffa0e4ec1c&amp;gt;] ost_setattr+0x31c/0x990 [ost]
 [&amp;lt;ffffffffa0e52746&amp;gt;] ost_handle+0x21e6/0x48e0 [ost]
 [&amp;lt;ffffffffa04db124&amp;gt;] ? libcfs_id2str+0x74/0xb0 [libcfs]
 [&amp;lt;ffffffffa07c53b8&amp;gt;] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
 [&amp;lt;ffffffffa04cf5de&amp;gt;] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [&amp;lt;ffffffffa04e0d6f&amp;gt;] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
 [&amp;lt;ffffffffa07bc719&amp;gt;] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
 [&amp;lt;ffffffff81063be0&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa07c674e&amp;gt;] ptlrpc_main+0xace/0x1700 [ptlrpc]
 [&amp;lt;ffffffffa07c5c80&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffffa07c5c80&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [&amp;lt;ffffffffa07c5c80&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>source at &lt;a href=&quot;https://github.com/jlan/lustre-nas&quot;&gt;https://github.com/jlan/lustre-nas&lt;/a&gt;&lt;br/&gt;
running version 2.4.3-5.1nas</environment>
        <key id="26016">LU-5489</key>
            <summary>ll_ost thread stuck at lu_object_find_at</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="hongchao.zhang">Hongchao Zhang</assignee>
                                    <reporter username="mhanafi">Mahmoud Hanafi</reporter>
                        <labels>
                    </labels>
                <created>Thu, 14 Aug 2014 18:03:07 +0000</created>
                <updated>Thu, 2 Oct 2014 21:05:59 +0000</updated>
                            <resolved>Wed, 3 Sep 2014 14:34:02 +0000</resolved>
                                    <version>Lustre 2.4.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="91650" author="green" created="Thu, 14 Aug 2014 19:42:00 +0000"  >&lt;p&gt;I guess this is similar in nature to LU-4725, only this time in the ofd code. This should be a somewhat rare race.&lt;/p&gt;

&lt;p&gt;ofd_setattr does ofd_object_find and pins an object. Then some other thread destroys the object, and then ldlm_res_lvbo_update-&amp;gt;ofd_lvbo_update() does ofd_object_find, finds the now-destroyed object, and starts to wait until the references go away, but they cannot, because it&apos;s this same thread that&apos;s holding the reference.&lt;/p&gt;

&lt;p&gt;Technically the object should never be deleted, because we are supposed to hold an ldlm lock on it, but I imagine that if the lock was somehow lost (it is held by a client, so for example if the client was evicted), this might happen.&lt;/p&gt;

&lt;p&gt;For a fix, we need to add some sort of non-racy check to make sure the object is still alive before going into the lvbo update.&lt;/p&gt;
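
&lt;p&gt;The scenario above can be sketched as a toy model. This is a minimal Python sketch under stated assumptions, not Lustre code and not an actual fix: ObjectCache, find, put, destroy, is_alive and setattr_then_lvbo_update are hypothetical names, the destroy is issued inline to stand in for the other thread, and a timeout stands in for the indefinite wait.&lt;/p&gt;

```python
import threading


class ObjectCache:
    """Toy stand-in (hypothetical) for a cache whose lookups wait on dying objects."""

    def __init__(self):
        self._cond = threading.Condition()
        self._refs = {}      # object id: number of references currently held
        self._dying = set()  # ids destroyed while references were still held

    def find(self, oid, timeout=None):
        """Take a reference; if the object is dying, wait until it is freed first."""
        with self._cond:
            freed = self._cond.wait_for(lambda: oid not in self._dying, timeout)
            if not freed:
                return False  # timed out; the real thread would keep waiting
            self._refs[oid] = self._refs.get(oid, 0) + 1
            return True

    def put(self, oid):
        """Drop a reference; the last put frees a dying object and wakes waiters."""
        with self._cond:
            self._refs[oid] -= 1
            if self._refs[oid] == 0:
                del self._refs[oid]
                self._dying.discard(oid)
                self._cond.notify_all()

    def destroy(self, oid):
        """Mark the object as dying if somebody still holds a reference."""
        with self._cond:
            if self._refs.get(oid, 0) != 0:
                self._dying.add(oid)

    def is_alive(self, oid):
        with self._cond:
            return oid not in self._dying


def setattr_then_lvbo_update(cache, oid, check_alive):
    """Replay the sequence from the comment; returns 'updated', 'skipped' or 'stuck'."""
    cache.find(oid)     # the setattr path pins the object
    cache.destroy(oid)  # assumption: stands in for the destroy by another thread
    try:
        if check_alive and not cache.is_alive(oid):
            return "skipped"  # do not enter the update for a dying object
        if not cache.find(oid, timeout=0.1):  # second lookup by the same thread
            return "stuck"    # waiting on a reference this thread itself holds
        cache.put(oid)
        return "updated"
    finally:
        cache.put(oid)  # drop the pin taken at the start
```

&lt;p&gt;In this sketch, setattr_then_lvbo_update(cache, 1, check_alive=False) reports "stuck", while check_alive=True reports "skipped". A real check would also have to be atomic with respect to the destroy, which this single-threaded model does not attempt to show.&lt;/p&gt;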

&lt;p&gt;Looking in the logs, we can see there were indeed a bunch of evictions, so this is a plausible scenario.&lt;/p&gt;</comment>
                            <comment id="91790" author="pjones" created="Fri, 15 Aug 2014 22:57:27 +0000"  >&lt;p&gt;Hongchao&lt;/p&gt;

&lt;p&gt;Could you please look into the feasibility of reworking this code in the manner Oleg suggests?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="92057" author="green" created="Wed, 20 Aug 2014 15:40:48 +0000"  >&lt;p&gt;After some additional digging, I found that this is actually a dup of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4019&quot; title=&quot;today&amp;#39;s master stick on shutdown on test == sanity test 132: on lu_object_find_at&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4019&quot;&gt;&lt;del&gt;LU-4019&lt;/del&gt;&lt;/a&gt;. The patch from there should help you too; I verified that it applies to b2_4: &lt;a href=&quot;http://review.whamcloud.com/7795&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7795&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="15533" name="service164.gz" size="585782" author="mhanafi" created="Thu, 14 Aug 2014 18:03:07 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwtrj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>15312</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>