<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:57:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6085] racer stuck on mutex_lock in ll_setattr_raw()</title>
                <link>https://jira.whamcloud.com/browse/LU-6085</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;With stack trace of:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;chmod         D 0000000000000000     0 25015      1 0x00000000
 ffff880175afdca8 0000000000000086 ffff880175afdc88 ffffffffa077c842
 ffff880175afdc28 ffff880182b3d400 ffffffff8100b9ce ffff880175afdca8
 ffff88018162baf8 ffff880175afdfd8 000000000000fb88 ffff88018162baf8
Call Trace:
 [&amp;lt;ffffffffa077c842&amp;gt;] ? __req_capsule_get+0x162/0x6d0 [ptlrpc]
 [&amp;lt;ffffffff8100b9ce&amp;gt;] ? common_interrupt+0xe/0x13
 [&amp;lt;ffffffff810521eb&amp;gt;] ? mutex_spin_on_owner+0x9b/0xc0
 [&amp;lt;ffffffff8150fc5e&amp;gt;] __mutex_lock_slowpath+0x13e/0x180
 [&amp;lt;ffffffff8150fafb&amp;gt;] mutex_lock+0x2b/0x50
 [&amp;lt;ffffffffa0e92e5c&amp;gt;] ll_setattr_raw+0x58c/0x1ae0 [lustre]
 [&amp;lt;ffffffff81192a72&amp;gt;] ? user_path_at+0x62/0xa0
 [&amp;lt;ffffffffa0e94415&amp;gt;] ll_setattr+0x65/0xd0 [lustre]
 [&amp;lt;ffffffff8119ead8&amp;gt;] notify_change+0x168/0x340
 [&amp;lt;ffffffff8117ee13&amp;gt;] sys_fchmodat+0xc3/0x100
 [&amp;lt;ffffffff81186fc6&amp;gt;] ? sys_newstat+0x36/0x50
 [&amp;lt;ffffffff8151171e&amp;gt;] ? do_device_not_available+0xe/0x10
 [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It turned out that the inode mutex is already held by the current thread itself. The root cause is in ll_md_setattr(), which calls simple_setattr() even when setting the attribute on the MDT fails:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;                ptlrpc_req_finished(request);
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc == -ENOENT) {
                        clear_nlink(inode);
                        /* Unlinked special device node? Or just a race?
                         * Pretend we done everything. */
                        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!S_ISREG(inode-&amp;gt;i_mode) &amp;amp;&amp;amp;
                            !S_ISDIR(inode-&amp;gt;i_mode)) {
                                ia_valid = op_data-&amp;gt;op_attr.ia_valid;
                                op_data-&amp;gt;op_attr.ia_valid &amp;amp;= ~TIMES_SET_FLAGS;
                                rc = simple_setattr(dentry, &amp;amp;op_data-&amp;gt;op_attr);
                                op_data-&amp;gt;op_attr.ia_valid = ia_valid;
                        }
                } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc != -EPERM &amp;amp;&amp;amp; rc != -EACCES &amp;amp;&amp;amp; rc != -ETXTBSY) {
                        CERROR(&lt;span class=&quot;code-quote&quot;&gt;&quot;md_setattr fails: rc = %d\n&quot;&lt;/span&gt;, rc);
                }
                RETURN(rc);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Racer may try to change a SOCK file into a regular file, which will definitely fail. If that file happens to have been deleted, the client hits the ENOENT error and calls simple_setattr(), which changes the file&apos;s mode to regular file and in turn leaves the thread stuck in mutex_lock.&lt;/p&gt;
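
&lt;p&gt;The effect of the ENOENT fallback can be sketched in user space. This is only a model under stated assumptions: the model_* names and MODEL_ATTR_MODE are hypothetical stand-ins, not kernel or Lustre symbols, and the guarded variant is one possible client-side guard, not necessarily the fix that lands.&lt;/p&gt;

<![CDATA[
```c
#define _XOPEN_SOURCE 700 /* expose S_IFSOCK and friends under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical user-space stand-ins; NOT the kernel or Lustre types. */
#define MODEL_ATTR_MODE 0x1u

struct model_inode { mode_t i_mode; };
struct model_attr  { unsigned int ia_valid; mode_t ia_mode; };

/* Models a purely local attribute update: when MODEL_ATTR_MODE is set,
 * the whole mode (file type bits included) is copied into the inode. */
void model_local_setattr(struct model_inode *inode,
                         const struct model_attr *attr)
{
        assert(inode != NULL && attr != NULL);
        if (attr->ia_valid & MODEL_ATTR_MODE)
                inode->i_mode = attr->ia_mode;
}

/* Flawed flow: the server said -ENOENT, yet for special files the
 * attributes are still applied locally, so the type can change. */
int model_setattr_flawed(struct model_inode *inode,
                         const struct model_attr *attr, int server_rc)
{
        if (server_rc == -ENOENT &&
            !S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode)) {
                model_local_setattr(inode, attr);
                return 0;
        }
        return server_rc;
}

/* Guarded flow (assumption): drop the mode from the local fallback so
 * the cached file type can never change behind the caller's back. */
int model_setattr_guarded(struct model_inode *inode,
                          const struct model_attr *attr, int server_rc)
{
        if (server_rc == -ENOENT &&
            !S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode)) {
                struct model_attr local = *attr;

                local.ia_valid &= ~MODEL_ATTR_MODE;
                model_local_setattr(inode, &local);
                return 0;
        }
        return server_rc;
}
```
]]>

&lt;p&gt;Starting from an inode of type S_IFSOCK and a request carrying S_IFREG | 0644 with a server result of -ENOENT, the flawed model ends up with a regular-file inode, while the guarded model keeps the socket type.&lt;/p&gt;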

&lt;p&gt;I will push a patch to fix this issue.&lt;/p&gt;</description>
                <environment></environment>
        <key id="28061">LU-6085</key>
            <summary>racer stuck on mutex_lock in ll_setattr_raw()</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="jay">Jinshan Xiong</assignee>
                                    <reporter username="jay">Jinshan Xiong</reporter>
                        <labels>
                            <label>mq115</label>
                    </labels>
                <created>Wed, 7 Jan 2015 00:22:27 +0000</created>
                <updated>Fri, 16 Jan 2015 18:11:07 +0000</updated>
                            <resolved>Fri, 16 Jan 2015 18:11:07 +0000</resolved>
                                                    <fixVersion>Lustre 2.7.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="102709" author="jay" created="Wed, 7 Jan 2015 01:57:45 +0000"  >&lt;p&gt;This issue is more complex than I thought. After I made and applied a patch on the client side, I found that the MDT actually returns a different file type in the setattr reply. I patched my client code as follows:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;diff --git a/lustre/llite/llite_lib.c b/lustre/llite/llite_lib.c
index ee14f15..81d9906 100644
--- a/lustre/llite/llite_lib.c
+++ b/lustre/llite/llite_lib.c
@@ -1498,15 +1498,18 @@ &lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ll_md_setattr(struct dentry *dentry, struct md_op_data *
                ptlrpc_req_finished(request);
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc == -ENOENT) {
                        clear_nlink(inode);
+#&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; 0
                        /* Unlinked special device node? Or just a race?
                         * Pretend we done everything. */
                        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!S_ISREG(inode-&amp;gt;i_mode) &amp;amp;&amp;amp;
                            !S_ISDIR(inode-&amp;gt;i_mode)) {
                                ia_valid = op_data-&amp;gt;op_attr.ia_valid;
                                op_data-&amp;gt;op_attr.ia_valid &amp;amp;= ~TIMES_SET_FLAGS;
+                               op_data-&amp;gt;op_attr.ia_valid &amp;amp;= ~ATTR_MODE;
                                rc = simple_setattr(dentry, &amp;amp;op_data-&amp;gt;op_attr);
                                op_data-&amp;gt;op_attr.ia_valid = ia_valid;
                        }
+#endif
                } &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (rc != -EPERM &amp;amp;&amp;amp; rc != -EACCES &amp;amp;&amp;amp; rc != -ETXTBSY) {
                        CERROR(&lt;span class=&quot;code-quote&quot;&gt;&quot;md_setattr fails: rc = %d\n&quot;&lt;/span&gt;, rc);
                }
@@ -1520,6 +1523,14 @@ &lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ll_md_setattr(struct dentry *dentry, struct md_op_data *o
                 RETURN(rc);
         }
 
+        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (md.body-&amp;gt;mbo_valid &amp;amp; OBD_MD_FLTYPE)
+                LASSERTF((inode-&amp;gt;i_mode &amp;amp; S_IFMT) == (md.body-&amp;gt;mbo_mode &amp;amp; S_IFMT),
+                         &lt;span class=&quot;code-quote&quot;&gt;&quot;mode changed: %o -&amp;gt; %o, ia_valid = %x, mode = %o,&quot;&lt;/span&gt;
+                        &lt;span class=&quot;code-quote&quot;&gt;&quot; FID = &quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot;/&quot;&lt;/span&gt;DFID&lt;span class=&quot;code-quote&quot;&gt;&quot; \n&quot;&lt;/span&gt;,
+                        inode-&amp;gt;i_mode &amp;amp; S_IFMT, md.body-&amp;gt;mbo_mode &amp;amp; S_IFMT,
+                        ia_valid, op_data-&amp;gt;op_attr.ia_mode,
+                        PFID(ll_inode2fid(inode)), PFID(&amp;amp;md.body-&amp;gt;mbo_fid1));
+
        ia_valid = op_data-&amp;gt;op_attr.ia_valid;
        /* inode size will be in ll_setattr_ost, can&apos;t &lt;span class=&quot;code-keyword&quot;&gt;do&lt;/span&gt; it now since dirty
         * cache is not cleared yet. */
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and this is what I got:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 26785:0:(llite_lib.c:1543:ll_md_setattr()) ASSERTION( (inode-&amp;gt;i_mode &amp;amp; S_IFMT) == (md.body-&amp;gt;mbo_mode &amp;amp; S_IFMT) ) failed: mode changed: 40000 -&amp;gt; 100000, ia_valid = 10000046, mode = 0, FID = [0x200000403:0x8f5:0x0]/[0x200000403:0x8f5:0x0] 
LustreError: 26785:0:(llite_lib.c:1543:ll_md_setattr()) LBUG
Pid: 26785, comm: chown

Call Trace:
 [&amp;lt;ffffffffa0483895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa0483e97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
 [&amp;lt;ffffffffa0eedc04&amp;gt;] ll_setattr_raw+0x1304/0x1c60 [lustre]
 [&amp;lt;ffffffffa0eee5c5&amp;gt;] ll_setattr+0x65/0xd0 [lustre]
 [&amp;lt;ffffffff8119ead8&amp;gt;] notify_change+0x168/0x340
 [&amp;lt;ffffffff81192a72&amp;gt;] ? user_path_at+0x62/0xa0
 [&amp;lt;ffffffff8117e94e&amp;gt;] chown_common+0x6e/0x90
 [&amp;lt;ffffffff8117ec96&amp;gt;] sys_fchownat+0x96/0xb0
 [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
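
&lt;p&gt;The two octal values in the assertion message are the S_IFMT bits of the client inode and of the MDT reply, in that order. A small helper, written here only as an illustration (file_type_name and describe_type_change are hypothetical names, not Lustre functions), decodes them using the standard sys/stat.h constants:&lt;/p&gt;

<![CDATA[
```c
#define _XOPEN_SOURCE 700 /* expose the S_IF* constants under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Map the S_IFMT bits of a mode to a short type name. */
const char *file_type_name(unsigned int mode)
{
        switch (mode & S_IFMT) {
        case S_IFREG:  return "regular file";
        case S_IFDIR:  return "directory";
        case S_IFLNK:  return "symlink";
        case S_IFSOCK: return "socket";
        case S_IFIFO:  return "fifo";
        case S_IFCHR:  return "char device";
        case S_IFBLK:  return "block device";
        default:       return "unknown";
        }
}

/* Format "client type -> MDT type" into buf; returns snprintf's result. */
int describe_type_change(unsigned int client_mode, unsigned int mdt_mode,
                         char *buf, size_t len)
{
        assert(buf != NULL);
        return snprintf(buf, len, "%s -> %s",
                        file_type_name(client_mode),
                        file_type_name(mdt_mode));
}
```
]]>

&lt;p&gt;Calling describe_type_change(040000, 0100000, buf, sizeof(buf)) with the values from the log yields a directory on the client side and a regular file in the MDT reply, for the same FID.&lt;/p&gt;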

&lt;p&gt;It clearly shows that the client and the MDT disagreed about the type of the file. This needs further investigation.&lt;/p&gt;</comment>
                            <comment id="103170" author="gerrit" created="Mon, 12 Jan 2015 07:52:35 +0000"  >&lt;p&gt;Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13344&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13344&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6085&quot; title=&quot;racer stuck on mutex_lock in ll_setattr_raw()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6085&quot;&gt;&lt;del&gt;LU-6085&lt;/del&gt;&lt;/a&gt; mdt: return valid attribute only to client&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 39ea3db98654d09b926c3166afcf42c5076608d2&lt;/p&gt;</comment>
                            <comment id="103182" author="vinayak_clogeny" created="Mon, 12 Jan 2015 14:27:28 +0000"  >&lt;p&gt;Hi Jinshan,&lt;/p&gt;

&lt;p&gt;I did not understand how the patch &lt;a href=&quot;http://review.whamcloud.com/13344&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13344&lt;/a&gt; will resolve this issue.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Call Trace:
 [&amp;lt;ffffffffa077c842&amp;gt;] ? __req_capsule_get+0x162/0x6d0 [ptlrpc]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If I am not mistaken, the fix you introduced only comes into play after the reply is received, but the call trace suggests that the client is stuck waiting for the request.&lt;/p&gt;

&lt;p&gt;Please give me some information or correct me if I am wrong.&lt;/p&gt;</comment>
                            <comment id="103659" author="jay" created="Thu, 15 Jan 2015 20:41:48 +0000"  >&lt;p&gt;I think the patch is correct. The call trace shows that the process was waiting for the inode mutex, not for an RPC request to finish.&lt;/p&gt;</comment>
                            <comment id="103712" author="gerrit" created="Fri, 16 Jan 2015 03:26:16 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/13344/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13344/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-6085&quot; title=&quot;racer stuck on mutex_lock in ll_setattr_raw()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-6085&quot;&gt;&lt;del&gt;LU-6085&lt;/del&gt;&lt;/a&gt; mdt: return valid attribute only to client&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 0e638c2fb9dd45323ccff0d9fb43f378faf6127f&lt;/p&gt;</comment>
                            <comment id="103727" author="yujian" created="Fri, 16 Jan 2015 04:19:04 +0000"  >&lt;p&gt;Hi Jinshan,&lt;/p&gt;

&lt;p&gt;While back-porting the patch to the Lustre b2_5 branch, I hit some conflicts. Could you please create a patch on b2_5 to resolve the similar failure in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5968&quot; title=&quot;racer test 1: mv operation hung&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5968&quot;&gt;&lt;del&gt;LU-5968&lt;/del&gt;&lt;/a&gt;? Thank you.&lt;/p&gt;</comment>
                            <comment id="103763" author="jlevi" created="Fri, 16 Jan 2015 18:11:07 +0000"  >&lt;p&gt;Patch landed to Master.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="25407">LU-5285</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="28074">LU-6088</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="27762">LU-5968</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzx3a7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>16934</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>