<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:39:31 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4084] ll_inode_revalidate_fini()) failure -13</title>
                <link>https://jira.whamcloud.com/browse/LU-4084</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Our Lustre client reported the following errors continuously:&lt;/p&gt;

&lt;p&gt;Oct 10 16:06:13 avmmst1a kernel: Lustre: Mounted modelfs-client&lt;br/&gt;
Oct 10 16:06:13 avmmst1a kernel: Lustre: Mounted preposfs-client&lt;br/&gt;
Oct 10 16:06:18 avmmst1a kernel: LustreError: 54029:0:(mdc_locks.c:736:mdc_enqueue()) ldlm_cli_enqueue: -13&lt;br/&gt;
Oct 10 16:06:18 avmmst1a kernel: LustreError: 54029:0:(file.c:2196:ll_inode_revalidate_fini()) failure -13 inode 452984833&lt;br/&gt;
Oct 10 16:06:25 avmmst1a kernel: LustreError: 54029:0:(mdc_locks.c:736:mdc_enqueue()) ldlm_cli_enqueue: -13&lt;br/&gt;
Oct 10 16:06:25 avmmst1a kernel: LustreError: 54029:0:(file.c:2196:ll_inode_revalidate_fini()) failure -13 inode 159383553&lt;/p&gt;

&lt;p&gt;We&apos;ve searched the web but found no hints. What could be the cause of this? Thanks.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Patrick&lt;/p&gt;</description>
                <environment>Lustre 2.1.5 on CentOS 6.3</environment>
        <key id="21347">LU-4084</key>
            <summary>ll_inode_revalidate_fini()) failure -13</summary>
                <type id="6" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11315&amp;avatarType=issuetype">Story</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="6">Not a Bug</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="ctcychan">Patrick Chan</reporter>
                        <labels>
                    </labels>
                <created>Thu, 10 Oct 2013 08:15:25 +0000</created>
                <updated>Sat, 16 Nov 2013 14:26:40 +0000</updated>
                            <resolved>Sat, 16 Nov 2013 14:26:40 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="68740" author="laisiyao" created="Thu, 10 Oct 2013 13:43:10 +0000"  >&lt;p&gt;Could you check the messages on the MDS? Is a similar error reported there?&lt;/p&gt;</comment>
                            <comment id="68764" author="bfaccini" created="Thu, 10 Oct 2013 19:57:06 +0000"  >&lt;p&gt;Could you also check whether the clients and servers share the same UID/GID databases?&lt;/p&gt;</comment>
                            <comment id="68785" author="ctcychan" created="Fri, 11 Oct 2013 01:12:14 +0000"  >&lt;p&gt;Lai Siyao,&lt;/p&gt;

&lt;p&gt;There are no logs on the MDS.&lt;/p&gt;


&lt;p&gt;Bruno,&lt;/p&gt;

&lt;p&gt;The MDS is also a NIS server; both the servers and the clients use the same NIS server.&lt;/p&gt;

&lt;p&gt;Patrick&lt;/p&gt;</comment>
                            <comment id="68825" author="laisiyao" created="Fri, 11 Oct 2013 14:33:55 +0000"  >&lt;p&gt;The next time you see this error, could you run `lctl dk` on the MDS to dump the debug logs right after it appears?&lt;/p&gt;</comment>
                            <comment id="68955" author="ctcychan" created="Tue, 15 Oct 2013 01:39:03 +0000"  >&lt;p&gt;I ran &apos;lctl dk&apos; on the MDS; the file lustre_debug.log.gz has been uploaded to ftp.whamcloud.com/uploads.&lt;/p&gt;

&lt;p&gt;By the way, I tried to find out whether the culprit is inode 159383553. I performed a full scan for that inode, but couldn&apos;t find any matching file in the whole Lustre file system:&lt;/p&gt;

&lt;p&gt;client&amp;gt;  find /lustre_mnt_point -inum 159383553&lt;/p&gt;</comment>
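<!--
The scan described in the comment above can be reproduced with a short shell sketch: it creates a throwaway file, reads its inode number with `ls -i`, and locates it again with `find -inum`. The temporary file and paths here are illustrative only; on a real Lustre client the scan would start at the mount point (as in the comment) and can take a long time on a large namespace.

```shell
# Minimal sketch of mapping an inode number back to a path name.
# Assumption: GNU coreutils/findutils are available; on Lustre, start
# the scan at the client mount point instead of a temp directory.
f=$(mktemp)                                # stand-in for the mystery file
ino=$(ls -i "$f" | awk '{print $1}')       # its inode number, e.g. 159383553
find "$(dirname "$f")" -xdev -inum "$ino"  # prints the matching path(s)
rm -f "$f"
```

`-xdev` keeps `find` from descending into other mounted filesystems, where the same inode number could refer to an unrelated file.
-->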
                            <comment id="68967" author="laisiyao" created="Tue, 15 Oct 2013 07:25:57 +0000"  >&lt;p&gt;I don&apos;t see any error messages in this log, which means -13 (-EACCES) did not come from the disk filesystem, because mdt_getattr_internal() prints an error when fetching attributes from disk fails. Did you dump the debug logs right after you saw this failure? The debug log size is limited, so it only contains the most recent messages.&lt;/p&gt;

&lt;p&gt;To investigate further, could you enable more debugging on the MDS with `lctl set_param debug=+inode` and `lctl set_param debug=+trace`, which enable debugging of inode access and function tracing? You can use `lctl get_param debug_mb` and `lctl set_param debug_mb=&amp;lt;debug_size&amp;gt;` to check and increase the debug memory size.&lt;/p&gt;

&lt;p&gt;Dumping the debug logs on the client may also help you find the file name.&lt;/p&gt;</comment>
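<!--
The MDS-side debug setup suggested in the comment above can be collected into one short sequence. The lctl parameter names are as given there; the 512 value and the dump path are example choices, not prescribed by the ticket.

```shell
# Sketch of the suggested debug setup (run on the MDS as root).
lctl set_param debug=+inode     # add inode-access debugging
lctl set_param debug=+trace     # add function tracing
lctl get_param debug_mb         # show current debug buffer size (MB)
lctl set_param debug_mb=512     # example: grow the buffer to 512 MB
lctl dk /tmp/lustre_debug.log   # dump and clear the debug buffer
```

Because the buffer is a ring, the dump should be taken as soon as possible after the error is seen, or scripted, so the relevant lines are not overwritten.
-->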
                            <comment id="68971" author="ctcychan" created="Tue, 15 Oct 2013 07:58:30 +0000"  >&lt;p&gt;The &apos;ll_inode_revalidate_fini())&apos; failure message appears only occasionally, so I need to watch /var/log/messages on the client and capture the debug buffer on the MDS promptly.&lt;/p&gt;

&lt;p&gt;The log file lustre_debug2.log.gz has been uploaded. This is the debug buffer captured about 10 seconds after the &apos;ll_inode_revalidate_fini()) failure&apos; appeared on the client.&lt;/p&gt;

&lt;p&gt;As you suggested, I&apos;ve added inode and trace to the debug parameter, and the debug buffer size has been increased to 512 MB.&lt;/p&gt;</comment>
                            <comment id="68980" author="ctcychan" created="Tue, 15 Oct 2013 13:49:27 +0000"  >&lt;p&gt;We&apos;ve just discovered that both inodes mentioned in the errors (159383553 &amp;amp; 452984833) belong to the top-level directory of the mount point.&lt;/p&gt;</comment>
                            <comment id="69086" author="laisiyao" created="Wed, 16 Oct 2013 08:28:02 +0000"  >&lt;p&gt;It would be better to do this in a script if the system is busy, in case the old logs get discarded.&lt;/p&gt;</comment>
                            <comment id="69087" author="ctcychan" created="Wed, 16 Oct 2013 08:35:15 +0000"  >&lt;p&gt;We&apos;ve solved the problem.&lt;/p&gt;

&lt;p&gt;The error appears whenever Nagios periodically runs the check_disk plugin on the Lustre client to check the disk capacity of the Lustre file system.&lt;/p&gt;

&lt;p&gt;However, the nagios user only existed on the master node (the Lustre client); the MDS had no such user.&lt;/p&gt;

&lt;p&gt;After adding a local nagios user on the MDS, the error no longer appears.&lt;/p&gt;</comment>
                            <comment id="71718" author="pjones" created="Sat, 16 Nov 2013 14:26:40 +0000"  >&lt;p&gt;That&apos;s great! Thanks for letting us know.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw5b3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10976</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>