<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:48:32 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11970] Using changelog reader causes fid2path process to lockup in kernel space </title>
                <link>https://jira.whamcloud.com/browse/LU-11970</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We are evaluating Starfish for file system usage detail and have successfully scanned numerous Lustre and NFS file systems. However, when we start ingesting data via changelogs, we run into a condition where it hangs, and the only resolution is to power cycle the client we are testing with. &lt;/p&gt;

&lt;p&gt;In trying to identify the problem, we found that when we tried to unmount the file system, that process would also hang and could not be aborted. &lt;/p&gt;

&lt;p&gt;Running lsof on the file system showed the hung process was stuck on /&amp;lt;file system&amp;gt;/.lustre/fid. Up until this point, we didn&apos;t know that the hidden directory existed, or what its purpose was. From scanning Jira, it is involved in Lustre rsync and lfsck operations, but there is not a lot of information regarding other roles it plays.&lt;/p&gt;

&lt;p&gt;One thing is certain: Starfish uses FIDs in their monitoring tools, and we can see that .lustre/fid is being identified by the Starfish process.&lt;/p&gt;

&lt;p&gt;We&apos;re hoping we can get some additional information on what&apos;s going on with changelogs/.lustre.&lt;/p&gt;</description>
                <environment>Dell servers running TOSS (RHEL 7.5) IB connected to DDN SFA hardware</environment>
        <key id="54891">LU-11970</key>
            <summary>Using changelog reader causes fid2path process to lockup in kernel space </summary>
                <type id="9" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/undefined.png">Question/Request</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="jamervi">Joe Mervini</reporter>
                        <labels>
                    </labels>
                <created>Thu, 14 Feb 2019 18:45:26 +0000</created>
                <updated>Sat, 23 Mar 2019 14:49:09 +0000</updated>
                            <resolved>Sat, 23 Mar 2019 14:48:38 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="242003" author="pjones" created="Thu, 14 Feb 2019 20:37:19 +0000"  >&lt;p&gt;Mike&lt;/p&gt;

&lt;p&gt;Can you advise please?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="242004" author="pjones" created="Thu, 14 Feb 2019 20:37:52 +0000"  >&lt;p&gt;Joe&lt;/p&gt;

&lt;p&gt;Is this the Astra system?&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="242010" author="jamervi" created="Thu, 14 Feb 2019 21:35:19 +0000"  >&lt;p&gt;Peter&lt;/p&gt;

&lt;p&gt;No - this is on our regular production clusters. We&apos;re seeing this behavior on all three of the file systems.&#160;&lt;/p&gt;

&lt;p&gt;One thing we&apos;re curious about is why Lustre is spitting out FIDs to a directory that is essentially unreadable. One clue: we had the system locked up on a FID that was not in the .lustre directory (although the .lustre directory was still being held by a process identified with lsof), but once the system was rebooted, using fid2path on that FID produced a &apos;no such file or directory&apos; message.&lt;/p&gt;

&lt;p&gt;When a file is deleted, does it have any interaction with the .lustre directory? If so, could it be a race condition? Another question: is there any client-side read operation that would cause a changelog change? &#160;&lt;/p&gt;</comment>
                            <comment id="242273" author="jamervi" created="Tue, 19 Feb 2019 18:29:32 +0000"  >&lt;p&gt;Peter,&lt;/p&gt;

&lt;p&gt;Has there been any activity on this ticket?&lt;/p&gt;

&lt;p&gt;Regards,&lt;/p&gt;

&lt;p&gt;Joe&lt;/p&gt;</comment>
                            <comment id="242339" author="tappro" created="Wed, 20 Feb 2019 14:31:39 +0000"  >&lt;p&gt;Joe, I am looking into this. Since this concerns Lustre 2.8.0, I am checking tickets that may be related to this problem; it has probably been addressed already.&lt;/p&gt;</comment>
                            <comment id="242515" author="tappro" created="Fri, 22 Feb 2019 14:15:10 +0000"  >&lt;p&gt;This can be resolved by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8821&quot; title=&quot;double find in mdt_path_current()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8821&quot;&gt;&lt;del&gt;LU-8821&lt;/del&gt;&lt;/a&gt;; there was a potential deadlock in the code. The patch in that ticket can be updated for 2.8 if needed.&lt;/p&gt;

&lt;p&gt;As for the other questions above: the /.lustre/fid directory provides access to a file by its FID and is often used to get the paths to that file from its LinkEA attribute, which is exactly what fid2path does. If you know the FID of an object, you can access and modify it but cannot delete it.&lt;br/&gt;
The only interaction with deleted files I can think of is that unlinked files which are still open cannot be found by FID: you&apos;ll get a &apos;no such file ..&apos; message while the file is still being used by some process. See &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11638&quot; title=&quot;lfs fid2path should list open-unlinked files&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11638&quot;&gt;LU-11638&lt;/a&gt; for details.&lt;br/&gt;
As for read operations logged in the changelog, I assume you meant &apos;non-modification&apos; operations. Yes, we have at least CL_GETXATTR and also CL_DN_OPEN, which are non-modification operations but can be recorded in the changelog. CL_OPEN can also be enabled to track OPENs.&lt;/p&gt;</comment>
                            <comment id="244586" author="pjones" created="Sat, 23 Mar 2019 14:48:38 +0000"  >&lt;p&gt;Believed to be a duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8821&quot; title=&quot;double find in mdt_path_current()&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8821&quot;&gt;&lt;del&gt;LU-8821&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="41466">LU-8821</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="53572">LU-11501</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Lustre-2.8.0</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00bon:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>