<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:56:45 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12915] Add a debugfs variable listing fids with discarded dirty pages</title>
                <link>https://jira.whamcloud.com/browse/LU-12915</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Pages that are dirty at the time of a client eviction are discarded, which may result in file corruption. Currently, the Lustre client issues a warning message to the console log that identifies the fid of the file containing the discarded dirty page. It also tries to include the name of the file in the warning message. There are problems with this scheme:&lt;br/&gt;
 1. Trying to get the file name can cause a deadlock. See &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12522&quot; title=&quot;Deadlock: ptlrpcd daemon blocked in osc_extent_wait&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12522&quot;&gt;LU-12522&lt;/a&gt;.&lt;br/&gt;
 2. The console log is not accessible to application users.&lt;br/&gt;
 3. The message has no link to the job or application that was affected by the warning.&lt;/p&gt;

&lt;p&gt;The proposal here is to add a debugfs variable (/sys/kernel/debug/lustre/llite.&amp;lt;fs&amp;gt;.discard_list) that lists the fids of files with discarded dirty pages. When a dirty page is discarded, the fid, jobid, and inode of the current file are added to the list.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@vmcentos7 lustre-ffff8cf1aa3a4000]# lctl get_param llite.*.discard_list
llite.lustre-ffff8cf1aa3a4000.discard_list=
Jobs with discarded dirty pages:timestamp, jobid, filesystem, fid, discarded page count, filename
1572294777.000402500  dmesg.0                           lustre    [0x200000401:0x4:0x0]  256  amk/testing/discards1
1572294777.000404336  dmesg.0                           lustre    [0x200000401:0x8:0x0]  256  amk/testing/discards5
1572294777.000406094  dmesg.0                           lustre    [0x200000401:0xc:0x0]  256  amk/testing/discards9
1572294777.000407841  dmesg.0                           lustre    [0x200000401:0x10:0x0]  256  amk/testing/discards13
1572294777.000409620  dmesg.0                           lustre    [0x200000401:0x14:0x0]  256  amk/testing/discards17
1572294777.000411458  dmesg.0                           lustre    [0x200000401:0x6:0x0]  256  amk/testing/discards3
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
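
&lt;p&gt;As a rough illustration of how a script might consume this output, the following Python sketch parses text in the layout shown above. It assumes the six whitespace-separated columns named in the header line, and that only the file name may contain spaces. The names DiscardEntry, parse_discard_list, and entries_for_job are hypothetical and are not part of Lustre or the patch.&lt;/p&gt;

```python
from collections import namedtuple

# Field order follows the header line in the example output above.
# All names in this sketch are hypothetical, not part of Lustre.
DiscardEntry = namedtuple(
    "DiscardEntry", "timestamp jobid filesystem fid pages filename"
)


def parse_discard_list(text):
    """Parse discard_list text into a list of DiscardEntry records."""
    entries = []
    for line in text.splitlines():
        # Split into at most six fields so a file name with spaces stays whole.
        fields = line.strip().split(None, 5)
        # Data rows start with a numeric timestamp; the parameter name line
        # and the header line do not, so they are skipped here.
        if len(fields) != 6 or not fields[0].replace(".", "", 1).isdigit():
            continue
        timestamp, jobid, filesystem, fid, pages, filename = fields
        if not pages.isdigit():
            continue
        entries.append(
            DiscardEntry(timestamp, jobid, filesystem, fid, int(pages), filename)
        )
    return entries


def entries_for_job(entries, jobid):
    """Return only the entries recorded for the given jobid."""
    return [entry for entry in entries if entry.jobid == jobid]
```

&lt;p&gt;A job epilog could feed the output of lctl get_param to parse_discard_list and then call entries_for_job with the finished job's jobid to decide whether the user needs to be notified.&lt;/p&gt;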

&lt;p&gt;An admin script, a job manager, or eventually an application user's script can examine the discard list on each client at the end of a job to determine whether any dirty pages were lost and inform the user.&lt;/p&gt;

&lt;p&gt;Note that the implementation limits the number of fids reported to 128 per file system. If more files than that have discarded dirty pages, the oldest entries in the discard_list are reused. A discard_list can be cleared by writing anything to the debugfs variable (for example, lctl set_param llite.*.discard_list=clear).&lt;/p&gt;
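
&lt;p&gt;The fixed-size behavior can be pictured with a small user-space model. This is only a sketch of the semantics described above (a 128-entry limit, oldest entries reused, explicit clearing). It is not the kernel code from the patch, it does not model per-file page counts, and the class and method names are hypothetical.&lt;/p&gt;

```python
from collections import deque

MAX_ENTRIES = 128  # per-file-system limit stated in the description


class DiscardList:
    """Toy model of a fixed-size list that reuses its oldest slot when full."""

    def __init__(self, max_entries=MAX_ENTRIES):
        # A deque with maxlen drops its oldest item when a new one
        # is appended to a full deque.
        self._entries = deque(maxlen=max_entries)

    def record(self, fid):
        """Add a fid; once the list is full, the oldest entry is replaced."""
        self._entries.append(fid)

    def clear(self):
        """Empty the list, as writing to the variable is described to do."""
        self._entries.clear()

    def fids(self):
        """Return the recorded fids, oldest first."""
        return list(self._entries)
```

&lt;p&gt;In this model, recording 130 fids leaves only the most recent 128 in the list, which is why a reader of discard_list cannot assume that every affected file is reported.&lt;/p&gt;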

&lt;p&gt;The warning message will still be issued to the console log when a dirty page is discarded. The message will now only contain the fid; no attempt will be made to fetch the file name. Thus &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12522&quot; title=&quot;Deadlock: ptlrpcd daemon blocked in osc_extent_wait&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12522&quot;&gt;LU-12522&lt;/a&gt; is resolved.&lt;/p&gt;

&lt;p&gt;Identified drawbacks of the design include:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;A user still needs root privileges to access the list of files with discarded pages. When a UI for multiple line output is defined, discard_list could be moved to /sys/fs/lustre where permissions can be set more flexibly.&lt;/li&gt;
	&lt;li&gt;Output does not display the mount point in the file name. (Existing functions to retrieve the mount point are written to run in user, rather than kernel, space. The mount point can easily be identified and added to the discard_list info using Linux utilities.)&lt;/li&gt;
	&lt;li&gt;The file name will not be displayed if the file is deleted before discard_list is read.&lt;/li&gt;
	&lt;li&gt;Lustre must be manually directed to clear the discard lists.&lt;/li&gt;
	&lt;li&gt;A discard_list has a fixed size, so if the maximum is exceeded, not all files with discarded dirty pages will be reported in the list.&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="57264">LU-12915</key>
            <summary>Add a debugfs variable listing fids with discarded dirty pages</summary>
                <type id="2" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11311&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="amk">Ann Koehler</reporter>
                        <labels>
                    </labels>
                <created>Tue, 29 Oct 2019 19:55:33 +0000</created>
                <updated>Mon, 1 May 2023 16:47:52 +0000</updated>
                                                                                <due></due>
                            <votes>0</votes>
                                    <watches>1</watches>
                                                                            <comments>
                            <comment id="257290" author="amk" created="Tue, 29 Oct 2019 20:11:59 +0000"  >&lt;p&gt;Patch:  &lt;a href=&quot;https://review.whamcloud.com/#/c/36607/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/36607/&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="56293">LU-12522</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00opr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>