<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:29:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16766] Combine some kernel process names for jobid</title>
                <link>https://jira.whamcloud.com/browse/LU-16766</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Reduce long kernel thread names like &quot;&lt;tt&gt;kworker/CPU:ID&lt;/tt&gt;&quot; to just &quot;&lt;tt&gt;kworker&lt;/tt&gt;&quot;, and &quot;&lt;tt&gt;ll_sa_PID&lt;/tt&gt;&quot; to &quot;&lt;tt&gt;ll_sa&lt;/tt&gt;&quot;, since the full kernel thread ID is less useful than aggregating these threads under a single process name in the stats. There may be other similar kernel thread names that should be abbreviated.&lt;/p&gt;

&lt;p&gt;Also, statahead and similar Lustre threads that generate RPCs on behalf of user processes should be properly accounted to the user/client.&lt;/p&gt;</description>
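<!--
Illustrative sketch, not part of the JIRA export and not the code from the Gerrit change: the trimming described above could look roughly like the following. The function name trim_kthread_name and the two patterns are assumptions made for illustration, based only on the "kworker/CPU:ID" and "ll_sa_PID" examples in the description. The landed patch lives in obdclass (per its subject line) and may differ.

```python
import re

# Hypothetical (pattern, short name) pairs. Only the two thread name
# families mentioned in the issue description are covered here.
_KTHREAD_PATTERNS = [
    (re.compile(r"^kworker/"), "kworker"),   # e.g. "kworker/3:1"
    (re.compile(r"^ll_sa_\d+$"), "ll_sa"),   # e.g. "ll_sa_12345"
]


def trim_kthread_name(comm: str) -> str:
    """Return the aggregated name for a known kernel thread name.

    Names that match none of the patterns are returned unchanged.
    """
    for pattern, short_name in _KTHREAD_PATTERNS:
        if pattern.match(comm):
            return short_name
    return comm
```

For example, trim_kthread_name("kworker/3:1") gives "kworker", while a user process name such as "bash" is returned unchanged.
-->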
                <environment></environment>
        <key id="75699">LU-16766</key>
            <summary>Combine some kernel process names for jobid</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bertschinger">Thomas Bertschinger</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                            <label>lug23dd</label>
                    </labels>
                <created>Mon, 24 Apr 2023 19:21:45 +0000</created>
                <updated>Wed, 7 Feb 2024 19:26:38 +0000</updated>
                            <resolved>Thu, 31 Aug 2023 15:11:39 +0000</resolved>
                                    <version>Lustre 2.16.0</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="382038" author="gerrit" created="Thu, 10 Aug 2023 18:48:57 +0000"  >&lt;p&gt;&quot;Thomas Bertschinger &amp;lt;bertschinger@lanl.gov&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/51919&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/51919&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16766&quot; title=&quot;Combine some kernel process names for jobid&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16766&quot;&gt;&lt;del&gt;LU-16766&lt;/del&gt;&lt;/a&gt; obdclass: trim kernel thread names in jobids&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ce986c8cd5a7ef53819a82cf4d38fdfd4f532a4e&lt;/p&gt;</comment>
                            <comment id="382043" author="JIRAUSER18444" created="Thu, 10 Aug 2023 19:03:22 +0000"  >&lt;p&gt;I&apos;ve uploaded a patch in progress for this but wanted to ask some design questions that may be broader than the patch I&apos;ve submitted for this issue.&lt;/p&gt;

&lt;p&gt;First, the description here has:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;statahead and similar Lustre threads ... should be properly accounted to the user/client.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It looks like &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16781&quot; title=&quot;account jobids to original process&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16781&quot;&gt;LU-16781&lt;/a&gt; is for this issue so I think this can be handled with a patch on that ticket.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16781&quot; title=&quot;account jobids to original process&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16781&quot;&gt;LU-16781&lt;/a&gt; also says:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;For jobid_name_is_valid() we may consider to reduce the exclusions for kernel processes. Firstly, this exclusion doesn&apos;t always work properly, since kworker and ll_sa tasks still show up in the server stats. Secondly, this hides the real presence of RPCs sent to the server, so job_stats are not showing the full picture of what is generating the load&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I have some questions about the reasoning behind jobid_name_is_valid(), which could be relevant to this patch.&lt;/p&gt;

&lt;p&gt;It looks like that check is only called if obd_jobid_var refers to an actual environment var and not one of the special values (nodelocal, session, procname_uid). Why is this the only case where the exclusion matters?&lt;/p&gt;

&lt;p&gt;To me it seems that the decision of whether a kernel thread should be included in jobstats is independent of whether the jobid setting is an env var, per-session var, or anything else. But I may not have the full picture here.&lt;/p&gt;

&lt;p&gt;Can you clarify what the intent of the exclusion is?&lt;/p&gt;

&lt;p&gt;If the purpose of the exclusion is just to avoid checking the process&apos;s environment when the process is a kernel thread, would it be more direct to replace the check against a hardcoded list of names with a check for &lt;tt&gt;current-&amp;gt;flags &amp;amp; PF_KTHREAD&lt;/tt&gt;? Or would that be overly broad?&lt;/p&gt;</comment>
                            <comment id="382048" author="adilger" created="Thu, 10 Aug 2023 20:38:20 +0000"  >&lt;p&gt;The purpose of &lt;tt&gt;jobid_name_is_valid()&lt;/tt&gt; is to avoid using the procname/environment from those threads when generating the jobid, and instead get this information from the inode that is being processed. Also, there are some &quot;housekeeping&quot; RPCs like pings that are excluded since they might otherwise flood the server logs. I &lt;em&gt;suspect&lt;/em&gt; there are still a couple of bugs in how the jobid name is generated, and we &lt;em&gt;should&lt;/em&gt; be using the application process jobid that was stored in the file inode for the kernel threads to use, but somehow this is not happening correctly in all cases.&lt;/p&gt;

&lt;p&gt;I think &lt;tt&gt;PF_KTHREAD&lt;/tt&gt; is overly broad to deny generating &lt;b&gt;any&lt;/b&gt; jobid for an RPC, since ptlrpcd is a kernel thread and it is generating many client RPCs.  However, it may be that &lt;tt&gt;PF_KTHREAD&lt;/tt&gt; is a good indicator that we shouldn&apos;t be generating the jobid from the current thread, but rather from the file of interest in the RPC...&lt;/p&gt;

&lt;p&gt;This is unfortunately a bit vague, since it has been some time since I was debugging this code. It might be possible to run with &quot;&lt;tt&gt;+rpctrace&lt;/tt&gt;&quot; debugging enabled on the client and see what RPCs are being generated by &lt;tt&gt;kworker&lt;/tt&gt; and what they can use to generate a better jobid for the RPC.&lt;/p&gt;</comment>
                            <comment id="384341" author="gerrit" created="Thu, 31 Aug 2023 06:38:07 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/51919/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/51919/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16766&quot; title=&quot;Combine some kernel process names for jobid&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16766&quot;&gt;&lt;del&gt;LU-16766&lt;/del&gt;&lt;/a&gt; obdclass: trim kernel thread names in jobids&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8a9c503c002d08d0587894a748761e30c1b9a445&lt;/p&gt;</comment>
                            <comment id="384427" author="pjones" created="Thu, 31 Aug 2023 15:11:39 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                            <comment id="402604" author="gerrit" created="Mon, 5 Feb 2024 02:43:21 +0000"  >&lt;p&gt;&quot;Andreas Dilger &amp;lt;adilger@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/53904&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/53904&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16766&quot; title=&quot;Combine some kernel process names for jobid&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16766&quot;&gt;&lt;del&gt;LU-16766&lt;/del&gt;&lt;/a&gt; llite: fix JobID for readahead threads&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: c213a61f4a9d1e89fa0fd15fdf85a1954c65b22a&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="75698">LU-16765</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="75700">LU-16767</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="75846">LU-16781</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="80702">LU-17512</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03jp3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>