<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:34:30 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-17323] fork() leaks ERESTARTNOINTR (errno 513) to user application</title>
                <link>https://jira.whamcloud.com/browse/LU-17323</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When using file locks on a Lustre mount with the &apos;flock&apos; mount option, fork()&lt;br/&gt;
can leak ERESTARTNOINTR to a user application. &#160;The fork() system call checks&lt;br/&gt;
if a signal is pending, and if so, cleans up everything it did and returns&#160;&lt;br/&gt;
ERESTARTNOINTR. &#160;The kernel then transparently restarts the fork() from scratch;&lt;br/&gt;
the user application is never supposed to see the ERESTARTNOINTR errno.&lt;/p&gt;

&lt;p&gt;The fork() cleanup code calls exit_files(), which calls into Lustre code. &#160;I&apos;m not&lt;br/&gt;
positive what the problem is at a low level. &#160;It may be that the Lustre code&lt;br/&gt;
clears the TIF_SIGPENDING flag, which prevents the kernel from restarting the&lt;br/&gt;
fork(), so the ERESTARTNOINTR errno leaks to the user application.&lt;/p&gt;
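
&lt;p&gt;For context, errno 513 is one of the kernel-internal restart codes defined in include/linux/errno.h (512 and up), which user-space errno tables do not contain. Below is a minimal Python sketch (not taken from repro.c) of how an application could flag such a leak. The helper names describe_errno and fork_checked are hypothetical, and the table lists only the codes 512 to 516.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import errno
import os

# Kernel-internal codes from include/linux/errno.h.
# They are meant to stay inside the kernel and never reach user space.
KERNEL_INTERNAL_ERRNOS = {
    512: "ERESTARTSYS",
    513: "ERESTARTNOINTR",
    514: "ERESTARTNOHAND",
    515: "ENOIOCTLCMD",
    516: "ERESTART_RESTARTBLOCK",
}


def describe_errno(code):
    """Return (name, is_kernel_internal) for an errno value."""
    if code in KERNEL_INTERNAL_ERRNOS:
        return KERNEL_INTERNAL_ERRNOS[code], True
    # errno.errorcode only knows the user-space values.
    return errno.errorcode.get(code, "UNKNOWN"), False


def fork_checked():
    """Call os.fork() and report a leaked kernel-internal errno (hypothetical helper)."""
    try:
        return os.fork()
    except OSError as exc:
        name, internal = describe_errno(exc.errno)
        if internal:
            print("fork leaked kernel-internal errno", exc.errno, name)
        raise
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this table, describe_errno(513) resolves to ERESTARTNOINTR and marks it as kernel-internal, whereas an ordinary value such as EINTR is resolved through the standard errno module.&lt;/p&gt;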

&lt;p&gt;It seems there have to be multiple threads involved. &#160;My reproducer has two&lt;br/&gt;
threads. Thread 1 does fork() calls in an infinite loop, spawning children&lt;br/&gt;
that exit after a random number of seconds. &#160;Thread 2 sleeps for a random&lt;br/&gt;
number of seconds in an infinite loop. &#160;There is a SIGCHLD handler set up and&lt;br/&gt;
both threads can handle SIGCHLD signals. &#160;The fork() gets interrupted by&lt;br/&gt;
pending SIGCHLD signals from exiting children. &#160;I think thread 2 has to handle&lt;br/&gt;
the SIGCHLD signal for the problem to happen. &#160;If thread 2 has SIGCHLD signals&lt;br/&gt;
blocked, the problem never happens.&lt;/p&gt;
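
&lt;p&gt;The per-thread signal mask mentioned above can be illustrated with a small Python sketch (again, not the attached repro.c): a worker thread blocks SIGCHLD for itself, so the kernel can only pick other threads to take that signal. This only demonstrates the masking and does not reproduce the bug, since CPython always runs Python-level handlers in the main thread. The helper name blocked_in_worker is hypothetical.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import signal
import threading


def blocked_in_worker():
    """Block SIGCHLD in a worker thread and return that thread's signal mask."""
    result = {}

    def worker():
        # The signal mask is per thread: this call affects only the worker.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
        # Blocking an empty set changes nothing and returns the current mask.
        result["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, set())

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["mask"]
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The returned mask contains signal.SIGCHLD, which corresponds to the configuration in which the problem was never observed.&lt;/p&gt;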

&lt;p&gt;The problem doesn&apos;t reproduce with the &apos;localflock&apos; mount option, so we&lt;br/&gt;
believe &apos;localflock&apos; is safe from this issue.&lt;/p&gt;
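
&lt;p&gt;To confirm which locking mode a client mount is using, its options can be read from /proc/mounts. Below is a minimal Python sketch, assuming the Lustre client lists flock, localflock, or noflock among the mount options there. The function name lustre_lock_mode and the sample line in the note below are hypothetical.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
def lustre_lock_mode(mounts_line):
    """Return the flock-related option from one /proc/mounts line.

    Returns None when the line is not a Lustre mount, and "unspecified"
    when none of the three options is listed (assumption: the client
    default then applies).
    """
    # /proc/mounts format: device mountpoint fstype options dump pass
    fields = mounts_line.split()
    if len(fields) != 6 or fields[2] != "lustre":
        return None
    options = fields[3].split(",")
    for mode in ("localflock", "flock", "noflock"):
        if mode in options:
            return mode
    return "unspecified"


def lock_modes(path="/proc/mounts"):
    """Map each Lustre mount point found in the given file to its lock mode."""
    modes = {}
    with open(path) as mounts:
        for line in mounts:
            mode = lustre_lock_mode(line)
            if mode is not None:
                modes[line.split()[1]] = mode
    return modes
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For a made-up line such as &lt;tt&gt;10.0.0.1@tcp:/lustre /lustre_mnt lustre rw,flock,lazystatfs 0 0&lt;/tt&gt;, lustre_lock_mode returns flock.&lt;/p&gt;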


&lt;p&gt;We&apos;ve seen this on RHEL6 and RHEL7/CentOS7 kernels&lt;br/&gt;
with Lustre 2.11.0, 2.12.5 and 2.12.6.&lt;br/&gt;
Lustre 2.12.0 does not reproduce the issue.&lt;/p&gt;


&lt;p&gt;Steps to reproduce:&lt;/p&gt;

&lt;p&gt;1) The Lustre mount must use the &apos;flock&apos; mount option.&lt;br/&gt;
2) Build the reproducer: gcc -o repro ./repro.c -lpthread&lt;br/&gt;
3) Run the reproducer:&lt;/p&gt;

&lt;p&gt;The problem usually reproduces within 5-60 seconds.&lt;br/&gt;
The reproducer runs until the issue occurs;&#160;&lt;br/&gt;
press Ctrl-C to quit.&lt;/p&gt;

&lt;p&gt;&amp;gt; touch /lustre_mnt/testfile.txt&lt;br/&gt;
&amp;gt; ./repro /lustre_mnt/testfile.txt&lt;br/&gt;
Fork returned -1, errno = 513, exiting...&lt;/p&gt;

&lt;p&gt;Use POSIX style read lock&lt;br/&gt;
&amp;gt; ./repro /lustre_mnt/testfile.txt posix&lt;br/&gt;
Fork returned -1, errno = 513, exiting...&lt;/p&gt;

&lt;p&gt;Use BSD style read lock&lt;br/&gt;
&amp;gt; ./repro /lustre_mnt/testfile.txt flock&lt;br/&gt;
Fork returned -1, errno = 513, exiting...&lt;/p&gt;

&lt;p&gt;Don&apos;t lock at all (this won&apos;t reproduce and will run indefinitely)&lt;br/&gt;
&amp;gt; ./repro /lustre_mnt/testfile.txt none&lt;/p&gt;

&lt;p&gt;NOTE: be aware the reproducer can exhaust your maxprocs limit&lt;/p&gt;</description>
                <environment>RHEL6, RHEL7/CentOS7 (various kernels)</environment>
        <key id="79244">LU-17323</key>
            <summary>fork() leaks ERESTARTNOINTR (errno 513) to user application</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="mikedoo4">Mike D</reporter>
                        <labels>
                    </labels>
                <created>Wed, 29 Nov 2023 21:59:25 +0000</created>
                <updated>Wed, 31 Jan 2024 21:41:14 +0000</updated>
                                            <version>Lustre 2.11.0</version>
                    <version>Lustre 2.12.5</version>
                    <version>Lustre 2.12.6</version>
                    <version>Lustre 2.12.9</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="394819" author="paf0186" created="Wed, 29 Nov 2023 22:22:55 +0000"  >&lt;p&gt;Hi Mike,&lt;/p&gt;

&lt;p&gt;I think I know the issue you&apos;re hitting here and its root cause.&#160; Historically, Lustre had to do some nasty things with signal handling due to the lack of the ability to express &quot;waiting&quot; without contributing to load, and we also rolled our own for some waiting primitives that didn&apos;t exist at the time.&#160; This results in some weird behavior with certain signals in some cases.&#160; I saw this with ptrace, but this problem has a very similar feel to it.&lt;/p&gt;

&lt;p&gt;Neil Brown did a thorough rework of signal handling and task waiting in Lustre, spread over a number of patches (if it were just one, I would link it), which I believe landed for 2.13 but I don&apos;t think was ported to 2.12 (it was seen as code cleanup rather than fixing specific bugs).&#160; (I think the fact that you are not hitting the problem with 2.12 is probably a coincidence or a timing change.)&lt;/p&gt;

&lt;p&gt;2.15 is the current maintenance release, so it would be good to see if you can reproduce this with 2.15, which has the full bevy of wait and signal handling changes.&lt;/p&gt;</comment>
                            <comment id="394826" author="adilger" created="Wed, 29 Nov 2023 22:45:16 +0000"  >&lt;p&gt;Mike, thank you for the detailed analysis (including a reproducer!). &lt;/p&gt;

&lt;p&gt;Since RHEL6/7 are basically EOL at this point, this issue would only be of interest if the problem persists in RHEL8/9. We&apos;ve run the full lifetime of EL6/7 without hitting this problem in actual production usage (or at least nothing has been reported to us up to this point).&lt;/p&gt;

&lt;p&gt;I don&apos;t see &lt;tt&gt;ERESTARTNOINTR&lt;/tt&gt; used or returned anywhere in the Lustre code, so this error code is definitely coming from the kernel &lt;tt&gt;fork()&lt;/tt&gt; handling.  There is indeed code in &lt;tt&gt;libcfs/include/libcfs/linux/linux-wait.h&lt;/tt&gt; that is clearing &lt;tt&gt;TIF_SIGPENDING&lt;/tt&gt; in the RPC completion wait routines (&lt;tt&gt;__wait_event_idle()&lt;/tt&gt; or &lt;tt&gt;__wait_event_lifo()&lt;/tt&gt;), which are conditionally used depending on the kernel version in use.  I suspect these routines are clones of similar code from newer kernels just for compatibility use with older kernels, so there may be some variations.&lt;/p&gt;

&lt;p&gt;If the problem still persists with newer kernels and Lustre releases then it would be useful to continue investigation and add the &lt;tt&gt;repro.c&lt;/tt&gt; test case into our regression test suite.&lt;/p&gt;</comment>
                            <comment id="394832" author="paf0186" created="Wed, 29 Nov 2023 22:52:35 +0000"  >&lt;p&gt;Interesting that the signal clearing is in those macros.&#160; Those are the ones &lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=neilb&quot; class=&quot;user-hover&quot; rel=&quot;neilb&quot;&gt;neilb&lt;/a&gt; ported into Lustre to replace our hand-rolled stuff, so maybe the issue isn&apos;t fixed.&#160; Or perhaps Neil knows - should&apos;ve tagged him earlier.&lt;/p&gt;</comment>
                            <comment id="395384" author="JIRAUSER18805" created="Mon, 4 Dec 2023 18:42:29 +0000"  >&lt;p&gt;I plan to try Lustre 2.15 client (assuming that will connect to the 2.12.x server) but it will probably be several weeks before I can try it and report back.&#160; I don&apos;t know if the problem occurs with RHEL8/9 as I don&apos;t have an easy way to test that.&lt;/p&gt;</comment>
                            <comment id="397034" author="JIRAUSER18805" created="Fri, 15 Dec 2023 19:54:23 +0000"  >&lt;p&gt;I tried the latest Lustre 2.15 client and have not been able to reproduce the issue on CentOS7.&#160; However, I did notice a problem (I haven&apos;t investigated it much yet):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;gt; gcc hello.c
&amp;gt; ./a.out
./a.out: Command not found.
&amp;gt; /bin/ls a.out
a.out
&amp;gt; ./a.out
hello world&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The file isn&apos;t there until I do the ls.&#160; This is reproducible every time.&lt;/p&gt;


&lt;p&gt;Is it recommended to use a Lustre 2.15 client with 2.12.x servers?&lt;/p&gt;</comment>
                            <comment id="397045" author="adilger" created="Fri, 15 Dec 2023 20:19:57 +0000"  >&lt;p&gt;Mike, please file your gcc issue in a separate Jira ticket, or it will be lost here. There should be proper interop between 2.15 clients and 2.12 servers.&lt;/p&gt;</comment>
                            <comment id="397047" author="paf0186" created="Fri, 15 Dec 2023 20:42:40 +0000"  >&lt;p&gt;Also, since it&apos;s easy to reproduce, but doesn&apos;t happen in our usual test environments, could you grab client debug logs and attach them to the new LU?&lt;/p&gt;

&lt;p&gt;Something like this would do the trick:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;DEBUGMB=`lctl get_param -n debug_mb`
lctl clear
lctl set_param *debug=-1 debug_mb=10000
lctl mark &quot;running gcc&quot;
gcc hello.c
lctl mark &quot;running a.out&quot;
./a.out
lctl mark &quot;running ls&quot;
/bin/ls a.out
lctl mark &quot;running a.out again&quot;
./a.out
# Write out logs
lctl dk &amp;gt; /tmp/log
# Set debug to minimum, this leaves on error, warning, etc
lctl set_param debug=0
lctl set_param debug_mb=$DEBUGMB &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(You will probably want to compress the log before attaching it.)&lt;/p&gt;</comment>
                            <comment id="398899" author="JIRAUSER18805" created="Mon, 8 Jan 2024 23:24:23 +0000"  >&lt;p&gt;I&apos;ll note that csh vs bash behavior is different.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;csh case (a.out not found until the ls is run)&lt;/li&gt;
&lt;/ul&gt;


&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ gcc hello.c
$ ./a.out
./a.out: Command not found
$ ./a.out
./a.out: Command not found
$ ./a.out
./a.out: Command not found
$ ls a.out
a.out
$ ./a.out
hello world&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
	&lt;li&gt;bash case (a.out works after the bad ELF interpreter error)&lt;/li&gt;
&lt;/ul&gt;


&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ gcc hello.c
$ ./a.out
bash: ./a.out: /lib64/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory
$ ./a.out
hello world&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The main thing I&apos;ve noticed in the client debug log:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;running a.out
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 0, count: 80
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 6456, count: 1984
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 6186, count: 268
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 64, count: 504
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 568, count: 200
running a.out again
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 0, count: 128
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 64, count: 504
file.c:2012:ll_file_read_iter() file a.out:[0x2000XXXXX:0x9:0x0], ppos: 568, count: 28&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I may be able to get a full log if needed but I don&apos;t know when I&apos;ll have time to get back to this.&#160; I&apos;ll look at entering a new ticket with some of this info.&#160; Thanks for the help so far.&lt;/p&gt;</comment>
                            <comment id="402065" author="JIRAUSER18805" created="Wed, 31 Jan 2024 21:41:14 +0000"  >&lt;p&gt;Linked &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17405&quot; title=&quot;Executable created with gcc gives ELF interpreter error (2.15 client w/ 2.12 server)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17405&quot;&gt;LU-17405&lt;/a&gt; which tracks the side issue uncovered with gcc when testing the Lustre 2.15 client with 2.12.x servers.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="79884">LU-17405</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="51111" name="repro.c" size="5318" author="mikedoo4" created="Wed, 29 Nov 2023 21:57:34 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0438n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>