<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:30:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3020] Lustre returns EINTR during writes when SA_RESTART is set</title>
                <link>https://jira.whamcloud.com/browse/LU-3020</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Lustre is returning EINTR when it should probably be returning ERESTARTSYS.&lt;/p&gt;

&lt;p&gt;POSIX specifies that write system calls must react to signals as follows:&lt;br/&gt;
If a signal arrives before any data is written, the return value is -1 and errno is set to EINTR.&lt;br/&gt;
If a signal arrives after some data has been written, the return value is the amount of data written.&lt;/p&gt;
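
&lt;p&gt;From the application side, those two outcomes are typically handled with a retry loop. The following is a minimal sketch in Python, not taken from the attached reproducer; write_all is a hypothetical helper name. Python 3.5 and later already retry on EINTR unless the signal handler raises, so the except branch is mostly there to mirror the usual C pattern.&lt;/p&gt;

```python
import os


def write_all(fd, data):
    """Write all of data to fd, resuming after short writes.

    Hypothetical helper for illustration; returns the total bytes written.
    """
    view = memoryview(data)
    total = 0
    while view:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            # EINTR: a signal arrived before any data was written; retry.
            continue
        # A short count means the call stopped early (for example because
        # a signal arrived after some data was written); resume from there.
        total += written
        view = view[written:]
    return total
```

&lt;p&gt;An application that installs its handler with SA_RESTART expects the kernel to take care of the EINTR case, which is the expectation this issue is about.&lt;/p&gt;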

&lt;p&gt;When SA_RESTART is set, the write call should retry until it completes, and should not return EINTR.&lt;/p&gt;

&lt;p&gt;As I understand it, this is usually handled by returning ERESTARTSYS instead of EINTR.  When ERESTARTSYS is returned, the kernel checks for SA_RESTART on the handler, if any.  If SA_RESTART is set, the system call is restarted.  If SA_RESTART is not set, ERESTARTSYS is changed to EINTR before returning to user space.  (See handle_signal in arch/x86/kernel/signal.c.)&lt;/p&gt;
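
&lt;p&gt;The translation described above can be sketched as a small userspace model. This is Python, not kernel code; resolve_with_handler is a hypothetical name, the numeric values are the kernel-internal restart codes, and the model only covers the case where a user handler is about to run.&lt;/p&gt;

```python
# Userspace model (not kernel code) of how the x86 handle_signal() path
# resolves restart codes before a user handler runs. The function name
# is hypothetical; the numeric values are kernel-internal constants.
EINTR = 4
ERESTARTSYS = 512
ERESTARTNOINTR = 513
ERESTARTNOHAND = 514
ERESTART_RESTARTBLOCK = 516


def resolve_with_handler(ret, sa_restart):
    """Return ("restart", None) or ("return", value) for a syscall result
    when a user handler is about to run for the interrupting signal."""
    if ret in (-ERESTART_RESTARTBLOCK, -ERESTARTNOHAND):
        # Never restarted once a handler is invoked.
        return ("return", -EINTR)
    if ret == -ERESTARTSYS:
        # Restarted only if the handler was installed with SA_RESTART.
        if sa_restart:
            return ("restart", None)
        return ("return", -EINTR)
    if ret == -ERESTARTNOINTR:
        # Always restarted, regardless of SA_RESTART.
        return ("restart", None)
    # Anything else, including a plain -EINTR, reaches userspace unchanged.
    return ("return", ret)
```

&lt;p&gt;In this model a plain -EINTR passes through unchanged whether or not SA_RESTART is set, which matches the behaviour seen with the reproducer.&lt;/p&gt;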

&lt;p&gt;When using the attached reproducer, we consistently see -EINTR returned from cl_lock_state_wait in lustre/obdclass/cl_lock.c.  (In brief, the reproducer spawns a child, then starts a series of looped writes in the parent while the child sends SIGALRM signals to the parent.  Once a write fails with EINTR, the parent and child print a message and exit.)&lt;/p&gt;

&lt;p&gt;Turning on full tracing will let you see this return value (-4) coming from cl_lock_state_wait in the logs.  (I can attach logs if requested.)&lt;/p&gt;

&lt;p&gt;Moving on to implications and possible fixes:&lt;br/&gt;
The norm in most file system code appears to be to return ERESTARTSYS, except where a system call specifically cannot be restarted.  This snippet is from /fs/aio.c in the kernel:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;/*
 * There&apos;s no easy way to restart the syscall since other AIO&apos;s
 * may be already running. Just fail this IO with EINTR.
 */
if (unlikely(ret == -ERESTARTSYS || ret == -ERESTARTNOINTR ||
             ret == -ERESTARTNOHAND || ret == -ERESTART_RESTARTBLOCK))
        ret = -EINTR;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;However, a quick search through the Lustre code shows Lustre returns EINTR far more often than ERESTARTSYS.  This is presumably because Lustre is more complex and therefore less restartable than most file systems.&lt;/p&gt;

&lt;p&gt;I&apos;ve made a patch I&apos;ll be submitting shortly which changes this specific instance of EINTR to ERESTARTSYS.  Some simple testing shows it fixes the problem described above and does not appear to cause other issues.&lt;/p&gt;

&lt;p&gt;I&apos;ll add a link to that patch here once it&apos;s submitted.&lt;/p&gt;

&lt;p&gt;One other note.  This problem occurs at exactly the same location during read system calls.  The same patch appears to resolve it.&lt;/p&gt;

&lt;p&gt;Two questions:&lt;br/&gt;
1) Is it OK to return ERESTARTSYS at this particular location (cl_lock_state_wait)?  That is, can system calls be safely restarted from here?&lt;/p&gt;

&lt;p&gt;2) The reproducer is a contrived test case, but the actual application that revealed this issue sends SIGALRM periodically during its normal operation.  We&apos;re concerned about how Lustre reacts to SIGALRM in general when a handler is set: we may fix this spot, only to see the same problem return from another location.&lt;/p&gt;</description>
                <environment></environment>
        <key id="18064">LU-3020</key>
            <summary>Lustre returns EINTR during writes when SA_RESTART is set</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="keith">Keith Mannthey</assignee>
                                    <reporter username="paf">Patrick Farrell</reporter>
                        <labels>
                            <label>patch</label>
                    </labels>
                <created>Fri, 22 Mar 2013 20:53:24 +0000</created>
                <updated>Sun, 14 Aug 2016 09:59:40 +0000</updated>
                            <resolved>Tue, 23 Apr 2013 13:11:17 +0000</resolved>
                                    <version>Lustre 2.3.0</version>
                    <version>Lustre 2.4.0</version>
                                    <fixVersion>Lustre 2.4.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="54711" author="paf" created="Fri, 22 Mar 2013 21:06:31 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/#change,5814&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,5814&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="54755" author="paf" created="Mon, 25 Mar 2013 14:34:29 +0000"  >&lt;p&gt;Updates from the patch review:&lt;br/&gt;
Failed test54c.  I&apos;m not sure if that&apos;s related to my patch or not, but I noticed a problem with the patch in any case.&lt;/p&gt;

&lt;p&gt;This is my review comment.  I&apos;ve pushed a patch with the new code below as of a few minutes ago.&lt;br/&gt;
&amp;#8212;&lt;br/&gt;
Upon further reflection, I&apos;m wondering about something in the block of code I modified: If the if statement is not entered, then ERESTARTSYS will be returned.&lt;/p&gt;

&lt;p&gt;Here&apos;s the current (modified) code:&lt;br/&gt;
&amp;#8212;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;result = -ERESTARTSYS;
&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (likely(!OBD_FAIL_CHECK(OBD_FAIL_LOCK_STATE_WAIT_INTR))) {
    cfs_waitq_wait(&amp;amp;waiter, CFS_TASK_INTERRUPTIBLE);
    &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!cfs_signal_pending())
        result = 0;
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&amp;#8212;&lt;/p&gt;

&lt;p&gt;I&apos;m wondering if this might not be better, since we only want to return ERESTARTSYS when a signal is involved (ERESTARTSYS should never be returned to user space, and is processed by the signal handling code in the kernel [whether or not a user handler is set]):&lt;br/&gt;
&amp;#8212;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;result = -EINTR;
&lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (likely(!OBD_FAIL_CHECK(OBD_FAIL_LOCK_STATE_WAIT_INTR))) {
    cfs_waitq_wait(&amp;amp;waiter, CFS_TASK_INTERRUPTIBLE);
    &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!cfs_signal_pending()) {
        result = 0;
    }
    &lt;span class=&quot;code-keyword&quot;&gt;else&lt;/span&gt; {
        /* Returning ERESTARTSYS if a signal is found so
         * system calls can be restarted if the signal handler
         * calls for it. */
        result = -ERESTARTSYS;
    }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
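&lt;p&gt;The difference between the two versions can be modelled in a few lines. This is a Python sketch, not Lustre code: first_revision and second_revision are hypothetical names, fail_injected stands in for OBD_FAIL_CHECK(OBD_FAIL_LOCK_STATE_WAIT_INTR), and signal_pending stands in for cfs_signal_pending() after the wait returns.&lt;/p&gt;

```python
# Model of the two patch revisions above; not Lustre code. fail_injected
# stands in for OBD_FAIL_CHECK(OBD_FAIL_LOCK_STATE_WAIT_INTR) and
# signal_pending for cfs_signal_pending() after the wait returns.
EINTR = 4
ERESTARTSYS = 512


def first_revision(fail_injected, signal_pending):
    """Result of the first revision: starts from -ERESTARTSYS."""
    result = -ERESTARTSYS
    if not fail_injected:
        if not signal_pending:
            result = 0
    return result


def second_revision(fail_injected, signal_pending):
    """Result of the second revision: -ERESTARTSYS only after a signal."""
    result = -EINTR
    if not fail_injected:
        if not signal_pending:
            result = 0
        else:
            result = -ERESTARTSYS
    return result
```

&lt;p&gt;The two only differ when the fault injection check fires: the first version returns -ERESTARTSYS with no signal involved, the second returns -EINTR.&lt;/p&gt;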
&lt;p&gt;&amp;#8212;&lt;/p&gt;</comment>
                            <comment id="56641" author="spitzcor" created="Fri, 19 Apr 2013 20:45:42 +0000"  >&lt;p&gt;Can this land now?  Or must we wait until after b2_4 is branched?&lt;/p&gt;</comment>
                            <comment id="56798" author="pjones" created="Tue, 23 Apr 2013 13:11:17 +0000"  >&lt;p&gt;Landed for 2.4&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="38766">LU-8494</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="19275">LU-3433</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="19797">LU-3581</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="12421" name="iotest.c" size="2388" author="paf" created="Fri, 22 Mar 2013 20:53:24 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvlzr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7344</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>