<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:41:33 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4308] MPI job causes errors &quot;binary changed while waiting for the page fault lock&quot;</title>
                <link>https://jira.whamcloud.com/browse/LU-4308</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;When an MPI job is run, we see many of these messages: &quot;binary x changed while waiting for the page fault lock.&quot; Is this normal behavior or not? It was also reported here.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://lists.01.org/pipermail/hpdd-discuss/2013-October/000560.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://lists.01.org/pipermail/hpdd-discuss/2013-October/000560.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Nov 25 13:46:50 rhea25 kernel: Lustre: 105703:0:vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000f81c:0x18:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Nov 25 13:46:53 rhea25 kernel: Lustre: 105751:0:(vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000f81c:0x19:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Nov 25 13:46:57 rhea25 kernel: Lustre: 105803:0:(vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000f81c:0x1a:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Nov 25 13:46:57 rhea25 kernel: Lustre: 105803:0:(vvp_io.c:699:vvp_io_fault_start()) Skipped 1 previous similar message&lt;br/&gt;
Nov 25 13:47:00 rhea25 kernel: Lustre: 105846:0:(vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000f81c:0x1b:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Nov 25 13:47:00 rhea25 kernel: Lustre: 105846:0:(vvp_io.c:699:vvp_io_fault_start()) Skipped 2 previous similar messages&lt;br/&gt;
Nov 25 13:47:07 rhea25 kernel: Lustre: 105942:0:(vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000f81c:0x1d:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;/p&gt;</description>
                <environment>RHEL 6.4/MLNX OFED 2.0.2.6.8.10</environment>
        <key id="22232">LU-4308</key>
            <summary>MPI job causes errors &quot;binary changed while waiting for the page fault lock&quot;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="blakecaldwell">Blake Caldwell</reporter>
                        <labels>
                    </labels>
                <created>Mon, 25 Nov 2013 20:16:17 +0000</created>
                <updated>Thu, 28 Jan 2016 14:12:19 +0000</updated>
                            <resolved>Thu, 14 Aug 2014 16:27:26 +0000</resolved>
                                    <version>Lustre 2.4.1</version>
                                    <fixVersion>Lustre 2.5.3</fixVersion>
                                        <due></due>
                            <votes>5</votes>
                                    <watches>23</watches>
                                                                            <comments>
                            <comment id="72255" author="pjones" created="Mon, 25 Nov 2013 21:06:42 +0000"  >&lt;p&gt;Hi Blake&lt;/p&gt;

&lt;p&gt;I will get an engineer to comment ASAP, but just to clarify - is this really an S1 situation (i.e. a production cluster is completely inoperative)?&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="72256" author="blakecaldwell" created="Mon, 25 Nov 2013 21:11:52 +0000"  >&lt;p&gt;It should be severity 4&lt;/p&gt;</comment>
                            <comment id="72259" author="pjones" created="Mon, 25 Nov 2013 21:20:31 +0000"  >&lt;p&gt;Thanks for clarifying Blake.&lt;/p&gt;

&lt;p&gt;Alex&lt;/p&gt;

&lt;p&gt;Could you please comment?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="74798" author="bobijam" created="Mon, 13 Jan 2014 02:49:25 +0000"  >&lt;p&gt;Blake,&lt;/p&gt;

&lt;p&gt;This message means that an mmapped executable file is being read while other threads may be changing its contents. Is this what the MPI job is meant to do, i.e. generating executable files while other jobs are reading them?&lt;/p&gt;</comment>
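<!--
Note on the explanation above: it describes a process faulting pages of an mmapped executable
while another process is rewriting the same file. Below is a minimal userspace sketch of that
access pattern, for illustration only (it is not the Lustre kernel code path; the default file
name "testbin" and the 4 KiB page stride are assumptions). Running the reader mode against a
file on a Lustre mount while another process or node runs the writer mode approximates the
situation the message warns about.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "testbin";

    if (argc > 2 && strcmp(argv[2], "writer") == 0) {
        /* Writer: repeatedly overwrite the file in place, the way a job that
         * regenerates its own executable would. */
        int fd = open(path, O_WRONLY | O_CREAT, 0755);
        if (fd < 0) { perror("open"); return 1; }
        char buf[4096];
        memset(buf, 'x', sizeof(buf));
        for (int i = 0; i < 1000; i++)
            if (pwrite(fd, buf, sizeof(buf), 0) < 0) { perror("pwrite"); return 1; }
        close(fd);
        return 0;
    }

    /* Reader: map the file read-only, similar to how the kernel maps an
     * executable's text, then touch every page to force page faults. */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    unsigned long sum = 0;
    for (off_t off = 0; off < st.st_size; off += 4096)
        sum += (unsigned char)map[off];   /* each touch may fault a page in */
    printf("touched %ld bytes, checksum %lu\n", (long)st.st_size, sum);
    munmap(map, st.st_size);
    close(fd);
    return 0;
}
-->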
                            <comment id="77478" author="pjones" created="Thu, 20 Feb 2014 15:39:56 +0000"  >&lt;p&gt;As per ORNL ok to close&lt;/p&gt;</comment>
                            <comment id="78695" author="hilljjornl" created="Fri, 7 Mar 2014 13:52:03 +0000"  >&lt;p&gt;Hate to open something we said to close last week, but we have another occurrence of this issue on the same cluster. We are consulting with the sysadmin for that cluster, and likely with the user to discuss the question from Zhenyu. &lt;/p&gt;

&lt;p&gt;Mar  6 16:49:55 rhea101 kernel: Lustre: 24431:0:(vvp_io.c:699:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20006c6eb:0xc:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;/p&gt;

&lt;p&gt;&amp;#8211;&lt;br/&gt;
-Jason&lt;/p&gt;</comment>
                            <comment id="82755" author="jstroik" created="Tue, 29 Apr 2014 16:41:30 +0000"  >&lt;p&gt;We are observing a similar issue and sometimes see the same error. It only happens when we run specific jobs that use MPI with OpenMP threads.  Those same jobs complete when using pure MPI.&lt;/p&gt;

&lt;p&gt;Symptoms:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;The file in question cannot be read on some specific clients (but can be on most other clients).&lt;/li&gt;
	&lt;li&gt;On the nodes where it cannot be read, processes that need to access /proc/PID/cmdline for the process executing that file will block (w, ps, etc.).&lt;/li&gt;
	&lt;li&gt;The file system can no longer be unmounted.&lt;/li&gt;
	&lt;li&gt;Sometimes clients will have open FIDs that cannot be resolved to paths when we get the list of FIDs held open by the client on the MDS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Effectively, we have to power cycle or reboot -fn to get the nodes in question back up.&lt;/p&gt;

&lt;p&gt;The file in question can be read by other Lustre clients. We tested running Lustre client and server versions 2.4.2 and 2.5.1, with the same results for both. To reduce the chance that this is an MPI bug, we also tested with two versions of Intel MPI (4.1.0.024 and 4.1.3.049) and Intel compilers 13.1 and 14.0-1. To reduce the risk of race conditions, we pin MPI threads to a single socket.&lt;/p&gt;

&lt;p&gt;Here is a bit of strace for cp:&lt;/p&gt;

&lt;p&gt;===============&lt;br/&gt;
stat(&quot;/scratch/short/jimj/s4_1534/stmp_2013062412_gdas_fcst1/global_fcst&quot;, {st_mode=S_IFREG|0755, st_size=24534300, ...}) = 0&lt;br/&gt;
stat(&quot;./global_fcst&quot;, 0x7fffb2d95eb0)   = -1 ENOENT (No such file or directory)&lt;br/&gt;
open(&quot;/scratch/short/jimj/s4_1534/stmp_2013062412_gdas_fcst1/global_fcst&quot;, O_RDONLY) = 3&lt;br/&gt;
fstat(3, {st_mode=S_IFREG|0755, st_size=24534300, ...}) = 0&lt;br/&gt;
open(&quot;./global_fcst&quot;, O_WRONLY|O_CREAT|O_EXCL, 0755) = 4&lt;br/&gt;
fstat(4, {st_mode=S_IFREG|0755, st_size=0, ...}) = 0&lt;br/&gt;
mmap(NULL, 4202496, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b5b721c5000&lt;br/&gt;
read(3,&lt;br/&gt;
===============&lt;/p&gt;

&lt;p&gt;At that point, it hangs indefinitely. &lt;/p&gt;

&lt;p&gt;For us, this may be a blocking issue. We&apos;re not sure exactly where the blame lies, but it does seem suspicious that the systems cannot flush the bad pages or read files that other Lustre clients can read. We can&apos;t go into production with this hardware until we resolve this.&lt;/p&gt;</comment>
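<!--
Note on the strace above: the copy stalls inside read(2) on the source file descriptor, after
the destination has been created and an anonymous buffer mmapped. A small stand-alone probe
like the sketch below can help confirm, on an affected client, whether plain sequential reads
of the file stall at the same point, independently of cp and of the mmap/page-fault path. The
default path and the 1 MiB chunk size are illustrative assumptions.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1]
        : "/scratch/short/jimj/s4_1534/stmp_2013062412_gdas_fcst1/global_fcst";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    size_t chunk = 1 << 20;                /* 1 MiB per read, arbitrary */
    char *buf = malloc(chunk);
    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0) {
        total += n;
        /* The last line printed shows roughly where a hang occurs. */
        fprintf(stderr, "read %lld bytes so far\n", total);
    }
    if (n < 0)
        perror("read");
    printf("done, %lld bytes total\n", total);
    free(buf);
    close(fd);
    return 0;
}
-->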
                            <comment id="82848" author="jstroik" created="Wed, 30 Apr 2014 14:15:38 +0000"  >&lt;p&gt;We left the clients in their broken state over night. Same deal. Then, we tested deleting the file from a bad actor and observed this:&lt;/p&gt;


&lt;p&gt;April 29 @ 17:43&lt;br/&gt;
172.17.1.251 - &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200001435:0xad:0x0&amp;#93;&lt;/span&gt; - /scratch/short/jimj/s4_1534/stmp_2013062412_gdas_fcst1/global_fcst&lt;br/&gt;
172.17.1.251 - &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200001435:0x12e:0x0&amp;#93;&lt;/span&gt; - &lt;br/&gt;
/scratch/short/jimj/s4_1534/stmp_2013062412_gdas_fcst1/gfs_namelist&lt;/p&gt;

&lt;p&gt;(rm the file)&lt;/p&gt;

&lt;p&gt;April 30 @ 13:35&lt;br/&gt;
172.17.1.251 - &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200001435:0xad:0x0&amp;#93;&lt;/span&gt; - No file or directory&lt;/p&gt;


&lt;p&gt;The file handle is still stuck open on all nodes where it was unreadable. The stuck nodes cannot read the file, but they can delete it. However, even deleting it doesn&apos;t release their file handle.&lt;/p&gt;</comment>
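<!--
Note on the FID listing above: resolving a FID reported by the MDS (or by the client kernel
log) back to a path can be scripted from a client. The sketch below is a hedged illustration:
it assumes liblustreapi is installed, that llapi_fid2path() has the prototype declared in
lustre/lustreapi.h, and that the file system is mounted on /scratch (build with -llustreapi).
The command-line equivalent is "lfs fid2path <mountpoint> <fid>".

#include <stdio.h>
#include <lustre/lustreapi.h>

int main(int argc, char **argv)
{
    const char *mnt = argc > 1 ? argv[1] : "/scratch";
    const char *fid = argc > 2 ? argv[2] : "[0x200001435:0xad:0x0]";
    char path[4096];
    long long recno = -1;   /* record hint, as used by lfs fid2path */
    int linkno = 0;         /* resolve the first hard link name */

    int rc = llapi_fid2path(mnt, fid, path, sizeof(path), &recno, &linkno);
    if (rc != 0) {
        fprintf(stderr, "fid2path %s failed: rc = %d\n", fid, rc);
        return 1;
    }
    printf("%s maps to %s\n", fid, path);
    return 0;
}
-->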
                            <comment id="83392" author="jstroik" created="Wed, 7 May 2014 14:31:14 +0000"  >&lt;p&gt;I&apos;m not confident that our issue, which I suspect is related to a bug in the lustre client, has anything to do with the error &apos;binary changed...&apos;.&lt;/p&gt;

&lt;p&gt;We did, however, revert our clients to 2.1.6 (still running server 2.5.1) and we no longer observe the same hanging file / hanging mount problem.&lt;/p&gt;</comment>
                            <comment id="85092" author="bobijam" created="Thu, 29 May 2014 01:05:54 +0000"  >&lt;p&gt;Is it easy to reproduce? Can you collect -1 logs with as simple as possible reproduce procedure and upload the logs? Thank you.&lt;/p&gt;</comment>
                            <comment id="85108" author="bobijam" created="Thu, 29 May 2014 08:15:38 +0000"  >&lt;p&gt;Also would you please trying this patch &lt;a href=&quot;http://review.whamcloud.com/10483&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10483&lt;/a&gt; ?&lt;/p&gt;</comment>
                            <comment id="88962" author="lflis" created="Mon, 14 Jul 2014 17:57:11 +0000"  >&lt;p&gt;Is there a patch for client 2.5.1?&lt;br/&gt;
We are observing the same issues at Cyfronet with the 2.5.1 client and MPI jobs.&lt;/p&gt;

&lt;p&gt;Jul 14 19:24:56 n1043-amd kernel: Lustre: 21837:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Jul 14 19:24:56 n1043-amd kernel: Lustre: 21837:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 54 previous similar messages&lt;br/&gt;
Jul 14 19:34:59 n1043-amd kernel: Lustre: 21813:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Jul 14 19:34:59 n1043-amd kernel: Lustre: 21812:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Jul 14 19:34:59 n1043-amd kernel: Lustre: 21812:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 75 previous similar messages&lt;br/&gt;
Jul 14 19:34:59 n1043-amd kernel: Lustre: 21813:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 75 previous similar messages&lt;br/&gt;
Jul 14 19:36:47 n1043-amd kernel: LustreError: 11-0: scratch-MDT0000-mdc-ffff882834414c00: Communicating with 172.16.193.1@o2ib, operation mds_get_info failed with -1119251304.&lt;br/&gt;
Jul 14 19:36:47 n1043-amd kernel: LustreError: Skipped 4 previous similar messages&lt;br/&gt;
Jul 14 19:45:00 n1043-amd kernel: Lustre: 21825:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Jul 14 19:45:00 n1043-amd kernel: Lustre: 21825:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 71 previous similar messages&lt;br/&gt;
Jul 14 19:55:02 n1043-amd kernel: Lustre: 21828:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Jul 14 19:55:02 n1043-amd kernel: Lustre: 21828:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 375 previous similar messages&lt;/p&gt;</comment>
                            <comment id="88966" author="lflis" created="Mon, 14 Jul 2014 18:13:14 +0000"  >&lt;p&gt;@Zhenyu Xu: logs related to the following object have been uploaded to ftp:&lt;br/&gt;
Jul 14 19:55:02 n1043-amd kernel: Lustre: 21828:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20d545086:0x191e4:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;/p&gt;

&lt;p&gt;Please find the logs here: &lt;a href=&quot;ftp://ftp.whamcloud.com/uploads/lu-4308.cyfronet.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ftp://ftp.whamcloud.com/uploads/lu-4308.cyfronet.log.gz&lt;/a&gt; &lt;/p&gt;</comment>
                            <comment id="89000" author="bobijam" created="Tue, 15 Jul 2014 03:17:45 +0000"  >&lt;p&gt;patch for b2_5 branch &lt;a href=&quot;http://review.whamcloud.com/11098&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/11098&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="89178" author="parsonsa@bit-sys.com" created="Wed, 16 Jul 2014 02:15:39 +0000"  >&lt;p&gt;We&apos;ve been running 2.5.1 client packages with this patch included in two separate environments for the past month.  It has eliminated these error messages from occurring.&lt;/p&gt;</comment>
                            <comment id="91598" author="tomtervo" created="Thu, 14 Aug 2014 06:57:13 +0000"  >&lt;p&gt;I applied patch to 2.5.2 client but problem persists. It was VASP MPI jobs which triggered this error.&lt;/p&gt;

&lt;p&gt;Lustre: 62291:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x201c53daa:0x1d27:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;br/&gt;
Lustre: 62291:0:(vvp_io.c:692:vvp_io_fault_start()) Skipped 1 previous similar message&lt;br/&gt;
Lustre: 62618:0:(vvp_io.c:692:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x201c53daa:0x1d27:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;/p&gt;</comment>
                            <comment id="91617" author="pjones" created="Thu, 14 Aug 2014 16:27:26 +0000"  >&lt;p&gt;The patch that has been reportedly successful at a couple of sites has been landed for 2.5.3. It is not believed that this same issue exists on 2.6 or newer releases so equivalent changes are not needed on master. If there are still residual issues affecting 2.5.x releases then please open a new ticket to track those - thanks!&lt;/p&gt;</comment>
                            <comment id="92153" author="jstroik" created="Thu, 21 Aug 2014 17:59:40 +0000"  >&lt;p&gt;I wanted to report back on our issue because it may be related.&lt;/p&gt;

&lt;p&gt;Our original observation was that some of our Lustre clients would deadlock when running MPI+OpenMP executables. The executable on those clients could only be partially read, and would hang indefinitely on copy or access to /proc/&amp;lt;pid&amp;gt;/exe or /proc/&amp;lt;pid&amp;gt;/cmdline.&lt;/p&gt;

&lt;p&gt;We upgraded to 2.5.2 and applied the aforementioned patch and observed no change.&lt;/p&gt;

&lt;p&gt;We did test the following workarounds, some of which were entirely successful:&lt;/p&gt;

&lt;p&gt;(1) Enabling the &apos;localflock&apos; mount option for those clients was completely successful.&lt;/p&gt;

&lt;p&gt;(2) Hosting the executable on an NFS mount was completely successful.&lt;/p&gt;

&lt;p&gt;(3) Upgrading to Lustre 2.6.0 was completely successful.&lt;/p&gt;

&lt;p&gt;(4) Running Lustre 2.1.6 with khugepaged disabled mitigated the issue to a large extent, with only rare observed deadlocks.&lt;/p&gt;

&lt;p&gt;(5) We tried running on another Lustre file system (Lustre 2.4.2 servers running ZFS instead of Lustre 2.5.1 running ldiskfs) but did not notice any improvement in the client deadlock issue.&lt;/p&gt;

&lt;p&gt;We settled on running Lustre 2.6.0 on the clients because we also observed a performance increase when using it.&lt;/p&gt;

&lt;p&gt;NOTE: this issue may have been related to an irregularity we observed in slurmd, for which we have also found a workaround.&lt;/p&gt;</comment>
                            <comment id="140343" author="kjstrosahl" created="Thu, 28 Jan 2016 14:12:19 +0000"  >&lt;p&gt;We are seeing this issue on clients running 2.5.3&lt;/p&gt;

&lt;p&gt;vvp_io.c:694:vvp_io_fault_start()) binary &lt;span class=&quot;error&quot;&gt;&amp;#91;0x20000aaa6:0x1d668:0x0&amp;#93;&lt;/span&gt; changed while waiting for the page fault lock&lt;/p&gt;

&lt;p&gt;It manifested while the file system was under high load.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="32278">LU-7198</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwa1z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11800</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10023"><![CDATA[4]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>