<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:16:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1397] ENOENT on open()</title>
                <link>https://jira.whamcloud.com/browse/LU-1397</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I&apos;ve been getting widespread reports that with 2.1 clients users are seeing random ENOENT errors on opens (and maybe stats?).&lt;/p&gt;

&lt;p&gt;Sometimes the file is written, closed, and reopened on the same client node.  But the open will report that the file does not exist.  A few minutes later the file is definitely there, so the problem is transitory.&lt;/p&gt;

&lt;p&gt;We have also had instances of this where the ENOENT occurs on a node other than where the file was created.  One node will create, write, and close the file, and then another will attempt to open it only to get ENOENT.&lt;/p&gt;

&lt;p&gt;Here is an example failure from a simul test:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;09:04:12: Set iteration 4
09:04:12: Running test #0(iter 0): open, shared mode.
09:04:12:       Beginning setup
09:04:12:       Finished setup          (0.001 sec)
09:04:12:       Beginning test
09:04:12: Process 177(hype338): FAILED in simul_open, open failed: No such file or directory
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There tend to not be any obvious messages in the console logs associated with these events.&lt;/p&gt;</description>
                <environment>&lt;a href=&quot;http://github.com/chaos/lustre&quot;&gt;http://github.com/chaos/lustre&lt;/a&gt;, version 2.1.1-11chaos</environment>
        <key id="14395">LU-1397</key>
            <summary>ENOENT on open()</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                    </labels>
                <created>Thu, 10 May 2012 17:46:49 +0000</created>
                <updated>Fri, 7 Feb 2014 07:31:12 +0000</updated>
                            <resolved>Mon, 4 Jun 2012 14:54:41 +0000</resolved>
                                                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="38584" author="adilger" created="Thu, 10 May 2012 18:27:58 +0000"  >&lt;p&gt;It would be useful to determine if this error is being generated internal to the client (e.g. due to dcache revalidation failing for some reason), or if it is being sent from the MDS (e.g. due to a race condition with the directory or OI lookup).&lt;/p&gt;

&lt;p&gt;The major dcache changes for 2.6.38 were not landed in 2.1, so that is fortunately not a possible candidate for problems, but it is still possible that kernel dcache changes have caused potential problems (e.g. if RHEL has back-ported the dcache changes from 2.6.38 to 2.6.32).&lt;/p&gt;

&lt;p&gt;Is it possible to reproduce this problem with at least +vfstrace and +rpctrace debugging enabled, and capture debug logs from the failing client and MDS ASAP after test failure?&lt;/p&gt;

&lt;p&gt;It appears that we are running simul as part of the hyperion testing for ~4 hours at a time without seeing these problems, but these test runs are on 2.2 and not 2.1:&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/sub_tests/39e3e4be-8eaa-11e1-8a98-525400d2bfa6&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/sub_tests/39e3e4be-8eaa-11e1-8a98-525400d2bfa6&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/sub_tests/5bd67e30-99b6-11e1-9853-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/sub_tests/5bd67e30-99b6-11e1-9853-52540035b04c&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/sub_tests/ee20c0de-9a51-11e1-9853-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/sub_tests/ee20c0de-9a51-11e1-9853-52540035b04c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A similar problem may have been reported in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1328&quot; title=&quot;Failing customer&amp;#39;s file creation test&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1328&quot;&gt;&lt;del&gt;LU-1328&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="38585" author="morrone" created="Thu, 10 May 2012 18:40:45 +0000"  >&lt;p&gt;I am not sure how to reproduce it reliably yet.  We&apos;re shaking out the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1299&quot; title=&quot;running truncated executable causes spewing of lock debug messages&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1299&quot;&gt;&lt;del&gt;LU-1299&lt;/del&gt;&lt;/a&gt; patch, but after that I want to use our hype cluster to work on reproducing this bug.&lt;/p&gt;</comment>
                            <comment id="38667" author="morrone" created="Fri, 11 May 2012 18:49:39 +0000"  >&lt;p&gt;We probably hit this a couple of times in overnight testing.  It is not often enough to turn on logs and just hope someone sees it soon enough to gather them.  We&apos;ll need some kind of automated log dump trigger.&lt;/p&gt;</comment>
                            <comment id="38697" author="pjones" created="Sat, 12 May 2012 10:38:48 +0000"  >&lt;p&gt;Chris&lt;/p&gt;

&lt;p&gt;In light of Andreas&apos;s comments, I am wondering if you are carrying the LU1234 patch which backports some of the 2.6.38 dcache work&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="38743" author="morrone" created="Mon, 14 May 2012 13:15:48 +0000"  >&lt;p&gt;Yes, we do have that.  It appeared in 2.1.1-5chaos, but didn&apos;t get installed anywhere until fairly recently with 2.1.1-11chaos.  I have been getting reports of transient failed opens from before that patch was installed.  Not that it couldn&apos;t be adding to the problem...&lt;/p&gt;</comment>
                            <comment id="38763" author="morrone" created="Mon, 14 May 2012 16:17:26 +0000"  >&lt;p&gt;We haven&apos;t been able to reproduce this on demand yet.  So far it only hits when we aren&apos;t looking for it.  I think that a debug patch is required here to make progress.&lt;/p&gt;</comment>
                            <comment id="38831" author="laisiyao" created="Tue, 15 May 2012 13:07:14 +0000"  >&lt;p&gt;Chris, the debug patch for b2_1: &lt;a href=&quot;http://review.whamcloud.com/#change,2793&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,2793&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="38842" author="morrone" created="Tue, 15 May 2012 14:21:37 +0000"  >&lt;p&gt;Is that patch going to be safe to run on a production machine?  Genuine failed lookups are also a normal event.  I fear that the console load, especially on the MDS, is going to be a problem.&lt;/p&gt;</comment>
                            <comment id="38867" author="morrone" created="Tue, 15 May 2012 18:10:35 +0000"  >&lt;p&gt;It looks like we may have hit this a few times last night.  I&apos;ll try the patch on our test system and have the same tests rerun.&lt;/p&gt;</comment>
                            <comment id="38883" author="morrone" created="Tue, 15 May 2012 21:30:04 +0000"  >&lt;p&gt;It looks like we got a hit with your patch.&lt;/p&gt;

&lt;p&gt;Note that this was a testing build of lustre that I tagged as &quot;2.1.1-11chaos6morrone&quot;.  See the code here:&lt;/p&gt;

&lt;p&gt;  &lt;a href=&quot;https://github.com/chaos/lustre/tree/2.1.1-11chaos6morrone&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/chaos/lustre/tree/2.1.1-11chaos6morrone&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I just ran an IOR, and at the end of the write phase it tried to stat() one of the files and failed:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;srun -N 120 -n $((120*8)) src/ior -F -e -g -C -o /p/lcrater2/morrone/foo -t 1m -b 256m&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
ior ERROR: stat() failed, errno 2, No such file or directory (aiori-POSIX.c:323)
[515] [MPI Abort by user] Aborting Program!
[515:hype292] Abort: MPI_Abort() code: -1, rank 515, MPI Abort by user Aborting program !: at line 97 in file mpid_init.c
srun: mvapich: 2012-05-15T18:12:57: ABORT from MPI rank 515 [on hype292]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and the only error I see is on the MDS from your debug patch:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;May 15 18:11:02 zwicky-mds2 kernel: LustreError: 19865:0:(mdt_open.c:1485:mdt_reint_open()) open [0x13680001:0x53129225:0x0]/(morrone-&amp;gt;[0x2000013a0:0x7d:0x0]) cr_flag=0104200200001 mode=0040700 msg_flag=0x0 failed: -66
May 15 18:14:38 zwicky-mds2 kernel: LustreError: 20279:0:(mdt_open.c:1485:mdt_reint_open()) open [0x13680001:0x53129225:0x0]/(morrone-&amp;gt;[0x2000013a0:0x7d:0x0]) cr_flag=0104200200001 mode=0040700 msg_flag=0x0 failed: -66
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After the IOR failure, I logged into the same node that reported the failure and was able to &quot;ls -l&quot; the file in question.&lt;/p&gt;</comment>
                            <comment id="38886" author="laisiyao" created="Tue, 15 May 2012 22:18:54 +0000"  >&lt;p&gt;I&apos;m a bit confused: IOR failed on stat(), so why did the MDS report an open failure? Could you set up a trigger to dump the MDS log upon such an error? I&apos;m reviewing the related code now.&lt;/p&gt;</comment>
                            <comment id="38889" author="laisiyao" created="Wed, 16 May 2012 00:55:18 +0000"  >&lt;p&gt;The opened file &apos;morrone&apos; is a directory, so &apos;stat&apos; could trigger an opendir, and cr_flag 0104200200001 shows it requested an OPEN lock but failed with -EREMOTE.&lt;/p&gt;

&lt;p&gt;There is one strange line of code in the MDS OPEN lock handling (it sets rc to -EREMOTE upon success); I&apos;ll update the patch. Could you run the test again with the new patch?&lt;/p&gt;</comment>
                            <comment id="38893" author="morrone" created="Wed, 16 May 2012 01:43:43 +0000"  >&lt;p&gt;Yes, I can add that to our testing tomorrow.  Although that test does not reliably produce that failure, so I&apos;m not sure how long we&apos;ll need to run before we know anything.&lt;/p&gt;</comment>
                            <comment id="39149" author="morrone" created="Mon, 21 May 2012 16:47:44 +0000"  >&lt;p&gt;It looks like -EREMOTE is no longer showing up, but the patch is not fixing the ENOENT problem.&lt;/p&gt;

&lt;p&gt;I am also not seeing much in the logs from the patch at the times the problems are hitting.&lt;/p&gt;

&lt;p&gt;The only errors that I see are this on the client:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2012-05-20 05:22:00 LustreError: 87891:0:(file.c:685:ll_file_open()) open [0x200497bc4:0xd842:0x0] failed: -21.&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and this on the server:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;May 20 03:31:58 zwicky-mds2 kernel: LustreError: 21103:0:(mdt_open.c:1484:mdt_reint_open()) open [0x2004978f5:0x8d7a:0x0]/(test_dir-&amp;gt;[0x2004978f5:0x8d7c:0x0]) cr_flag=0100000102 mode=0040700 msg_flag=0x0 failed: -21&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But those do not occur near the time of the ENOENT in the application, and are likely normal errors.&lt;/p&gt;</comment>
                            <comment id="39150" author="morrone" created="Mon, 21 May 2012 16:55:58 +0000"  >&lt;p&gt;FYI, this is one of our top bugs now.&lt;/p&gt;</comment>
                            <comment id="39164" author="laisiyao" created="Mon, 21 May 2012 21:54:13 +0000"  >&lt;p&gt;ll_file_open() returning -21 (-EISDIR) looks suspicious, and the previous test result shows this is an opendir failure. Could you verify that &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200497bc4:0xd842:0x0&amp;#93;&lt;/span&gt; is a directory? I&apos;ll review the related code.&lt;/p&gt;</comment>
                            <comment id="39165" author="morrone" created="Mon, 21 May 2012 22:22:36 +0000"  >&lt;p&gt;Yes, &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200497bc4:0xd842:0x0&amp;#93;&lt;/span&gt; is in fact a directory, which is why I suspect that it is just normal operation and not part of this bug.&lt;/p&gt;</comment>
                            <comment id="39175" author="laisiyao" created="Tue, 22 May 2012 03:35:29 +0000"  >&lt;p&gt;Does that mean the debug patch may not catch the -ENOENT error? If so, I need to update the debug patch, but first I want to know whether it is possible to set a trigger to dump the client and MDS debug logs upon this failure. BTW, it&apos;s best to turn on the &apos;info&apos; and &apos;vfstrace&apos; traces if the debug log can be dumped.&lt;/p&gt;</comment>
                            <comment id="39252" author="morrone" created="Tue, 22 May 2012 18:53:49 +0000"  >&lt;p&gt;That is correct.  I have not yet seen -ENOENT in the logs on a client where the problem has occurred, or on the server.&lt;/p&gt;

&lt;p&gt;I don&apos;t have the time to set things up myself.  I am going to assign Prakash to help on this ticket.  If you can design the trigger yourself and make it easy for us to run, there are more people on our side that can help with that.&lt;/p&gt;</comment>
                            <comment id="39255" author="prakash" created="Tue, 22 May 2012 20:11:23 +0000"  >&lt;p&gt;Lai, Is there an easy way to trigger a dump from within Lustre? If we can narrow down a debug patch that will only trigger when this bug happens (i.e. not on a normal error), perhaps we can add an LBUG and dump the logs that way?&lt;/p&gt;</comment>
                            <comment id="39256" author="nedbass" created="Tue, 22 May 2012 20:40:31 +0000"  >&lt;p&gt;You could just do something like this to dump a stacktrace and the debug log:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;libcfs_debug_dumpstack(NULL);
libcfs_debug_dumplog();
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="39258" author="laisiyao" created="Tue, 22 May 2012 21:29:01 +0000"  >&lt;p&gt;Are you using IOR to reproduce this? If so, IMHO it&apos;s easier to modify the IOR code to trigger a debuglog dump by executing a command line (`lctl dk ...`) upon an error in an IOR operation. I&apos;ll try to do it later, but I don&apos;t know the IOR version you&apos;re using; could you give a link to the IOR you&apos;re using?&lt;/p&gt;</comment>
                            <comment id="39261" author="nedbass" created="Tue, 22 May 2012 22:48:36 +0000"  >&lt;p&gt;&lt;a href=&quot;https://github.com/chaos/ior&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/chaos/ior&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="39263" author="laisiyao" created="Tue, 22 May 2012 23:58:05 +0000"  >&lt;p&gt;Patch to trigger lustre debuglog dump upon IOR failure.&lt;/p&gt;</comment>
                            <comment id="39264" author="laisiyao" created="Tue, 22 May 2012 23:59:35 +0000"  >&lt;p&gt;I&apos;ve uploaded the patch to trigger a lustre debuglog dump upon IOR failure, but you need to update the code with the actual MDS name or address.&lt;/p&gt;</comment>
                            <comment id="39279" author="prakash" created="Wed, 23 May 2012 13:05:28 +0000"  >&lt;p&gt;Thanks Lai. I&apos;ll try this approach and give an update if it triggers with the patch applied.&lt;/p&gt;</comment>
                            <comment id="39281" author="morrone" created="Wed, 23 May 2012 13:55:46 +0000"  >&lt;blockquote&gt;&lt;p&gt;Are you using IOR to reproduce this?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;No.  We are using a suite of applications, any one of which might hit the ENOENT.  IOR might be one of them, but if we only patch that we may have to wait a week between hits.&lt;/p&gt;</comment>
                            <comment id="39301" author="prakash" created="Wed, 23 May 2012 18:44:19 +0000"  >&lt;p&gt;Can somebody help me get my bearings in the lustre client code and help me determine the code path where ENOENT would be returned from ll_file_open (or any other related functions)?&lt;/p&gt;

&lt;p&gt;I&apos;m curious if this error is actually coming from the kernel without ever getting to the lustre client (perhaps through an issue with the dcache?), or if it is indeed being returned by the lustre client.&lt;/p&gt;

&lt;p&gt;Lai, I&apos;m running a patched IOR (&lt;a href=&quot;https://github.com/Prakash-Surya/ior/commits/LU-1397&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/Prakash-Surya/ior/commits/LU-1397&lt;/a&gt;), although as Chris mentioned, it will probably take a long time before it triggers with this specific application. A system-level trigger would be preferable, although I&apos;m not sure how to filter out the normal ENOENT cases from our specific buggy instance.&lt;/p&gt;</comment>
                            <comment id="39315" author="laisiyao" created="Thu, 24 May 2012 04:48:24 +0000"  >&lt;p&gt;Prakash, I&apos;m also suspicious that the ENOENT may happen in directory pathname lookup, or come from some issue in CWD handling in lustre.&lt;/p&gt;

&lt;p&gt;Can system-tap run on your lustre client? If yes, I&apos;ll try to make a script to check the __link_path_walk() return value; if it&apos;s -ENOENT, then it&apos;s time to trigger a lustre debuglog dump.&lt;/p&gt;</comment>
                            <comment id="39337" author="prakash" created="Thu, 24 May 2012 11:12:10 +0000"  >&lt;p&gt;Yes, system-tap is definitely an option. I wrote a simple system-tap script and installed it on the clients late yesterday to probe ll_file_open and do_sys_open, but haven&apos;t gotten a hit yet. I&apos;ll upload the script for completeness.&lt;/p&gt;</comment>
                            <comment id="39338" author="prakash" created="Thu, 24 May 2012 11:14:02 +0000"  >&lt;p&gt;I installed this system-tap script on some of the clients.&lt;/p&gt;</comment>
                            <comment id="39383" author="laisiyao" created="Fri, 25 May 2012 03:06:51 +0000"  >&lt;p&gt;Please enable more debug trace: `lctl set_param debug=+&quot;dlmtrace dentry inode&quot;` on client if possible.&lt;/p&gt;</comment>
                            <comment id="39416" author="prakash" created="Fri, 25 May 2012 13:35:25 +0000"  >&lt;p&gt;I fixed some issues with the previous script, and now have this running on the clients.&lt;/p&gt;

&lt;p&gt;It&apos;s probing sys_open, and ll_file_open. It will dump lustre client logs when those return with errors.&lt;/p&gt;

&lt;p&gt;It also adds rpctrace, vfstrace, dlmtrace, dentry, and inode debugging levels for lustre. Hopefully when we hit the issue with this installed, we&apos;ll be able to get some useful information out of the system.&lt;/p&gt;</comment>
                            <comment id="39417" author="prakash" created="Fri, 25 May 2012 13:39:35 +0000"  >&lt;p&gt;In order to probe the path walk function, I need to find a way to limit it to only the lustre filesystem. The &quot;name&quot; variable is undefined when I probe it, so I can&apos;t simply strncmp its prefix as I do in the sys_open case. ENOENT is much too common a case to handle without filtering it to only Lustre.&lt;/p&gt;</comment>
                            <comment id="39425" author="prakash" created="Fri, 25 May 2012 17:50:53 +0000"  >&lt;p&gt;Lai, I think I got a hit with the v2 system tap script in place (I&apos;m still waiting to verify with the user):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): sys_open: rc = -2, filename = /p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018
May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): Dumping llog to /tmp/lu1397-1337981358181.llog
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, sys_open returned -ENOENT, but ll_file_open did &lt;b&gt;not&lt;/b&gt; report an error. I assume the kernel returned the error without ever getting to the lustre code, but I can&apos;t say for sure as I don&apos;t understand the code path well enough.&lt;/p&gt;</comment>
                            <comment id="39426" author="prakash" created="Fri, 25 May 2012 17:53:48 +0000"  >&lt;p&gt;Regarding my previous comment, here is the client lustre log file dumped by the stap script.&lt;/p&gt;</comment>
                            <comment id="39427" author="prakash" created="Fri, 25 May 2012 18:35:31 +0000"  >&lt;p&gt;I talked to the user, and the job that triggered the error was on its first iteration, in which it was opening and creating new files to write to. &apos;ls&apos;-ing the directory shows that the file is indeed missing. My current guess is that an error in the pathname lookup is the root cause.&lt;/p&gt;</comment>
                            <comment id="39433" author="yong.fan" created="Fri, 25 May 2012 20:38:12 +0000"  >&lt;p&gt;Lai, there are some clues in the logs from Prakash:&lt;/p&gt;

&lt;p&gt;=============&lt;br/&gt;
00000080:00200000:2.0:1337981358.181813:0:54766:0:(file.c:2465:ll_inode_permission()) VFS Op:inode=144237863481960082/33582994(ffff8803b184d838), inode mode 41c0 mask 1&lt;br/&gt;
00000080:00002000:2.0:1337981358.181814:0:54766:0:(dcache.c:103:ll_dcompare()) found name test_dir(ffff8804245c3800) - flags 0/0 - refc 1&lt;br/&gt;
=============&lt;/p&gt;

&lt;p&gt;The thread &quot;54766&quot; is just the thread looking up &quot;/p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018&quot;. But when it tried to find the parent &quot;test_dir&quot;, it got an invalid dentry &quot;ffff8804245c3800&quot;, although it was not marked as &quot;DCACHE_LUSTRE_INVALID&quot;; the valid &quot;test_dir&quot; should be &quot;ffff8804302af900&quot;. So the &quot;d_inode&quot; for the invalid dentry &quot;ffff8804245c3800&quot; would be NULL, and the VFS path parse failed at:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; __link_path_walk(&lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;char&lt;/span&gt; *name, struct nameidata *nd)
{
...
                &lt;span class=&quot;code-comment&quot;&gt;/* This does the actual lookups.. */&lt;/span&gt;
                err = do_lookup(nd, &amp;amp;&lt;span class=&quot;code-keyword&quot;&gt;this&lt;/span&gt;, &amp;amp;next);
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (err)
                        &lt;span class=&quot;code-keyword&quot;&gt;break&lt;/span&gt;;

                err = -ENOENT;
                inode = next.dentry-&amp;gt;d_inode;
                &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (!inode)
                        &lt;span class=&quot;code-keyword&quot;&gt;goto&lt;/span&gt; out_dput;
...
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That may be why Lustre did &quot;NOT&quot; report &quot;ENOENT&quot;, but the VFS did. We can follow this clue for further investigation.&lt;/p&gt;</comment>
                            <comment id="39439" author="yong.fan" created="Sun, 27 May 2012 01:00:30 +0000"  >&lt;p&gt;After more investigation, I think the operation sequence on the client when the failure occurred was as follows:&lt;/p&gt;

&lt;p&gt;1) The thread &quot;54770&quot; was an IOR thread trying to parse &quot;/p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10022&quot;. Before processing the last component &quot;miranda_io.out.10022&quot;, it first needed to look up &quot;test_dir&quot;.&lt;/p&gt;

&lt;p&gt;2) The dentry &quot;ffff8804302af900&quot; for &quot;test_dir&quot; existed in memory at that time, but was marked &quot;DCACHE_LUSTRE_INVALID&quot;. So &quot;ll_dcompare()&quot; against that dentry returned a mismatch for the thread &quot;54770&quot;, which therefore created a new negative dentry &quot;ffff8804245c3800&quot; for &quot;test_dir&quot; (inferred from the result) and triggered a real lookup:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;00000080:00002000:6.0:1337981358.181442:0:54770:0:(dcache.c:103:ll_dcompare()) found name test_dir(ffff8804302af900) - flags 0/4000000 - refc 25
00000080:00200000:6.0:1337981358.181444:0:54770:0:(namei.c:454:ll_lookup_it()) VFS Op:name=test_dir,dir=144237863481960082/33582994(ffff8803b184d838),intent=0
...
00800000:00000002:6.0:1337981358.181449:0:54770:0:(lmv_intent.c:405:lmv_intent_lookup()) LOOKUP_INTENT with fid1=[0x2006f9298:0xe692:0x0], fid2=[0x0:0x0:0x0], name=&lt;span class=&quot;code-quote&quot;&gt;&apos;test_dir&apos;&lt;/span&gt; -&amp;gt; mds #0
00000002:00010000:6.0:1337981358.181450:0:54770:0:(mdc_locks.c:917:mdc_intent_lock()) (name: test_dir,[0x0:0x0:0x0]) in obj [0x2006f9298:0xe692:0x0], intent: lookup flags 00
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;3) The thread &quot;54770&quot; succeeded in re-obtaining the related ibits lock from the MDS, and tried to update the inode and dentry for &quot;test_dir&quot; on the client:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;00000002:00002000:6.0:1337981358.181799:0:54770:0:(mdc_locks.c:844:mdc_finish_intent_lock()) D_IT dentry test_dir intent: lookup status 0 disp b rc 0
...
00000080:00200000:6.0:1337981358.181809:0:54770:0:(namei.c:151:ll_iget()) got inode: ffff8803b184d3f8 &lt;span class=&quot;code-keyword&quot;&gt;for&lt;/span&gt; [0x2006f9298:0xe693:0x0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;4) When the thread &quot;54770&quot; was in the function &quot;ll_splice_alias()&quot; processing &quot;test_dir&quot;, another thread, &quot;54766&quot; (the failing thread described above), began to look up &quot;test_dir&quot; while parsing &quot;/p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018&quot;.&lt;/p&gt;

&lt;p&gt;5) There was a race between the thread &quot;54770&quot; and the thread &quot;54766&quot; on the &quot;test_dir&quot; dentry:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de)
{
        struct dentry *&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;;

        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (inode) {
               &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt; = ll_find_alias(inode, de);
               &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;) {
                      ll_dops_init(&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;, 1, 1);
(5.1)===&amp;gt;                      d_rehash(de);
(5.3)===&amp;gt;                      d_move(&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;, de);
                      iput(inode);
                      CDEBUG(D_DENTRY,
                             &lt;span class=&quot;code-quote&quot;&gt;&quot;Reuse dentry %p inode %p refc %d flags %#x\n&quot;&lt;/span&gt;,
                             &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;, &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;-&amp;gt;d_inode, d_refcount(&lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;), &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;-&amp;gt;d_flags);
                      &lt;span class=&quot;code-keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;code-keyword&quot;&gt;new&lt;/span&gt;;
               }
        }
...
}


&lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt; ll_dcompare(struct dentry *parent, struct qstr *d_name, struct qstr *name)
{
...
(5.2)===&amp;gt;        CDEBUG(D_DENTRY,&lt;span class=&quot;code-quote&quot;&gt;&quot;found name %.*s(%p) - flags %d/%x - refc %d\n&quot;&lt;/span&gt;,
               name-&amp;gt;len, name-&amp;gt;name, dchild,
               d_mountpoint(dchild), dchild-&amp;gt;d_flags &amp;amp; DCACHE_LUSTRE_INVALID,
               atomic_read(&amp;amp;dchild-&amp;gt;d_count));
...
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;5.1) The thread &quot;54770&quot; found the former dentry &quot;ffff8804302af900&quot; and tried to reuse it. But first it rehashed the negative dentry &quot;ffff8804245c3800&quot;, before the inode alias was established.&lt;/p&gt;

&lt;p&gt;5.2) The thread &quot;54766&quot; found the negative dentry &quot;ffff8804245c3800&quot; in &quot;__d_lookup()&quot;, which had just been rehashed by the thread &quot;54770&quot;. Since the negative dentry &quot;ffff8804245c3800&quot; was not marked &quot;DCACHE_LUSTRE_INVALID&quot;, the thread &quot;54766&quot; regarded it as a valid entry for &quot;test_dir&quot;.&lt;/p&gt;

&lt;p&gt;5.3) The thread &quot;54770&quot; called &quot;d_move()&quot; to establish the related inode alias. After that, the exported dentry for &quot;test_dir&quot; became the valid &quot;ffff8804302af900&quot;.&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;00000080:00010000:6.0:1337981358.181811:0:54770:0:(namei.c:410:ll_lookup_it_finish()) setting l_data to inode ffff8803b184d3f8 (144237863481960083/33582994)
...
00000080:00002000:2.0:1337981358.181812:0:54766:0:(dcache.c:103:ll_dcompare()) found name 37988(ffff8804302af540) - flags 0/0 - refc 6
00000080:00200000:2.0:1337981358.181812:0:54766:0:(dcache.c:359:ll_revalidate_it()) VFS Op:name=37988,intent=0
00000080:00200000:2.0:1337981358.181813:0:54766:0:(file.c:2465:ll_inode_permission()) VFS Op:inode=144237863481960082/33582994(ffff8803b184d838), inode mode 41c0 mask 1
00000080:00002000:2.0:1337981358.181814:0:54766:0:(dcache.c:103:ll_dcompare()) found name test_dir(ffff8804245c3800) - flags 0/0 - refc 1
00000080:00002000:6.0:1337981358.181815:0:54770:0:(namei.c:378:ll_splice_alias()) Reuse dentry ffff8804302af900 inode ffff8803b184d3f8 refc 26 flags 0x4000000
00000080:00010000:6.0:1337981358.181817:0:54770:0:(dcache.c:324:ll_lookup_finish_locks()) setting l_data to inode ffff8803b184d3f8 (144237863481960083/33582994)
00000080:00010000:6.0:1337981358.181818:0:54770:0:(dcache.c:233:ll_intent_drop_lock()) releasing lock with cookie 0x829fa7d835b2db86 from it ffff880431f49b98
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


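&lt;p&gt;The interleaving in 5.1)-5.3) can be modeled in miniature with a toy dcache. This is only an illustrative sketch, not the kernel code: the names here (Dentry, rehash, d_move, lookup) merely mimic their kernel counterparts.&lt;/p&gt;

```python
# Toy model of the ll_splice_alias() ordering bug described above.
# Hypothetical names; only the publish-before-initialize ordering is real.

class Dentry:
    def __init__(self, name):
        self.name = name
        self.inode = None      # negative until an inode alias is attached
        self.hashed = False    # visible to lookups once rehashed

dcache = {}

def rehash(d):
    """Make the dentry visible to concurrent lookups (models d_rehash)."""
    d.hashed = True
    dcache[d.name] = d

def d_move(d, inode):
    """Attach the inode alias, making the dentry positive (models d_move)."""
    d.inode = inode

def lookup(name):
    """A concurrent open(): returns ENOENT if it sees a hashed negative dentry."""
    d = dcache.get(name)
    if d is not None and d.hashed and d.inode is None:
        return "ENOENT"        # the transient failure reported in this ticket
    return "ok" if d is not None else "miss"

# Buggy order (thread 54770): rehash first, attach the inode second.
de = Dentry("test_dir")
rehash(de)                     # step 5.1: negative dentry becomes visible
race = lookup("test_dir")      # step 5.2: thread 54766 races in between
d_move(de, object())           # step 5.3: inode alias attached too late
print(race)                    # ENOENT

# Fixed order: attach the inode alias before making the dentry visible.
dcache.clear()
de2 = Dentry("test_dir")
d_move(de2, object())
rehash(de2)
print(lookup("test_dir"))      # ok
```

&lt;p&gt;With the buggy order the racing lookup returns ENOENT; once the inode alias is attached before the dentry is made visible, the window is gone, which is what dropping the early d_rehash() achieves.&lt;/p&gt;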
&lt;p&gt;So here, we should not make the negative dentry &quot;ffff8804245c3800&quot; visible to others in &quot;ll_splice_alias()&quot;: either remove the &quot;rehash&quot; or mark the dentry as &quot;DCACHE_LUSTRE_INVALID&quot;. More suitable processing is needed for that.&lt;/p&gt;</comment>
                            <comment id="39441" author="laisiyao" created="Sun, 27 May 2012 20:32:19 +0000"  >&lt;p&gt;Yong, you&apos;re right; I think it&apos;s better to remove d_rehash(). This looks to be a bug inherited from the old kernel&apos;s d_splice_alias() (that line is removed in the latest code).&lt;/p&gt;

&lt;p&gt;I&apos;ll commit a patch later.&lt;/p&gt;</comment>
                            <comment id="39444" author="laisiyao" created="Sun, 27 May 2012 21:57:59 +0000"  >&lt;p&gt;Patch is updated: &lt;a href=&quot;http://review.whamcloud.com/#change,2400&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,2400&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Please revert previous patch in your code base and patch against this one.&lt;/p&gt;</comment>
                            <comment id="39507" author="prakash" created="Tue, 29 May 2012 11:49:06 +0000"  >&lt;p&gt;Lai, was this issue introduced by one of the earlier versions of that patch? I ask because we have a vague recollection that we saw ENOENT issues prior to applying it, albeit less frequently than we currently do. So, if this specific case was introduced (and now fixed?) by &lt;a href=&quot;http://review.whamcloud.com/2400&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/2400&lt;/a&gt;, there still may be other issues lurking.&lt;/p&gt;

&lt;p&gt;Either way, I am going to try and get the new revision of the patch applied and tested today.&lt;/p&gt;

&lt;p&gt;And thanks for the detailed explanation Fan Yong! It&apos;s &lt;b&gt;very&lt;/b&gt; helpful.&lt;/p&gt;</comment>
                            <comment id="39556" author="laisiyao" created="Tue, 29 May 2012 21:29:23 +0000"  >&lt;p&gt;Prakash, yes, the ENOENT failure found this time was introduced by the earlier patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt;. It&apos;s best to enable the debug log trigger as well in your verification test, thanks.&lt;/p&gt;</comment>
                            <comment id="39586" author="prakash" created="Wed, 30 May 2012 13:15:09 +0000"  >&lt;p&gt;Ok, Thanks Lai. We should start testing with the new patch set in place today.&lt;/p&gt;</comment>
                            <comment id="39935" author="prakash" created="Mon, 4 Jun 2012 13:13:55 +0000"  >&lt;p&gt;Thanks for the fix, Lai. We haven&apos;t seen any ENOENT failures in the past few days of testing, with patch set 4 applied. This can be marked resolved.&lt;/p&gt;</comment>
                            <comment id="39943" author="pjones" created="Mon, 4 Jun 2012 14:54:41 +0000"  >&lt;p&gt;Thanks Prakash. We will track landing this code under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-506&quot; title=&quot;FC15  patchless client support.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-506&quot;&gt;&lt;del&gt;LU-506&lt;/del&gt;&lt;/a&gt; so I am closing this ticket as a duplicate of that. As to whether this fix will also address the instances you may have observed prior to applying the initially flawed &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1234&quot; title=&quot;Executing binary stored on Lustre results in &amp;quot; (deleted)&amp;quot; appended to /proc/self/exec&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1234&quot;&gt;&lt;del&gt;LU-1234&lt;/del&gt;&lt;/a&gt; patch, it may well do because the cache mechanism has been altered by this change.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="11348">LU-506</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="19782">LU-3579</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="11470" name="hype336-lu1397-1337981358181.llog.gz" size="5876817" author="prakash" created="Fri, 25 May 2012 17:53:48 +0000"/>
                            <attachment id="11442" name="ior-lustre_debug.diff" size="1085" author="laisiyao" created="Tue, 22 May 2012 23:58:05 +0000"/>
                            <attachment id="11469" name="open-v2.stp" size="2078" author="prakash" created="Fri, 25 May 2012 13:35:25 +0000"/>
                            <attachment id="11458" name="open.stp" size="925" author="prakash" created="Thu, 24 May 2012 11:14:02 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvgz3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6399</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>