Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • None
    • None
    • http://github.com/chaos/lustre, version 2.1.1-11chaos
    • 3
    • 6399

    Description

      I've been getting widespread reports that with 2.1 clients users are seeing random ENOENT errors on opens (and maybe stats?).

      Sometimes the file is written, closed, and reopened on the same client node. But the open will report that the file does not exist. A few minutes later the file is definitely there, so the problem is transitory.

      We have also had instances of this where the ENOENT occurs on a node other than where the file was created. One node will create, write, and close the file, and then another will attempt to open it only to get ENOENT.

      Here is an example failure from a simul test:

      09:04:12: Set iteration 4
      09:04:12: Running test #0(iter 0): open, shared mode.
      09:04:12:       Beginning setup
      09:04:12:       Finished setup          (0.001 sec)
      09:04:12:       Beginning test
      09:04:12: Process 177(hype338): FAILED in simul_open, open failed: No such file or directory
      

      There tend to not be any obvious messages in the console logs associated with these events.

      Attachments

        1. hype336-lu1397-1337981358181.llog.gz
          5.60 MB
        2. ior-lustre_debug.diff
          1 kB
        3. open.stp
          0.9 kB
        4. open-v2.stp
          2 kB

        Issue Links

          Activity

            [LU-1397] ENOENT on open()
            yong.fan nasf (Inactive) added a comment - - edited

            Lai, there are some clew in the logs from Parkash:

            =============
            00000080:00200000:2.0:1337981358.181813:0:54766:0:(file.c:2465:ll_inode_permission()) VFS Op:inode=144237863481960082/33582994(ffff8803b184d838), inode mode 41c0 mask 100000080:00002000:2.0:1337981358.181814:0:54766:0:(dcache.c:103:ll_dcompare()) found name test_dir(ffff8804245c3800) - flags 0/0 - refc 1
            =============

            The thread "54766" is the just the thread to lookup "/p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018". But when it tried to find the parent "test_dir", it got an invalid dentry "ffff8804245c3800", although it was not marked as "DCACHE_LUSTRE_INVALID". Because the valid "test_dir" should be "ffff8804302af900". So the the "d_inode" for such invalid dentry "ffff8804245c3800" should be NULL, then VFS path parse failed at:

            static int __link_path_walk(const char *name, struct nameidata *nd)
            {
            ...
                            /* This does the actual lookups.. */
                            err = do_lookup(nd, &this, &next);
                            if (err)
                                    break;
            
                            err = -ENOENT;
                            inode = next.dentry->d_inode;
                            if (!inode)
                                    goto out_dput;
            ...
            }
            

            That may be why Lustre did "NOT" report "ENOENT", but VFS did. We can follow this clew for further investigation.

            yong.fan nasf (Inactive) added a comment - - edited Lai, there are some clew in the logs from Parkash: ============= 00000080:00200000:2.0:1337981358.181813:0:54766:0:(file.c:2465:ll_inode_permission()) VFS Op:inode=144237863481960082/33582994(ffff8803b184d838), inode mode 41c0 mask 100000080:00002000:2.0:1337981358.181814:0:54766:0:(dcache.c:103:ll_dcompare()) found name test_dir(ffff8804245c3800) - flags 0/0 - refc 1 ============= The thread "54766" is the just the thread to lookup "/p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018". But when it tried to find the parent "test_dir", it got an invalid dentry "ffff8804245c3800", although it was not marked as "DCACHE_LUSTRE_INVALID". Because the valid "test_dir" should be "ffff8804302af900". So the the "d_inode" for such invalid dentry "ffff8804245c3800" should be NULL, then VFS path parse failed at: static int __link_path_walk( const char *name, struct nameidata *nd) { ... /* This does the actual lookups.. */ err = do_lookup(nd, & this , &next); if (err) break ; err = -ENOENT; inode = next.dentry->d_inode; if (!inode) goto out_dput; ... } That may be why Lustre did "NOT" report "ENOENT", but VFS did. We can follow this clew for further investigation.

            I talked to the user, and the job that triggered the error was on it's first iteration in which it was opening and creating new files to write to. 'ls'-ing the directory shows that the file is indeed missing. My current guess is that an error in the pathname lookup is the root cause.

            prakash Prakash Surya (Inactive) added a comment - I talked to the user, and the job that triggered the error was on it's first iteration in which it was opening and creating new files to write to. 'ls'-ing the directory shows that the file is indeed missing. My current guess is that an error in the pathname lookup is the root cause.

            Regarding my previous comment, here is the client lustre log file dumped by the stap script.

            prakash Prakash Surya (Inactive) added a comment - Regarding my previous comment, here is the client lustre log file dumped by the stap script.
            prakash Prakash Surya (Inactive) added a comment - - edited

            Lai, I think I got a hit with the v2 system tap script in place (I'm still waiting to verify with the user):

            May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): sys_open: rc = -2, filename = /p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018
            May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): Dumping llog to /tmp/lu1397-1337981358181.llog
            

            So, sys_open returned -ENOENT, but ll_file_open did not report an error. I assume the kernel returned the error without ever getting to the lustre code, but I can't say for sure as I don't understand the code path well enough.

            prakash Prakash Surya (Inactive) added a comment - - edited Lai, I think I got a hit with the v2 system tap script in place (I'm still waiting to verify with the user): May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): sys_open: rc = -2, filename = /p/lcrater2/swltest/SWL/Hype2/IO/37988/test_dir/miranda_io.out.10018 May 25 14:29:18 hype336 Lustre: LU-1397 (1337981358181): Dumping llog to /tmp/lu1397-1337981358181.llog So, sys_open returned -ENOENT, but ll_file_open did not report an error. I assume the kernel returned the error without ever getting to the lustre code, but I can't say for sure as I don't understand the code path well enough.

            In order to probe the path walk function, I need to find a way to limit it to only the lustre filesystem. The "name" variable is undefined when I probe it, so I can't simply strncmp it's prefix as I do in the sys_open case. ENOENT is much too common a case to handle without filtering it to only Lustre.

            prakash Prakash Surya (Inactive) added a comment - In order to probe the path walk function, I need to find a way to limit it to only the lustre filesystem. The "name" variable is undefined when I probe it, so I can't simply strncmp it's prefix as I do in the sys_open case. ENOENT is much too common a case to handle without filtering it to only Lustre.

            I fixed some issues with the previous script, and now have this running on the clients.

            It's probing sys_open, and ll_file_open. It will dump lustre client logs when those return with errors.

            It also adds rpctrace, vfstrace, dlmtrace, dentry, and inode debugging levels for lustre. Hopefully when we hit the issue with this installed, we'll be able to get some useful information out of the system.

            prakash Prakash Surya (Inactive) added a comment - I fixed some issues with the previous script, and now have this running on the clients. It's probing sys_open, and ll_file_open. It will dump lustre client logs when those return with errors. It also adds rpctrace, vfstrace, dlmtrace, dentry, and inode debugging levels for lustre. Hopefully when we hit the issue with this installed, we'll be able to get some useful information out of the system.
            laisiyao Lai Siyao added a comment -

            Please enable more debug trace: `lctl set_param debug=+"dlmtrace dentry inode"` on client if possible.

            laisiyao Lai Siyao added a comment - Please enable more debug trace: `lctl set_param debug=+"dlmtrace dentry inode"` on client if possible.

            I installed this system-tap script on some of the clients.

            prakash Prakash Surya (Inactive) added a comment - I installed this system-tap script on some of the clients.

            Yes, system-tap is definitely an option. I wrote a simple system-tap script and installed in on the clients late yesterday to probe ll_file_open and do_sys_open, but haven't gotten a hit yet. I'll upload the script for completeness.

            prakash Prakash Surya (Inactive) added a comment - Yes, system-tap is definitely an option. I wrote a simple system-tap script and installed in on the clients late yesterday to probe ll_file_open and do_sys_open, but haven't gotten a hit yet. I'll upload the script for completeness.
            laisiyao Lai Siyao added a comment -

            Prakash, I'm also suspicious that ENOENT may happen in directory pathname lookup, or some issue in CWD handling in lustre .

            Could system-tap run on your lustre client? If yes, I'll try to make a script to check __link_path_walk() return value, if it's -ENOENT, then it's time to trigger lustre debuglog dump.

            laisiyao Lai Siyao added a comment - Prakash, I'm also suspicious that ENOENT may happen in directory pathname lookup, or some issue in CWD handling in lustre . Could system-tap run on your lustre client? If yes, I'll try to make a script to check __link_path_walk() return value, if it's -ENOENT, then it's time to trigger lustre debuglog dump.

            Can somebody help me get my bearings in the lustre client code and help me determine the code path where ENOENT would be returned from ll_file_open (or any other related functions)?

            I'm curious if this error is actually coming from the kernel without ever getting to the lustre client (perhaps through an issue with the dcache?), or if it is indeed being returned by the lustre client.

            Lai, I'm running the a patched IOR (https://github.com/Prakash-Surya/ior/commits/LU-1397) although as Chris mentioned, it will probably take much time before it triggers on this specific application. A system level trigger would be preferable, although I'm not sure how to filter out the normal ENOENT cases from our specific buggy instance.

            prakash Prakash Surya (Inactive) added a comment - Can somebody help me get my bearings in the lustre client code and help me determine the code path where ENOENT would be returned from ll_file_open (or any other related functions)? I'm curious if this error is actually coming from the kernel without ever getting to the lustre client (perhaps through an issue with the dcache?), or if it is indeed being returned by the lustre client. Lai, I'm running the a patched IOR ( https://github.com/Prakash-Surya/ior/commits/LU-1397 ) although as Chris mentioned, it will probably take much time before it triggers on this specific application. A system level trigger would be preferable, although I'm not sure how to filter out the normal ENOENT cases from our specific buggy instance.

            People

              laisiyao Lai Siyao
              morrone Christopher Morrone (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: