  1. Lustre
  2. LU-5730

intermittent I/O errors for some directories

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • None
    • Lustre 2.5.2
    • None
    • Environment: Lustre 2.5.2 on RHEL6 servers and clients, NFS exported, ACLs
    • 3
    • 16090

    Description

      Our users have reported an issue where they suddenly have problems editing a file in a directory; they also get I/O errors, for example when trying to read the ACLs for that directory. In at least one instance the problem resolved itself overnight, after we had decided to investigate in more detail later. In another case today, the problem went away when we renamed the problematic directory.

      In the most recent instance today we had some time to try to understand the issue, and this is what we have found so far. While the problem persists, some clients see an I/O error when calling getfacl on the directory, while other clients run the same command without problems and return the expected results. Some machines access this directory over NFS, exported from one of our clients that was showing the problem in this instance; those machines had the same issues. Attempting to edit a file in the problematic directory with vim produced the message that the .swp file already exists, even for new files. Creating new files in the directory, for example with touch, worked without problems.
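
The per-client checks described above could be scripted so that results from several clients can be compared side by side. The following is only a minimal sketch under stated assumptions: the helper name `probe_directory` and the choice of checks are illustrative and not part of the original report. It assumes that reading the `system.posix_acl_access` extended attribute is a reasonable stand-in for what getfacl does on Linux, and that an exclusive create roughly resembles how an editor creates a swap file.

```python
import errno
import os
import uuid


def _attempt(action):
    """Run action() and return 'ok' or the symbolic errno name (e.g. 'EIO')."""
    try:
        action()
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def _exclusive_create(directory):
    # Create a uniquely named file with O_EXCL, then remove it again.
    # An EEXIST here would be suspicious, since the name is freshly generated.
    path = os.path.join(directory, ".probe-" + uuid.uuid4().hex)
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    os.unlink(path)


def probe_directory(directory):
    """Hypothetical helper: exercise the operations that failed in the report."""
    results = {
        "stat": _attempt(lambda: os.stat(directory)),
        "listdir": _attempt(lambda: os.listdir(directory)),
        "excl_create": _attempt(lambda: _exclusive_create(directory)),
    }
    if hasattr(os, "getxattr"):  # os.getxattr is only available on Linux
        # ENODATA (or ENOTSUP) means no ACL is set or supported; EIO is the
        # symptom described in this report.
        results["acl_xattr"] = _attempt(
            lambda: os.getxattr(directory, "system.posix_acl_access")
        )
    return results
```

Calling `probe_directory()` on the affected directory from a client that shows the problem, from one that does not, and from an NFS client would give one small dictionary per machine, which makes it easier to see which operation fails where.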

      There are no error messages recorded by syslog on any of the machines involved.

      We've mostly run out of ideas for what to look for next to resolve this if it happens again.

            People

              Assignee: Lai Siyao (laisiyao)
              Reporter: Frederik Ferner (ferner, inactive)
              Votes: 0
              Watchers: 6