Details

    • Bug
    • Resolution: Fixed
    • Minor
    • None
    • Lustre 1.8.8
    • RHEL 6.2
    • 3
    • 5933

    Description

      The end customer is Lund University, which has support for our hardware and Lustre through us. I have done some basic troubleshooting with them and had them try rm -f and unlink, but they cannot perform any operations on the files. I suggested they run a file system check, but they cannot take the filesystem down for that. They realize they may have hit a bug and that upgrading could fix the issue, but they would really like to find a way to remove the files right now without going through an upgrade yet. I'll attach the logs I have; below is the last email I have from the customer, which provides a good summary of the issue.

      -----From Customer-----

      The problematic files were sockets/pipes defined on a server not using
      Lustre, and rsynced into Lustre from that server. That went just fine.
      The next step we did was copying the files from one part of our Lustre
      fs to another, and thereby acquiring proper ACLs. This has worked fine
      for all normal files, directories and links, but these sockets have
      turned into something broken, that we can't remove.

      A little googling brought this up:

      http://jira.whamcloud.com/browse/LU-784

      As we're on 1.8.8, it seems very similar to our problem. What I'd like
      to know, is if there's someone from WhamCloud (or Intel, these days...)
      that can give us any hints on what we can do, other than upgrading to
      Lustre 2.X? I'm guessing there are ways to clear the faulty inodes
      directly from the MDS and/or OSTs, but I'd need some guidance for that.
      We'd really like to have this fixed before talking about a Lustre upgrade...

          Activity

            [LU-2518] Corrupted files

            bfaccini Bruno Faccini (Inactive) added a comment:

            OK, thanks, so closing as resolved!

            ludc-mbo Mattias Borell (Inactive) added a comment:

            Maybe I should have stated that more clearly in my last comment...

            I'm fine with closing this ticket - our problem has been solved!

            Thanks,

            /Mattias

            bfaccini Bruno Faccini (Inactive) added a comment:

            Mattias, Chris,
            Do you think we can close this ticket now?
            Thanks again, and in advance, for your help and answers.
            Best regards,
            Bruno.

            ludc-mbo Mattias Borell (Inactive) added a comment:

            Hi!

            OK, we've finally been able to execute the "live fix", and it seems to have worked just fine!

            Thanks a bundle - your response was quick, tested and accurate, couldn't really ask for more.

            /Mattias, now back at planning for expansion & upgrade of our Lustre setup... :->

            ludc-mbo Mattias Borell (Inactive) added a comment:

            Hi!

            We're still not done with our backups, but as soon as we're satisfied that we have it all on tape (I'm tempted to add more tape drives...) we'll try the live fix. Should happen later this week, methinks. I'll let you know how it works out.

            /Mattias, Lund University

            bfaccini Bruno Faccini (Inactive) added a comment:

            BTW, I forgot to confirm that build #11616 from patch set #5 does not show the problem anymore, in a full clients+servers 1.8.8 configuration.

            Any news or update from the site/NetApp side?

            People

              bfaccini Bruno Faccini (Inactive)
              chrislocke Chris Locke (Inactive)