Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 1.8.8
    • Environment: RHEL 6.2
    • Severity: 3
    • Rank (Obsolete): 5933

    Description

      The end customer is Lund University, which has support for our hardware and Lustre through us. I have done some basic troubleshooting with them: I had them try rm -f and unlink, but they cannot perform any operations on the files. I suggested they run a filesystem check, but they cannot take the filesystem down for that. They realize they may have hit a bug and that upgrading could fix the issue, but they would really like a way to remove the files right now without going through that yet. I'll attach the logs I have, and below you can also find the last email I have from the customer, which provides a good summary of the issue.

      -----From Customer-----

      The problematic files were sockets/pipes defined on a server not using
      Lustre, and rsynced into Lustre from that server. That went just fine.
      The next step was copying the files from one part of our Lustre
      fs to another, thereby acquiring proper ACLs. This has worked fine
      for all normal files, directories and links, but these sockets have
      turned into something broken that we can't remove.

      A little googling brought this up:

      http://jira.whamcloud.com/browse/LU-784

      As we're on 1.8.8, it seems very similar to our problem. What I'd like
      to know is whether someone from WhamCloud (or Intel, these days...)
      can give us any hints on what we can do, other than upgrading to
      Lustre 2.X. I'm guessing there are ways to clear the faulty inodes
      directly from the MDS and/or OSTs, but I'd need some guidance for that.
      We'd really like to have this fixed before talking about a Lustre upgrade...
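
      A hypothetical client-side session illustrating the symptom (the paths are made up; the error string is the one the "file" command reports on the affected entries):

          $ file /lustre/proj/run/app.sock
          /lustre/proj/run/app.sock: ERROR: cannot open /lustre/proj/run/app.sock (Operation not supported)
          $ rm -f /lustre/proj/run/app.sock     # refused as well
          $ unlink /lustre/proj/run/app.sock    # refused as well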

    Attachments

    Issue Links

    Activity

    [LU-2518] Corrupted files

            OK, thanks, so closing as resolved!

            bfaccini Bruno Faccini (Inactive) added a comment

            Maybe I should have stated that more clearly in my last comment...

            I'm fine with closing this ticket - our problem has been solved!

            Thanks,

            /Mattias

            ludc-mbo Mattias Borell (Inactive) added a comment

            Mattias, Chris,
            Do you think we can close this ticket now?
            Thanks again, and in advance, for your help and answers.
            Best Regards.
            Bruno.

            bfaccini Bruno Faccini (Inactive) added a comment

            Hi!

            OK, we've finally been able to execute the "live fix", and it seems to have worked just fine!

            Thanks a bundle - your response was quick, tested and accurate, couldn't really ask for more.

            /Mattias, now back at planning for expansion & upgrade of our Lustre setup... :->

            ludc-mbo Mattias Borell (Inactive) added a comment

            Hi!

            We're still not done with our backups, but as soon as we're satisfied that we have it all on tape (I'm tempted to add more tape drives...) we'll try the live fix. Should happen later this week, methinks. I'll let you know how it works out.

            /Mattias, Lund University

            ludc-mbo Mattias Borell (Inactive) added a comment

            BTW, I forgot to confirm that build #11616 from patch set #5 no longer shows the problem, in a full Clients+Servers 1.8.8 configuration.

            Any news/updates from the site/NetApp side?

            bfaccini Bruno Faccini (Inactive) added a comment

            Hello Mattias,

            To be complete on item 3): even though Zhenyu (who developed the patch) confirms the issue is on the Server/MDS side only, and that no regression should occur between Servers running the next 1.8 version (including the patch) and Clients left on 1.8.7, this would be an untested environment, so there is still the possibility of a hidden issue.

            Before joining the WhamCloud/Intel team I worked for Bull, mainly at a customer site, and each time it was decided to go this (unbalanced) way, because their Lustre data was mission-critical, we ran our own regression test suite for hours before putting this configuration under a production workload.

            But frankly, I don't think the problem you currently face here justifies the work and risk we are discussing.

            Hope this helps and clarifies.
            Best Regards.
            Bruno.

            bfaccini Bruno Faccini (Inactive) added a comment

            Hi Bruno!

            This is the customer speaking...

            1) OK, good to know.

            2) The work-around seems clearer now. I'll have to discuss with my colleagues if/when we can try this. We had a backup crisis recently (we had to classify all used tapes as rubbish due to mechanical problems) and are slowly resyncing 140-150 TB to tape. We'll probably wait at least until that's finished...

            3) OK, so you'd recommend that all machines (MDS, ODS, clients) be patched/updated to the same level? If so, we'll probably wait until we can go to 2.X sometime during spring 2013. Is step 2) necessary before an upgrade to 2.X?

            Cheers,

            /Mattias, Lund University

            ludc-mbo Mattias Borell (Inactive) added a comment

            Hello Chris,

            Your answers are what I expected, so we can safely presume that no new named pipes/sockets will be created.

            About the customer's questions, here are my answers:

            1) the currently affected files will not cause any problem other than the known error each time they are accessed. In fact, the inode is not corrupted; it is just misinterpreted by the MDS layer.

            2) the work-around I found to fix the problem is not what you would call a "standard" procedure for fixing inodes on the MDS side, particularly if you run it live, in parallel with the filesystem being mounted and used. So if you/users can live with such files, just wait for scheduled filesystem down-time to apply it. That said, the exact procedure/commands can be described as follows (see the consolidated sketch after this comment):

            _ on the running/primary MDS, mount the MDT device with ldiskfs: "mkdir </mount-point> ; mount -t ldiskfs <MDT-device> </mount-point>".

            _ then, for each affected named pipe/socket, run "setfacl -b <mount-point>/ROOT/<relative-path-to-file>" (where <relative-path-to-file> is the path to the file relative to the filesystem root, as seen from the Lustre/Clients point of view) to reset all its ACLs to the default set. If you don't know the exact list/paths of the affected files, you may be able to find candidates by running "find <LustreFS-mountpoint> -type p -print" and "find <LustreFS-mountpoint> -type s -print" from any Client, then trying to access their content with a command like "file" and checking whether you get the "ERROR: cannot open <path/file> (Operation not supported)" error.

            3) as I wrote for 1), the issue is on the MDS side, so I presume (Zhenyu, correct me if I am wrong; my understanding is that only the MDS side is wrong and has to be fixed here...) an upgrade of the MDS/Servers should be OK for this particular problem. But, as usual, a partial upgrade should be done only under very specific circumstances and cannot be a scenario fully tested or validated on our side.

            Is this detailed and clear enough for you to report to the customer?
            Best regards, and don't hesitate to ask more/again.
            Bruno.

            bfaccini Bruno Faccini (Inactive) added a comment
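
            A consolidated sketch of the work-around described above (the mount point, device name, and file paths are placeholders; adapt them to your setup, and prefer scheduled down-time as noted):

                # On any Lustre client: list named pipes and sockets, then
                # probe each with file(1); affected entries report
                # "ERROR: cannot open ... (Operation not supported)".
                find /lustre -type p -print
                find /lustre -type s -print
                file /lustre/some/dir/broken.sock

                # On the running/primary MDS: mount the MDT device as ldiskfs.
                mkdir -p /mnt/mdt-ldiskfs
                mount -t ldiskfs /dev/mdt-device /mnt/mdt-ldiskfs

                # For each affected file, reset all its ACLs to the default
                # set; the path under ROOT/ mirrors the path clients see.
                setfacl -b /mnt/mdt-ldiskfs/ROOT/some/dir/broken.sock

                # Clean up once every affected file has been handled.
                umount /mnt/mdt-ldiskfs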

            Got another response back, here it is:

            > Last, are named pipes/sockets in common use by your customer?

            Nope, not on the Lustre FS (so far).

            > Do you think new ones could be created to run applications in place?

            Probably not, but as we're not sure what SW-packages we'll have to install for our bioinformatics people in the future, we can't rule it out.

            > Or were they just old leftovers carried over by the rsync?

            In this case, yes. A "full" rsync copy from another server was done to Lustre, and then when parts of it were moved into another directory on Lustre, these were accidentally included.

            > Are you and the customer aware that rsync has options to enable/disable transfer of these kinds of files?

            We do now... Mostly we've been after the "copy everything, keep all info" approach when using rsync, so we tend to use rsync -a.
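
            For reference, rsync -a implies -D (i.e. --devices --specials), which is what carried the sockets and pipes over; a sketch of a copy that keeps the rest of the -a semantics but skips them (paths are placeholders):

                # -a implies -D (--devices --specials); turning specials and
                # devices off skips sockets and named pipes while still
                # copying regular files, directories, links, and metadata.
                rsync -a --no-specials --no-devices /data/src/ /lustre/dest/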

            As we have a planned and promised upgrade of Lustre this spring (date
            not decided yet) to a stable 2.X, and can't foresee any immediate need
            for more named pipes/sockets in our Lustre FS, we're not sure if we need
            to patch our 1.8 setup, but we'd like to know:

            1) Are these "untouchable" files in any way harmful to the FS? Could
            there be inode allocation problems, or anything like that? On a system
            level, we guess that we could just avoid them during backup and hence
            more or less forget about them until our upgrade.

            2) The process described to unset the ACLs on the specific files isn't
            clear enough for me to just jump in and fix it. Could it be specified in
            a better step-by-step version? Should I mount the MDT on the active or
            the non-active MDS? Is mounting the FS on an MDS a standard procedure
            for debugging/fixing? I'm not sure if our contract allows us to ask for
            hands-on (remote console) help in solving a task such as this, but it
            would be nice to know.

            As you might have guessed by now, we're a bit cautious about breaking
            the FS right now - it's been a long and repetitive process setting it
            up, with backup problems as an added bonus.

            3) I stated earlier that we're on version 1.8.8 of Lustre, which turns
            out to be only partly true - our clients are, but the MDS & ODS-machines
            are still on 1.8.7. If we decide that patching is needed after all,
            would the patch apply to 1.8.7? Should it be applied to all Lustre
            machines (MDS, ODS, clients...) or a specific set?

            Cheers,

            /Mattias

            chrislocke Chris Locke (Inactive) added a comment

    People

      Assignee: bfaccini Bruno Faccini (Inactive)
      Reporter: chrislocke Chris Locke (Inactive)
      Votes: 0
      Watchers: 7

    Dates

      Created:
      Updated:
      Resolved: