  1. Lustre
  2. LU-3457

After power failure 1 OST failed to come up.

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • None
    • None
    • TOSS 2.0 Lustre 2.1, DDN SFA10k
    • 3
    • 8642

    Description

      We experienced a power failure last night when wind slapped two high-tension lines together. While we were restoring everything to working order, one OST (luckily only one) would not mount. When I ran fsck on the file system, it put basically EVERYTHING in lost+found and complained loudly about multiply-claimed blocks.

      When the fsck completed, I ran ll_recover_lost_found_objs, which restored all but ~84MB. I then recreated the CONFIGS directory in the root and moved the CONFIGS files back into it. I re-ran fsck until no more fixes were made, after which I was able to get the OST information via tunefs.lustre and mount the file system. In all, about a dozen inodes were affected, and I think all but maybe three were recovered. I think the unrecovered files were the health check and quota files. (After bringing the file system back online, lfs quota -u <file system> failed, saying quotas weren't enabled. I am in the process of running lfs quotacheck now.)
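
For reference, the sequence described above can be sketched as a dry-run shell function that only prints the commands in order and runs nothing. The function name plan_ost_recovery, the device path, and all mount points are hypothetical placeholders, not values from the affected system, and the exact options used during the actual recovery were not recorded, so the flags shown are assumptions.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands in order; it executes none of them.
# All arguments are placeholders supplied by the caller.

plan_ost_recovery() {
    dev="$1"          # OST block device (hypothetical, e.g. /dev/sdX)
    ldiskfs_mnt="$2"  # temporary mount point for the raw ldiskfs file system
    ost_mnt="$3"      # mount point used when the OST is mounted as Lustre
    client_mnt="$4"   # Lustre client mount point, used for quotacheck

    # 1. Force a full check and answer yes to every proposed fix (assumed flags).
    echo "e2fsck -fy $dev"
    # 2. Mount the backing file system directly to reach lost+found.
    echo "mount -t ldiskfs $dev $ldiskfs_mnt"
    # 3. Move OST objects from lost+found back under O/.
    echo "ll_recover_lost_found_objs -d $ldiskfs_mnt/lost+found"
    # 4. Recreate CONFIGS; the config files themselves are moved back by hand.
    echo "mkdir -p $ldiskfs_mnt/CONFIGS"
    echo "# move the recovered CONFIGS files from lost+found into CONFIGS by hand"
    # 5. Unmount and check again; repeat until e2fsck makes no more fixes.
    echo "umount $ldiskfs_mnt"
    echo "e2fsck -fy $dev"
    # 6. Confirm the target configuration is readable without writing anything.
    echo "tunefs.lustre --dryrun $dev"
    # 7. Bring the OST back and rebuild quota accounting from a client.
    echo "mount -t lustre $dev $ost_mnt"
    echo "lfs quotacheck -ug $client_mnt"
}
```

Calling plan_ost_recovery /dev/sdX /mnt/ldiskfs /mnt/ost /mnt/lustre prints the plan for review. Printing rather than executing is deliberate, since every step after the first fsck depends on what the previous one reported.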

      In any event, these fsck messages are ones that I've never seen before, and the most concerning is the "boot loader inode" message. As I said, after the first fsck, mounting the OST as ldiskfs showed only the lost+found directory. Since we are still in recovery mode, I wanted to find out whether there is anything I am missing, anything I should have done or should still do with this particular OST, or any concerns I should be on the lookout for before returning the file system to production. Unfortunately, this occurred on a system that prevents me from providing any logs (the output below is hand-typed).

      TIA

      ####
      There is 1 inodes containing multiply-claimed blocks

      File <The boot loader inode> (inode #5, mod time Mon Jun 10 13:32:09 2013)
      has 5632 multiply-claimed block(s) shared with one file.
      /O/0/d28/6574428 (inode #1624725, mod time Mon Jun 10 18:23:10 2013)
      Clone multiply-claimed blocks? yes

      clone_file_block: internal error: can't find dup_blk for 3357776535

      clone_file_block: internal error: can't find dup_blk for 3357776535

      File O/0/d28/6574428 (inode #1624725, mod time Mon Jun 10 18:23:10 2013)
      has 5632 multiply-claimed block(s) shared with one file.
      <The boot loader inode> (inode #5, mod time Mon Jun 10 13:32:09 2013)
      Multiply-claimed blocks already reassigned or cloned.
      ####

          People

            adilger Andreas Dilger
            jamervi Joe Mervini