Lustre / LU-4402

Ldiskfs errors ldiskfs_ext_find_extent, ldiskfs_ext_get_blocks, corruption

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 2.4.1
    • None
    • RHEL6.4/distro IB/2.6.32-358.18.1.el6
    • 2
    • 12084

    Description

      Starting with an otherwise healthy filesystem, we had an NFS issue on the management server that provides the nfsroot for the nodes. This caused the nodes to hang on shell probes, ssh, etc., but Lustre appeared to work okay until the mount came back, at which point there was a spew of I/O errors. We had -o errors=panic set, so the nodes rebooted, and we have a crash dump as well. A few of the interesting/disturbing messages are below, and a complete log capture of the interval is attached. We rebooted every one of our Lustre systems that mounted this nfsroot and started Lustre back up. Given the messages, an e2fsck seems prudent at this point. Please advise.

      For clarity's sake, this is a completely separate system from the one in LU-486, which I just updated.

      Dec 19 14:07:28 atlas-oss3b4 kernel: [1987655.565953] end_request: I/O error, dev dm-0, sector 2641342080
      ... more of those ...
      Dec 19 14:07:34 atlas-oss2f4 kernel: [1987662.210829] LDISKFS-fs error (device dm-8): ldiskfs_ext_find_extent: bad header/extent in inode #395571: invalid magic - magic 5fa6, entries 39658, max 42407(0), depth 37176(0)
      Dec 19 14:09:02 atlas-linkfarm kernel: [1987621.160868] LustreError: 13071:0:(tgt_lastrcvd.c:577:tgt_client_new()) linkfarm-MDT0000: Failed to write client lcd at idx 18888, rc -30
      Dec 19 14:09:11 atlas-mds1 kernel: [1355985.425764] LustreError: 15139:0:(osp_precreate.c:484:osp_precreate_send()) atlas1-OST0043-osc-MDT0000: can't precreate: rc = -30
      Dec 19 14:09:11 atlas-mds1 kernel: [1355985.439222] LustreError: 15139:0:(osp_precreate.c:484:osp_precreate_send()) Skipped 990 previous similar messages
      Dec 19 14:09:11 atlas-mds1 kernel: [1355985.451122] LustreError: 15139:0:(osp_precreate.c:989:osp_precreate_thread()) atlas1-OST0043-osc-MDT0000: cannot precreate objects: rc = -30
      Dec 19 14:09:11 atlas-mds1 kernel: [1355985.465640] LustreError: 15139:0:(osp_precreate.c:989:osp_precreate_thread()) Skipped 990 previous similar messages
      Dec 19 14:09:55 atlas-mds1 kernel: [1356029.474926] INFO: task mdt00_027:12952 blocked for more than 120 seconds.
      Dec 19 14:10:46 atlas-oss1d1 kernel: [691120.297183] LDISKFS-fs error (device dm-1): ldiskfs_ext_find_extent: bad header/extent in inode #395600: invalid magic - magic a7bc, entries 21131, max 744(0), depth 0(0)
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199569] Buffer I/O error on device dm-9, logical block 5638
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199571] lost page write due to I/O error on dm-9
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199578] LDISKFS-fs error (device dm-9): kmmpd:
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199582] Aborting journal on device dm-9-8.
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199586] Buffer I/O error on device dm-9, logical block 137
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199588] Error writing to MMP block
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199589] lost page write due to I/O error on dm-9
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199590]
      Dec 19 14:12:42 atlas-oss2d6 kernel: [1987967.199592] LDISKFS-fs (dm-9): Remounting filesystem read-only
      Dec 19 14:13:25 atlas-oss2f5 kernel: [1988014.344558] LDISKFS-fs error (device dm-6): ldiskfs_ext_find_extent: bad header/extent in inode #268955: invalid magic - magic e79d, entries 37634, max 32686(0), depth 47774(0)

      Dec 19 14:14:50 atlas-oss1a5 kernel: [691359.246757] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:14:53 atlas-oss1a5 kernel: [691362.375749] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:14:57 atlas-oss1a5 kernel: [691366.367144] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:02 atlas-oss1a5 kernel: [691371.356427] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:08 atlas-oss1a5 kernel: [691377.343445] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:15 atlas-oss1a5 kernel: [691384.328232] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:16 atlas-oss2e1 kernel: [1988136.371625] LDISKFS-fs error (device dm-3): ldiskfs_ext_find_extent: bad header/extent in inode #330052: invalid magic - magic 53be, entries 21067, max 517(0), depth 0(0)
      Dec 19 14:15:23 atlas-oss1a5 kernel: [691392.310962] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:32 atlas-oss1a5 kernel: [691401.291535] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)
      Dec 19 14:15:42 atlas-oss1a5 kernel: [691411.269872] LDISKFS-fs error (device dm-9): file system corruption: inode #591204 logical block 447 mapped to 137004702958273 (size 1)

      Dec 19 14:21:50 atlas-oss1d3 kernel: [691780.266369] LDISKFS-fs error (device dm-2): ldiskfs_ext_find_extent: bad header/extent in inode #199528: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)

      Dec 19 14:22:41 atlas-oss1d3 kernel: [691831.895277] LDISKFS-fs error (device dm-2): ldiskfs_ext_get_blocks: inode #199528: (comm ll_ost_io02_007) bad extent address iblock: 447, depth: 1 pblock 0
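
      To help scope an e2fsck run, the messages above can be summarized per host and device. The following is a minimal triage sketch, not something from the original report: the names `summarize`, `EXTENT_RE`, and `RC_RE` are hypothetical, and the regular expressions assume the syslog layout shown in the excerpts. It relies on two facts: the ext4/ldiskfs extent header magic is 0xf30a, so every logged magic (5fa6, a7bc, e79d, 53be, 0) is invalid, and rc -30 corresponds to EROFS (read-only filesystem) on Linux, which is consistent with the "Remounting filesystem read-only" message.

```python
import errno
import re
from collections import defaultdict

# ext4/ldiskfs extent header magic (EXT4_EXT_MAGIC in the kernel sources).
EXT_MAGIC = 0xF30A

# Assumed layout: "<date> <host> kernel: [<uptime>] LDISKFS-fs error (device X): ..."
EXTENT_RE = re.compile(
    r"(?P<host>\S+) kernel: \[\s*[\d.]+\] LDISKFS-fs error \(device (?P<dev>[^)]+)\): "
    r"ldiskfs_ext_find_extent: bad header/extent in inode #(?P<inode>\d+): "
    r"invalid magic - magic (?P<magic>[0-9a-f]+)"
)
# Matches both "rc -30" and "rc = -30" as they appear in LustreError lines.
RC_RE = re.compile(r"\brc(?: =)? (?P<rc>-\d+)")


def summarize(lines):
    """Group inodes with bad extent magic by (host, device) and count errno names."""
    bad_extents = defaultdict(set)   # (host, device) -> set of inode numbers
    return_codes = defaultdict(int)  # errno name -> number of occurrences
    for line in lines:
        m = EXTENT_RE.search(line)
        if m and int(m["magic"], 16) != EXT_MAGIC:
            bad_extents[(m["host"], m["dev"])].add(int(m["inode"]))
        for rc in RC_RE.finditer(line):
            code = -int(rc["rc"])
            return_codes[errno.errorcode.get(code, str(code))] += 1
    return bad_extents, return_codes


# Two lines copied from the excerpts above, used as sample input.
sample = [
    "Dec 19 14:07:34 atlas-oss2f4 kernel: [1987662.210829] LDISKFS-fs error "
    "(device dm-8): ldiskfs_ext_find_extent: bad header/extent in inode #395571: "
    "invalid magic - magic 5fa6, entries 39658, max 42407(0), depth 37176(0)",
    "Dec 19 14:09:11 atlas-mds1 kernel: [1355985.425764] LustreError: "
    "15139:0:(osp_precreate.c:484:osp_precreate_send()) "
    "atlas1-OST0043-osc-MDT0000: can't precreate: rc = -30",
]

bad, codes = summarize(sample)
for (host, dev), inodes in sorted(bad.items()):
    print(host, dev, sorted(inodes))
print(dict(codes))
```

      Run against the full attached log, the per-device inode lists indicate which OSTs reported extent corruption. This only summarizes what the kernel logged and is not a substitute for the e2fsck itself.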

          People

            bzzz Alex Zhuravlev
            blakecaldwell Blake Caldwell