Kernel: LustreError: 191208:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 82782 of inode ffff8803a8fd06b8 failed -28

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.5.5
    • TOSS 2.4-7
      lustre-2.5.5-3chaos_2.6.32_573.18.1.1chaos.ch5.4.x86_64.x86_64
      ZFS 0.6.5.4-1.ch5.4.x86_64
    • 3
    • 9223372036854775807

    Description

      On April 6, 2016, we noticed one particular file system getting many of the following errors, which corresponded with user job failures:

      Kernel: LustreError: 191208:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 82782 of inode ffff8803a8fd06b8 failed -28
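      For reference, the trailing -28 in the message is a negated errno value; on Linux, errno 28 is ENOSPC ("No space left on device"). The sketch below pulls the code out of the log line and looks it up. The helper name `decode_failure` is hypothetical and the regular expression assumes the message ends in "failed -N", as it does in the line above.

```python
import errno
import os
import re

# The log line quoted in this ticket.
LOG_LINE = ("Kernel: LustreError: 191208:0:(vvp_io.c:1086:vvp_io_commit_write()) "
            "Write page 82782 of inode ffff8803a8fd06b8 failed -28")


def decode_failure(line):
    """Return (errno_name, description) for a trailing 'failed -N', or None.

    Hypothetical helper for illustration; not part of Lustre.
    """
    match = re.search(r"failed (-\d+)\s*$", line)
    if match is None:
        return None
    code = -int(match.group(1))  # kernel return codes are negated errno values
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, text = decode_failure(LOG_LINE)
print(name, "-", text)
```

      On a Linux host the name resolves to ENOSPC, which matches the near-full OSTs described in this ticket.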
      

      'lfs df' showed the filesystem was only 76% full; however, it also showed that 32 of the 80 OSTs were 89%-90% full. Since deactivating those 32 "near-full" OSTs on 4/6, we haven't seen the problem on that file system.
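      A minimal sketch of how the per-OST imbalance could be spotted from 'lfs df' output, since the filesystem-wide percentage hides it. The sample text is made up (a hypothetical two-OST file system named fs1, laid out in the usual 'lfs df' columns), and both the 85% threshold and the function name `near_full_osts` are assumptions chosen for illustration.

```python
import re

# Hypothetical excerpt in the usual `lfs df` column layout; all values are made up.
SAMPLE = """\
UUID                   1K-blocks        Used   Available Use% Mounted on
fs1-MDT0000_UUID       100000000    10000000    90000000  10% /mnt/fs1[MDT:0]
fs1-OST0000_UUID      1000000000   900000000   100000000  90% /mnt/fs1[OST:0]
fs1-OST0001_UUID      1000000000   620000000   380000000  62% /mnt/fs1[OST:1]

filesystem_summary:   2000000000  1520000000   480000000  76% /mnt/fs1
"""

# Match only OST rows: UUID, three block counts, then the Use% column.
OST_ROW = re.compile(r"^(\S+OST[0-9a-fA-F]+_UUID)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)%")


def near_full_osts(lfs_df_text, threshold_pct=85):
    """Return (ost_uuid, use_pct) for every OST at or above threshold_pct."""
    hits = []
    for line in lfs_df_text.splitlines():
        match = OST_ROW.match(line)
        if match and int(match.group(5)) >= threshold_pct:
            hits.append((match.group(1), int(match.group(5))))
    return hits


print(near_full_osts(SAMPLE))
```

      In the sample, the summary line reads 76% while one OST sits at 90%, the same pattern reported above.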

      However, we are now seeing the issue on another file system where OSTs are ~90% full.

          Activity

            [LU-8007] Kernel: LustreError: 191208:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 82782 of inode ffff8803a8fd06b8 failed -28
            ofaaland Olaf Faaland added a comment -

            Do you still intend to merge https://review.whamcloud.com/#/c/22567/ to b2_8_fe?  Looks like it got the reviews you asked for.

            morrone Christopher Morrone (Inactive) added a comment -

            What about the second LU-2049 patch?

            utopiabound Nathaniel Clark added a comment -

            The patches for LU-2049 touch a lot of ofd code, but the code was heavily refactored in the run up to 2.6 (at a minimum would need LU-3467), and this is JUST the ofd directory. This would be a very complex port, and I would deem it very risky.

            pjones Peter Jones added a comment -

            Nathaniel

            Could you please look into the feasibility of porting the two mentioned patches to the 2.5 FE branch?

            Thanks

            Peter

            charr Cameron Harr added a comment -

            Looks very similar to what SNL was seeing in LU-7510.

            People

              utopiabound Nathaniel Clark
              charr Cameron Harr
              Votes: 0
              Watchers: 6