File corrupt with 1MiB-aligned 4k regions of zeros

    Description

      A data integrity test run periodically by our storage group found two occurrences of corrupt files written to Lustre. The original files contain 300 MB of random data. The corrupt copies contain several 4096-byte regions of zeros aligned on 1MiB boundaries. The two corrupt files were written to the same filesystem from two different login nodes on the same cluster within five minutes of each other. The stripe count is 100.
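The corruption signature described above (a 4096-byte run of zeros starting on a 1MiB boundary) can be searched for directly. The following is a minimal sketch, not the storage group's actual test, which compares restored files against the originals. The function name `find_zero_regions` and its defaults are hypothetical. Because the original files hold random data, an all-zero 4 KiB region is very unlikely to be legitimate.

```python
import os

MIB = 1 << 20   # alignment of the corrupt regions reported in this ticket
REGION = 4096   # size of each zero-filled region reported in this ticket


def find_zero_regions(path, align=MIB, region=REGION):
    """Return offsets on `align` boundaries where `region` bytes are all zero."""
    zeros = bytes(region)
    hits = []
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        for offset in range(0, size, align):
            f.seek(offset)
            # A short read at end of file never equals `zeros`, so it is skipped.
            if f.read(region) == zeros:
                hits.append(offset)
    return hits
```

Calling `find_zero_regions("/path/to/restored_file")` returns a list of suspect byte offsets; an empty list means no region with this signature was found.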

      The client application is a parallel ftp client reading data out of our storage archive into Lustre. The test checks for differences between the restored files and the original copies. For a 300MB file it uses 4 threads, which issue four 64MB pwrite() calls and one 44MB pwrite() call. It is possible that the pwrite() gets restarted due to SIGUSR2 from a master process, though we don't know if this occurred in the corrupting case. This test has seen years of widespread use on all of our clusters, and this is the first reported incidence of this type of corruption, so we can characterize the frequency as rare.
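To make the write pattern concrete, here is a rough sketch of how a 300MB file splits into 64MB pieces, plus a positional write loop that continues after a short write. This is not the ftp client's code; `chunk_plan` and `pwrite_all` are hypothetical names, and "MB" is assumed to mean MiB. `os.pwrite` is the standard Python wrapper for pwrite() on Unix.

```python
import os

MB = 1 << 20  # assumption: the ticket's "MB" is treated as MiB


def chunk_plan(file_size, chunk=64 * MB):
    """Return (offset, length) pairs covering the file in `chunk`-sized writes."""
    return [(off, min(chunk, file_size - off))
            for off in range(0, file_size, chunk)]


def pwrite_all(fd, data, offset):
    """Write all of `data` at `offset`, continuing after any short write."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        n = os.pwrite(fd, view[written:], offset + written)
        if n == 0:
            raise OSError("pwrite wrote 0 bytes")
        written += n
    return written


# Four 64MB writes followed by one 44MB write, as in the description.
plan = chunk_plan(300 * MB)
```

A caller that treats a short or interrupted write as complete would leave a gap in the file, which is one reason a loop of this shape is commonly used; whether that is relevant to this corruption is not established in the ticket.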

      When I examine an OST object containing a corrupt region, I see there is no block allocated for the corrupt region (in this case, logical block 256 is missing).

      # pigs58 /root > debugfs -c -R "dump_extents /O/0/d$((30205348 % 32))/30205348" /dev/sdb
      debugfs 1.41.12 (17-May-2010)
      /dev/sdb: catastrophic mode - not reading inode or group bitmaps
      Level Entries       Logical              Physical Length Flags
       0/ 0   1/  3     0 -   255 813140480 - 813140735    256
       0/ 0   2/  3   257 -   511 813142528 - 813142782    255
       0/ 0   3/  3   512 -   767 813143040 - 813143295    256
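
The gap in the extent list above can be found mechanically. The sketch below copies the logical ranges by hand from the `dump_extents` output and reports any logical blocks not covered by an extent. The 4096-byte block size is an assumption, consistent with the 4096-byte corrupt regions; `find_holes` is a hypothetical helper, and it does not detect holes past the last extent.

```python
BLOCK_SIZE = 4096  # assumption: filesystem block size on the OST

# (first_logical_block, last_logical_block), from the dump_extents output above.
EXTENTS = [(0, 255), (257, 511), (512, 767)]


def find_holes(extents):
    """Return (first, last) logical block ranges not covered by any extent."""
    holes = []
    expected = 0
    for start, end in sorted(extents):
        if start > expected:
            holes.append((expected, start - 1))
        expected = max(expected, end + 1)
    return holes


for first, last in find_holes(EXTENTS):
    print(f"hole: logical blocks {first}-{last}, "
          f"object byte offset {first * BLOCK_SIZE}")
# prints: hole: logical blocks 256-256, object byte offset 1048576

# The object directory in the debugfs command is the object id modulo 32.
object_dir = 30205348 % 32  # evaluates to 4, i.e. /O/0/d4/30205348
```

Block 256 at 4096 bytes per block is byte offset 1048576 within the object, exactly 1MiB, which matches the 1MiB alignment of the zeroed regions.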
      

      Finally, the following server-side console messages appeared at the same time one of the corrupted files was written, and mention the NID of the implicated client. The consoles of the OSTs containing the corrupt objects were quiet at the time.

      May 17 01:06:08 pigs-mds1 kernel: LustreError: 20418:0:(mdt_recovery.c:1011:mdt_steal_ack_locks()) Resent req xid 1402165306385077 has mismatched opc: new 101 old 0
      May 17 01:06:08 pigs-mds1 kernel: Lustre: 20418:0:(mdt_recovery.c:1022:mdt_steal_ack_locks()) Stealing 1 locks from rs ffff880410f62000 x1402165306385077.t125822723745 o0 NID 192.168.114.155@o2ib5
      May 17 01:06:08 pigs-mds1 kernel: Lustre: All locks stolen from rs ffff880410f62000 x1402165306385077.t125822723745 o0 NID 192.168.114.155@o2ib5
      

          Activity

            [LU-1442] File corrupt with 1MiB-aligned 4k regions of zeros

            jaylan Jay Lan (Inactive) added a comment - Thanks, Jinshan!

            jay Jinshan Xiong (Inactive) added a comment - Hi Jay Lan, LU-1717 was just created to address/understand this MDS console error message.

            jay Jinshan Xiong (Inactive) added a comment - This message may not introduce any problem, probably just too aggressive console output. I think it can be ignored if you didn't see any real problem.

            jaylan Jay Lan (Inactive) added a comment - Hi Jinshan, but our console log messages look the same as the server console log messages shown in the Description section of the ticket. Are you suggesting that other problems can produce the same log messages, and thus that the messages alone are not sufficient evidence? Please advise. Thanks!

            jay Jinshan Xiong (Inactive) added a comment - I'm going to set this issue as fixed otherwise we can't release 2.3. Please reopen this issue if it occurs again.

            jay Jinshan Xiong (Inactive) added a comment - Hi Jay Lan, these error messages are not related: the data corruption happened on the OST, whereas the messages indicate something wrong with the MDT.

            jaylan Jay Lan (Inactive) added a comment - Are the messages below evidence of data corruption? We have quite a number of these on our 2.1.2 mds:

            LustreError: 4869:0:(mdt_recovery.c:1011:mdt_steal_ack_locks()) Resent req xid 1407363869661366 has mismatched opc: new 101 old 0
            Lustre: 4869:0:(mdt_recovery.c:1022:mdt_steal_ack_locks()) Stealing 1 locks from rs ffff8802a0930000 x1407363869661366.t210874408121 o0 NID 10.151.26.25@o2ib
            Lustre: 4265:0:(service.c:1865:ptlrpc_handle_rs()) All locks stolen from rs ffff8802a0930000 x1407363869661366.t210874408121 o0 NID 10.151.26.25@o2ib

            morrone Christopher Morrone (Inactive) added a comment - Oh, in other words, I don't know if it fixes the problem because we don't know how to reproduce it in our test environment. But it hasn't caused any new problems that I know of.

            morrone Christopher Morrone (Inactive) added a comment - It's been on our test systems and hasn't caused any problems that I am aware of. It is not installed in production yet. It might make it into a production release in a couple of weeks. We've seen the LU-1680 failures on the orion branch, and just recently pulled this LU-1442 patch into there. We'll keep an eye out for failures on orion when we upgrade to that version.

            jay Jinshan Xiong (Inactive) added a comment - Hi Chris Gearing, sorry, I meant to say Christopher Morrone, because LLNL is verifying whether the patch fixes the data corruption problem. This problem occurs rarely, so it may take months to verify.

            chris Chris Gearing (Inactive) added a comment - I don't know. Any pushes to gerrit that have rebased on this patch will have been run with this patch. I cannot know who has rebased and pushed.

            People

              jay Jinshan Xiong (Inactive)
              nedbass Ned Bass (Inactive)