LU-6924: Remote regular files are missing after recovery

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Affects Version: Lustre 2.8.0
    • Fix Version: Lustre 2.8.0
    • Labels: None
    • Severity: 3

    Description

      During a 24-hour DNE failover test, I found this on one of the MDTs:

      LustreError: 2758:0:(client.c:2869:ptlrpc_replay_interpret()) @@@ status -110, old was 0  req@ffff880feb148cc0 x1507974808149044/t25771723485(25771723485) o1000->lustre-MDT0003-osp-MDT0001@192.168.2.128@o2ib:24/4 lens 248/16576 e 1 to 0 dl 1438129486 ref 2 fl Interpret:R/4/0 rc -110/-110
      Lustre: lustre-MDT0003-osp-MDT0001: Connection restored to lustre-MDT0003 (at 192.168.2.128@o2ib)
      LustreError: 3117:0:(mdt_open.c:1171:mdt_cross_open()) lustre-MDT0001: [0x240000406:0x167f1:0x0] doesn't exist!: rc = -14
      Lustre: DEBUG MARKER: ==== Checking the clients loads BEFORE failover -- failure NOT OK ELAPSED=27221 DURATION=86400 PERIOD=1800
      Lustre: DEBUG MARKER: Client load failed on node c05, rc=1
      

      Then on the client side, this caused dbench to fail:

         2      7136     0.00 MB/sec  execute 191 sec  latency 272510.369 ms
         2      7136     0.00 MB/sec  execute 192 sec  latency 273510.512 ms
         2      7136     0.00 MB/sec  execute 193 sec  latency 274510.637 ms
         2      7136     0.00 MB/sec  execute 194 sec  latency 275510.799 ms
         2      7136     0.00 MB/sec  execute 195 sec  latency 276510.916 ms
         2      7136     0.00 MB/sec  execute 196 sec  latency 277511.069 ms
         2      7136     0.00 MB/sec  execute 197 sec  latency 278511.229 ms
         2      7136     0.00 MB/sec  execute 198 sec  latency 279511.387 ms
         2      7330     0.00 MB/sec  execute 199 sec  latency 280182.929 ms
      [9431] open ./clients/client1/~dmtmp/EXCEL/RESULTS.XLS failed for handle 11887 (Bad address)
      (9432) ERROR: handle 11887 was not found
      Child failed with status 1
      

      Then the test fails.

          Activity

            [LU-6924] Remote regular files are missing after recovery

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15793/
            Subject: LU-6924 ptlrpc: replay bulk request
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0addfa9fa1d48cc9fa5eb05026848e55382f81a8


            wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15793
            Subject: LU-6924 ptlrpc: replay bulk request
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 722bbb86fa5479bdc16b62b43863eff39a61df56

            di.wang Di Wang added a comment -

            So the easiest fix might be in step 7: if it is a bulk replay and the server got the request but the bulk transfer timed out, we should still resend the replay request.
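
            A minimal user-space sketch of that check (illustrative only, not the actual patch; the struct and function names here are hypothetical, and -ETIMEDOUT stands in for the -110 seen in the logs):

            #include <errno.h>
            #include <stdbool.h>
            #include <stdio.h>

            /* Simplified stand-in for a ptlrpc request being replayed. */
            struct replay_req {
                    bool has_bulk;      /* the request carries a bulk descriptor */
                    int  reply_status;  /* status returned in the reply, 0 on success */
            };

            /*
             * Proposed rule: a bulk replay whose reply reports a timeout means the
             * server never received the bulk data, so the replay should be resent
             * instead of being treated as complete.
             */
            static bool should_resend_replay(const struct replay_req *req)
            {
                    return req->has_bulk && req->reply_status == -ETIMEDOUT;
            }

            int main(void)
            {
                    struct replay_req unlink_replay = {
                            .has_bulk     = true,
                            .reply_status = -ETIMEDOUT,   /* -110, as in the logs above */
                    };

                    printf("resend replay: %s\n",
                           should_resend_replay(&unlink_replay) ? "yes" : "no");
                    return 0;
            }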

            di.wang Di Wang added a comment - edited

            Hmm, I do not have enough debug logs to know exactly what happened, but it most likely went like this:

            1. MDS02 does a remote unlink: it destroys the local object, then deletes the remote name entry on MDS04.
            2. But MDS04 restarts at that moment; after it restarts, it waits for all clients to reconnect, then collects the debug logs.
            3. After MDS02 reconnects to MDS04, it sends the unlink replay to MDS04; MDS04 gets the unlink request and waits for the bulk transfer.
            4. At the same time, MDS02 evicts MDS04:

            Lustre: lustre-MDT0001: already connected client lustre-MDT0003-mdtlov_UUID (at 192.168.2.128@o2ib) with handle 0x2e45787e4dd12a1. Rejecting client with the same UUID trying to reconnect with handle 0xf0284dfd774c7787
            Lustre: lustre-MDT0001: haven't heard from client lustre-MDT0003-mdtlov_UUID (at 192.168.2.128@o2ib) in 228 seconds. I think it's dead, and I am evicting it. exp ffff880ffa181c00, cur 1438129440 expire 1438129290 last 1438129212
            

            5. MDS04 fails while waiting for the bulk transfer:

            LustreError: 2792:0:(ldlm_lib.c:3041:target_bulk_io()) @@@ network error on bulk WRITE  req@ffff880827864850 x1507974808149044/t0(25771723485) o1000->lustre-MDT0001-mdtlov_UUID@192.168.2.126@o2ib:219/0 lens 248/16608 e 1 to 0 dl 1438129504 ref 1 fl Complete:/4/0 rc 0/0
            

            6. MDS02 fails on this unlink replay:

            LustreError: 2758:0:(client.c:2869:ptlrpc_replay_interpret()) @@@ status -110, old was 0  req@ffff880feb148cc0 x1507974808149044/t25771723485(25771723485) o1000->lustre-MDT0003-osp-MDT0001@192.168.2.128@o2ib:24/4 lens 248/16576 e 1 to 0 dl 1438129486 ref 2 fl Interpret:R/4/0 rc -110/-110
            

            7. Because MDS02 already got a reply for this replay (note: this is a bulk replay), it will not replay this request again (see ptlrpc_replay_interpret()).
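
            To make the dead end in step 7 concrete, here is a small user-space model of the pre-fix assumption (illustrative only; the names are not taken from the Lustre source): once any reply arrives, the replay is treated as finished, even when the status shows the bulk transfer never made it, so the unlink is never re-sent and the remote name entry on MDS04 stays missing.

            #include <errno.h>
            #include <stdbool.h>
            #include <stdio.h>

            struct replay_req {
                    bool got_reply;     /* a reply arrived for the replayed RPC */
                    bool has_bulk;      /* the RPC also carried bulk data */
                    int  reply_status;  /* -ETIMEDOUT (-110) when the bulk timed out */
            };

            /* Pre-fix assumption: a received reply means the replay is done. */
            static bool replay_finished(const struct replay_req *req)
            {
                    return req->got_reply;
            }

            int main(void)
            {
                    struct replay_req unlink_replay = {
                            .got_reply = true, .has_bulk = true,
                            .reply_status = -ETIMEDOUT,
                    };

                    /* Prints "yes" although the server never got the bulk data. */
                    printf("replay considered finished: %s\n",
                           replay_finished(&unlink_replay) ? "yes" : "no");
                    return 0;
            }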


            People

              Assignee: Di Wang
              Reporter: Di Wang
              Votes: 0
              Watchers: 3
