Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.7.0
    • Lustre 2.5.1, Lustre 2.7.0
    • 3
    • 15386

    Description

      The buggy code is in ptlrpc_connect_interpret():

        finish:
            rc = ptlrpc_import_recovery_state_machine(imp);
            ...
            (the import connection flags are set here)

      When the import reaches the FULL state, ptlrpc_import_recovery_state_machine() wakes up all waiters on the import and all delayed requests, which are then resent. A request can therefore be sent before the connection flags are updated, that is, while AT is still disabled. The server may then drop the resent request if it is already processing the original, and send an early reply to the client based on the first incarnation of the request. The client receives an early reply for a request that was sent without AT, becomes confused, touches the buffer outside the reply, and fails with EPROTO.

          Activity

            [LU-5528] Race - connect vs resend

            jlevi Jodi Levi (Inactive) added a comment - Patch landed to Master. b2_5 patch tracked externally to land.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/11723/
            Subject: LU-5528 ptlrpc: fix race between connect vs resend
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8645c6f7e95b81dedbc5d47a9ab76947343ed05e

            aboyko Alexander Boyko added a comment - for master http://review.whamcloud.com/11723

            aboyko Alexander Boyko added a comment - We hit this issue during testing:

            [12485.898910] Lustre: 10447:0:(ldlm_lib.c:1004:target_handle_connect()) lustre-MDT0000: connection from 588e19fc-8e99-b13a-cf77-3d993fb6631e@0@lo t0 exp ffff880127570048 cur 1409434074 last 1409434073
            [12485.903011] Lustre: 10447:0:(ldlm_lib.c:1004:target_handle_connect()) Skipped 2 previous similar messages
            [12485.905135] Lustre: lustre-MDT0000-mdc-ffff88008d3700c8: Connection restored to lustre-MDT0000 (at 0@lo)
            [12485.905809] LustreError: 10440:0:(layout.c:1687:__req_capsule_get()) @@@ Wrong buffer for field `mdt_body' (1 of 1) in format `MDS_REINT_SETATTR': 0 vs. 216 (server)
            [12485.905810]   req@ffff8800bb670a20 x1477898561126671/t0(0) o36->lustre-MDT0000-mdc-ffff88008d3700c8@0@lo:12/10 lens 456/192 e 0 to 0 dl 1409434081 ref 1 fl Complete:R/2/0 rc 0/0
            [12485.905827] LustreError: 10440:0:(llite_lib.c:1224:ll_md_setattr()) md_setattr fails: rc = -71

            aboyko Alexander Boyko added a comment (edited) -

            Xyratex: MRP-2034
            patch http://review.whamcloud.com/11540 for b2_5

            People

              liwei Li Wei (Inactive)
              aboyko Alexander Boyko