  1. Lustre
  2. LU-15012

Unreplayed open leads to version mismatch

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • Lustre 2.16.0, Lustre 2.15.0

    Description

      00000002:00100000:1.0:1599093179.305780:0:16802:0:(mdc_request.c:911:mdc_close()) @@@ matched open  req@ffff9868388f9b00 x1676760733129664/t30064836004(30064836004) o101->lustre-MDT0001-mdc ffff986938cf2800@192.168.2.11@tcp:12/10 lens 760/840 e 0 to 0 dl 1599093229 ref 1 fl Complete:RP/4/ffffffff rc 0/-1 job:'cp.0'
      

      An uncommitted close removes the open from the replay list:

      00000100:00080000:0.0:1599093302.301396:0:12330:0:(import.c:86:import_set_state_nolock()) ffff9868364e1800 lustre-MDT0001_UUID: changing import state from CONNECTING to REPLAY
      00000100:00080000:0.0:1599093302.301439:0:12330:0:(import.c:1568:ptlrpc_import_recovery_state_machine()) replay requested by lustre-MDT0001_UUID
      00000100:00100000:0.0:1599093302.301441:0:12330:0:(client.c:2795:ptlrpc_free_committed()) lustre-MDT0001-mdc-ffff986938cf2800: committing for last_committed 30064836076 gen 1
      00000100:00100000:0.0:1599093302.301444:0:12330:0:(client.c:2821:ptlrpc_free_committed()) @@@ stopping search  req@ffff98683a193180 x1676760733148032/t30064836094(30064836094) o36->lustre-MDT0001-mdc-ffff986938cf2800@192.168.2.12@tcp:12/10 lens 488/456 e 0 to 0 dl 1599093230 ref 1 fl Complete:R/4/0 rc 0/0 job:'cp.0'
      00000100:00100000:0.0:1599093302.301455:0:12330:0:(client.c:2848:ptlrpc_free_committed()) @@@ free closed open request  req@ffff9868388f9b00 x1676760733129664/t30064836004(30064836004) o101->lustre-MDT0001-mdc-ffff986938cf2800@192.168.2.12@tcp:12/10 lens 760/840 e 0 to 0 dl 1599093229 ref 1 fl Complete:R/4/ffffffff rc 0/-1 job:'cp.0'
      00000100:00000040:0.0:1599093302.301464:0:12330:0:(client.c:2604:__ptlrpc_req_finished()) @@@ refcount now 1  req@ffff9868388f9b00 x1676760733129664/t30064836004(30064836004) o101->lustre-MDT0001-mdc-ffff986938cf2800@192.168.2.12@tcp:12/10 lens 760/840 e 0 to 0 dl 1599093229 ref 2 fl Complete:RM/4/ffffffff rc 0/-1 job:'cp.0'
      00000100:00080000:0.0:1599093302.301469:0:12330:0:(recover.c:88:ptlrpc_replay_next()) import ffff9868364e1800 from lustre-MDT0001_UUID committed 30064836076 last 0
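
The pruning visible in the log above can be modeled roughly as follows. This is a simplified Python sketch, not the Lustre C code: `Req`, `free_committed`, and the `replay` flag are hypothetical names standing in for the request, `ptlrpc_free_committed()`, and the flag that keeps a committed open available for replay. The close transno is an illustrative value, since the log does not show it.

```python
from dataclasses import dataclass


@dataclass
class Req:
    name: str
    transno: int
    replay: bool = False  # hypothetical flag: keep this request even after commit


def free_committed(replay_list, last_committed):
    """Return the requests that survive pruning, in transno order.

    Assumes replay_list is sorted by transno. Committed requests are
    dropped unless their replay flag is still set.
    """
    kept = []
    for i, req in enumerate(replay_list):
        if req.transno > last_committed:
            # Not yet committed: keep this and everything after it
            # (the "stopping search" point in the log).
            kept.extend(replay_list[i:])
            break
        if req.replay:
            kept.append(req)  # committed open that is still in use
        # else: committed and no longer needed, so it is freed
    return kept


LAST_COMMITTED = 30064836076                      # from the log
open_req = Req("open", 30064836004, replay=True)  # transno from the log
close_req = Req("close", 30064836090)             # illustrative, > last_committed

# While the file is open, the committed open stays available for replay.
before_close = free_committed([open_req], LAST_COMMITTED)

# Sending the close clears the flag on the matched open, even though
# the close itself has not been committed yet.
open_req.replay = False
after_close = free_committed([open_req, close_req], LAST_COMMITTED)

print([r.name for r in before_close])
print([r.name for r in after_close])
```

In this model the open is gone from the replay set while the close that released it is still uncommitted, so a server restart at this point loses the open state.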
      

      So an unlink from another client destroys the file (move to orphan):

      00000020:00000040:0.0:1599093308.803564:0:2607:0:(tgt_handler.c:579:tgt_handle_recovery()) @@@ Got new replay  req@ffff930c9b201200 x1676760723796928/t0(30064836133) o36->d148e573-bac7-d122-a32d-19499a53d6da@192.168.2.20@tcp:338/0 lens 488/0 e 0 to 0 dl 1599093358 ref 1 fl Complete:/4/ffffffff rc 0/-1 job:'rm.0'
      00000004:00080000:0.0:1599093308.803938:0:2607:0:(mdd_dir.c:1547:mdd_finish_unlink([0x240000bd3:0x5d27:0x0]  open count = 0 is dir 0
      

      and all other requests for the file fail version checking during replay:

      00000100:00000400:0.0:1599093308.966908:0:12330:0:(client.c:3045:ptlrpc_replay_interpret()) @@@ Version mismatch during replay  req@ffff98683fedd200 x1676760733164352/t30064836142(30064836142) o36->lustre-MDT0001-mdc-ffff986938cf2800@192.168.2.12@tcp:12/10 lens 544/440 e 0 to 0 dl 1599093359 ref 2 fl Interpret:R/4/0 rc -75/-75 job:'cp.0'
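
The version check that fails above can be pictured like this. It is a minimal sketch, assuming each replayed request carries the object version it saw before the original operation and the server compares that with the current version. `replay_check`, the `objects` map, and the version number are hypothetical. 75 is the Linux value of `EOVERFLOW`, matching the `rc -75` in the log.

```python
EOVERFLOW = 75  # Linux errno value; the log shows rc -75


def replay_check(objects, fid, pre_version):
    """Return 0 if the replayed request still applies, else -EOVERFLOW.

    objects maps FID -> current version; a destroyed object has no entry.
    Treating a missing object as a mismatch is an assumption of this model.
    """
    current = objects.get(fid)
    if current is None or current != pre_version:
        return -EOVERFLOW
    return 0


FID = "[0x240000bd3:0x5d27:0x0]"  # FID from the unlink log line
objects = {FID: 7}                # illustrative version number

ok = replay_check(objects, FID, pre_version=7)

# The unlink from the other client destroys the object first.
del objects[FID]
mismatch = replay_check(objects, FID, pre_version=7)

print(ok, mismatch)
```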
      

            People

              askulysh Andriy Skulysh