Lustre / LU-16431

Close request is dropped during replay

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

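      Reproduced with replay-dual test 26. During recovery, drop_duplicate_replay_req() removes the replayed close request (opc 35) as a duplicate and then logs "wrong opc 35":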

      00010000:00080000:0.0:1647814116.268086:0:14276:0:(ldlm_lib.c:2020:check_for_next_transno()) waking for next (8589940272)
      00010000:00000001:0.0:1647814116.268087:0:14276:0:(ldlm_lib.c:2380:get_next_transno()) Process entered
      00000020:00080000:0.0:1647814116.268087:0:14276:0:(update_recovery.c:605:distribute_txn_get_next_transno()) lustre-MDT0000: Next update transno 8589940272
      00010000:00000001:0.0:1647814116.268088:0:14276:0:(ldlm_lib.c:2397:get_next_transno()) Process leaving (rc=8589940272 : 8589940272 : 200001630)
      00000020:00000001:0.0:1647814116.268091:0:14276:0:(update_recovery.c:1313:distribute_txn_replay_handle()) Process entered
      00000020:00080000:0.0:1647814116.268098:0:14276:0:(update_records.c:74:update_records_dump()) master transno = 8589940272 batchid = 4294967706 flags = 0 ops = 4 params = 3
      00000020:00080000:0.0:1647814116.268101:0:14276:0:(update_records.c:93:update_records_dump()) update 0th [0x200000403:0x3:0x0] attr_set params_count = 1
      00000020:00080000:0.0:1647814116.268103:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
      00000020:00080000:0.0:1647814116.268106:0:14276:0:(update_records.c:93:update_records_dump()) update 1th [0x200000400:0xa:0x0] attr_set params_count = 1
      00000020:00080000:0.0:1647814116.268108:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
      00000020:00080000:0.0:1647814116.268109:0:14276:0:(update_records.c:93:update_records_dump()) update 2th [0x240000401:0xa:0x0] attr_set params_count = 1
      00000020:00080000:0.0:1647814116.268111:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
      00000020:00080000:0.0:1647814116.268113:0:14276:0:(update_records.c:93:update_records_dump()) update 3th [0x200000001:0x15:0x0] write params_count = 2
      00000020:00080000:0.0:1647814116.268115:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b5762 0th off = 1 size = 32
      00000020:00080000:0.0:1647814116.268116:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b578a 1th off = 2 size = 8
      (update_recovery.c:716:update_is_committed()) Update of [0x200000403:0x3:0x0]on MDT0 is not committed
      
      00010000:00080000:0.0:1647814116.269438:0:14276:0:(ldlm_lib.c:2016:check_for_next_transno()) waking for duplicate req (8589940272)
      00010000:00000001:0.0:1647814116.269439:0:14276:0:00010000:00080000:0.0:1647814116.269450:0:14276:0:(ldlm_lib.c:2418:drop_duplicate_replay_req()) @@@ remove t8589940272 from 192.168.101.19@tcp because of duplicate update records are found.
        req@ffffa00af4688050 x1727858180388096/t0(8589940272) o35->c9ceeeb0-66fe-f7a3-7273-2f5ddf46ff11@192.168.101.19@tcp:197/0 lens 392/0 e 0 to 0 dl 1647814122 ref 1 fl Complete:/4/ffffffff rc 0/-1 job:'dbench.0'
      00010000:00020000:0.0:1647814116.269456:0:14276:0:(ldlm_lib.c:2432:drop_duplicate_replay_req()) @@@ wrong opc 35 from 192.168.101.19@tcp
        req@ffffa00af4688050 x1727858180388096/t0(8589940272) o35->c9ceeeb0-66fe-f7a3-7273-2f5ddf46ff11@192.168.101.19@tcp:197/0 lens 392/0 e 0 to 0 dl 1647814122 ref 1 fl Complete:/4/ffffffff rc 0/-1 job:'dbench.0'
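
      To pick the affected requests out of a larger debug log, a small filter along these lines may help. This is only a sketch, not part of the fix: it assumes the standard Lustre debug log field layout (subsystem:mask:cpu:timestamp:stack:pid:extpid:(file:line:function()) message), and the helper names parse_line and wrong_opcodes are hypothetical. The value 35 for MDS_CLOSE is taken from the opcode list in lustre_idl.h as we understand it.

      ```python
      import re

      # Assumed field layout of a Lustre debug log line:
      # subsystem:mask:cpu:sec.usec:stack:pid:extpid:(file:line:func()) message
      LINE_RE = re.compile(
          r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):"
          r"(?P<cpu>[\d.]+):(?P<ts>\d+\.\d+):(?P<stack>\d+):"
          r"(?P<pid>\d+):(?P<extpid>\d+):"
          r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
      )

      MDS_CLOSE = 35  # assumed opcode value, per lustre_idl.h


      def parse_line(line):
          """Return the fields of one debug log line as a dict, or None
          if the line does not follow the assumed layout (for example a
          truncated or fused line)."""
          m = LINE_RE.match(line.strip())
          return m.groupdict() if m else None


      def wrong_opcodes(lines):
          """Yield (timestamp, opcode) for every 'wrong opc N' message."""
          for line in lines:
              rec = parse_line(line)
              if rec is None:
                  continue
              m = re.search(r"wrong opc (\d+)", rec["msg"])
              if m:
                  yield rec["ts"], int(m.group(1))
      ```

      Feeding the log above through wrong_opcodes() and comparing each opcode against MDS_CLOSE would flag the dropped close request. Lines that do not match the layout, such as the continuation lines starting with req@, are skipped.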
      

          People

            askulysh Andriy Skulysh