[LU-5939] Error: trying to overwrite bigger transno

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.7.0
    • Environment: OpenSFS cluster running lustre-master tag 2.6.90 build #2745 with one MDS/MDT, three OSSs with two OSTs each and three clients.
    • Severity: 3
    • Rank (Obsolete): 16583

    Description

      I've been running sanity-hsm test 90 several times on this cluster, and nearly every time I run the test I see the following in dmesg on the MDS:

      Lustre: DEBUG MARKER: == sanity-hsm test 90: Archive/restore a file list == 15:39:24 (1416440364)
      Lustre: HSM agent bb8c2497-7403-4909-0e46-6614668e8ed7 already registered
      LustreError: 26047:0:(mdt_coordinator.c:957:mdt_hsm_cdt_start()) scratch-MDT0000: Coordinator already started
      LustreError: 19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818612, new: 25769818611 replay: 0. see LU-617.
      LustreError: 19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) Skipped 5 previous similar messages
      Lustre: DEBUG MARKER: == sanity-hsm test complete, duration 37 sec == 15:39:50 (1416440390)
      

      From the kernel logs, I see:

      ...
      00000001:00020000:9.0:1416440377.839622:0:19956:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818612, new: 25769818611 replay: 0. see LU-617.
      ...
      00000001:00080000:8.0:1416440377.869378:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
      ...
      00000001:00080000:8.0:1416440377.869423:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
      ...
      00000001:00080000:8.0:1416440377.869508:0:30331:0:(tgt_lastrcvd.c:1231:tgt_txn_stop_cb()) More than one transaction 25769818612
      ...
      00000100:00100000:8.0:1416440377.869685:0:30331:0:(service.c:2116:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_002:bb8c2497-7403-4909-0e46-6614668e8ed7+713:21533:x1485210712561904:12345-192.168.2.111@o2ib:57 Request procesed in 30116us (30167us total) trans 25769818612 rc 0/0
      

      Similarly for other transaction numbers:

      00000001:00020000:0.0:1416440378.133498:0:19955:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818617, new: 25769818614 replay: 0. see LU-617.
      

      and

      00000001:00020000:1.0F:1416440378.133518:0:31313:0:(tgt_lastrcvd.c:806:tgt_last_rcvd_update()) scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818619, new: 25769818618 replay: 0. see LU-617.
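
A minimal sketch for pulling the "on-disk" and "new" transaction numbers out of messages like the ones above, in case someone wants to tally them across a longer log. The regular expression and the function name are my own assumptions based only on the message text quoted in this ticket; the sample strings are the message portions copied from the three log lines above.

```python
import re

# Matches the "on-disk"/"new" pair in the tgt_last_rcvd_update() message.
# The pattern is inferred from the lines quoted in this ticket only.
TRANSNO_RE = re.compile(
    r"trying to overwrite bigger transno:\s*on-disk:\s*(\d+),\s*new:\s*(\d+)"
)


def parse_transno_warning(line):
    """Return (on_disk, new) as ints, or None if the line does not match."""
    m = TRANSNO_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


# Message portions copied from the log excerpts in the description.
LINES = [
    "scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818612, new: 25769818611 replay: 0. see LU-617.",
    "scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818617, new: 25769818614 replay: 0. see LU-617.",
    "scratch-MDT0000: trying to overwrite bigger transno:on-disk: 25769818619, new: 25769818618 replay: 0. see LU-617.",
]

pairs = [parse_transno_warning(line) for line in LINES]
# How far behind the on-disk value each rejected transno was.
gaps = [on_disk - new for on_disk, new in pairs]
```

In all three quoted messages the new transno is lower than the one already on disk, which is the condition the message reports.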
      

      Before running sanity-hsm test 90, the copytool was started on the agent, c11.

          Activity


            tappro Mikhail Pershin added a comment

            I am not sure this is about llog records only. llog_cat_add() causes a local transaction, which produces no transaction number, so there must be another update, maybe to the file attributes or something like that. I can give more details about the HSM request type and the operations behind the multiple transnos later today. Meanwhile, llog_cat_add() should be replaced with llog_add() in any case.

            As for putting everything into a single transaction, we still have another way to go: use the same mechanism that OUT uses to control a batch of updates. This will cause a compatibility problem, but maybe it is not so difficult to solve. I mean we shouldn't rule this option out completely and should review it too. This belongs in the context of LU-6223, though.

            adegremont Aurelien Degremont (Inactive) added a comment

            The problem is that we must declare the number of credits needed for the transaction in advance, so we also need to update the credit declaration.
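
To illustrate the point about declaring credits in advance, here is a toy model of a declare-then-start transaction. It is only a sketch: `ToyTransaction`, `add_records` and the one-credit-per-record cost are hypothetical names and numbers for illustration and do not correspond to the real OSD or llog interfaces.

```python
class ToyTransaction:
    """Toy declare-then-start transaction; not a Lustre API."""

    def __init__(self):
        self.declared = 0     # credits reserved during the declare phase
        self.used = 0         # credits consumed by writes
        self.started = False

    def declare(self, credits):
        # Declarations are only accepted before the transaction starts.
        if self.started:
            raise RuntimeError("cannot declare after start")
        self.declared += credits

    def start(self):
        self.started = True

    def write(self, credits=1):
        if not self.started:
            raise RuntimeError("transaction not started")
        if self.used + credits > self.declared:
            raise RuntimeError("write exceeds declared credits")
        self.used += credits


def add_records(records, credits_per_record=1):
    """Declare credits for every record up front, then write them all."""
    tx = ToyTransaction()
    for _ in records:
        tx.declare(credits_per_record)
    tx.start()
    for _ in records:
        tx.write(credits_per_record)
    return tx
```

The point of the model is that adding more records to one transaction also means declaring more credits before it starts; a write that was not declared fails.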

            jay Jinshan Xiong (Inactive) added a comment

            On second thought, we don't even need to add a parameter to llog_cat_add(). We just need to call the llog_add() series of interfaces instead, just as we do for changelog.
            jay Jinshan Xiong (Inactive) added a comment - edited

            Exactly. llog_cat_add() can be revised to carry a transaction handle parameter, so that we can start a transaction in mdt_hsm_add_actions() and use it for all later llog operations.

            The only concern is the size of the transaction. I remember that there is a limit on it, but I'm not an OSD expert. If that is the case, we also need to take log file creation into account when sizing the transaction.

            adegremont Aurelien Degremont (Inactive) added a comment

            IIRC, HSM_REQUEST stores the list of requests to be done in a llog. One RPC can request the same action (archive, restore, ...) for a list of files. One llog record is added for each file (all with the same compound_id, so that the request can be rebuilt later).

            Records are added using llog_cat_add(). If we want only one transaction, we need a special version that can add several records in one call, and mdt_hsm_add_actions() must be updated accordingly.
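
The difference between one transaction per record and one transaction per request can be shown with a toy model. This is a sketch under stated assumptions: `ToyTarget` and its methods are hypothetical, the transno counter simply starts at 1, and nothing here reflects how llog_cat_add() or mdt_hsm_add_actions() are actually implemented.

```python
import itertools


class ToyTarget:
    """Toy target that hands out one transno per committed transaction."""

    def __init__(self, first_transno=1):
        self._next = itertools.count(first_transno)
        self.log = []  # (transno, record) pairs in write order

    def _commit(self, records):
        # Every commit consumes exactly one transno, whatever its size.
        transno = next(self._next)
        self.log.extend((transno, r) for r in records)
        return transno

    def add_one_by_one(self, records):
        """One transaction per record: the request gets many transnos."""
        return [self._commit([r]) for r in records]

    def add_batched(self, records):
        """All records in one transaction: the request gets one transno."""
        return [self._commit(list(records))]


target = ToyTarget()
files = ["file-a", "file-b", "file-c"]
separate = target.add_one_by_one(files)  # three transnos for one request
batched = target.add_batched(files)      # a single transno for one request
```

With the batched variant a request for N files is associated with a single transno, which is the property the "special version" described above would provide.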

            tappro Mikhail Pershin added a comment

            Yes, I agree, that would be better.

            jay Jinshan Xiong (Inactive) added a comment

            Hmm, I'm not comfortable with HSM requests being able to create multiple transactions, because I'm afraid it may cause problems down the road. OUT can have multiple transactions for one RPC because it was carefully designed for this, but that can't be applied to HSM. I'd like an alternative fix that limits an HSM request to only one transaction per RPC.
            tappro Mikhail Pershin added a comment - edited

            Note that the patch above just makes HSM requests replayable. It contains no tests, because I found that HSM actions can't recover for reasons not related to this particular patch. I've created LU-6223 for HSM recovery testing.

            So this particular patch should solve just the "overwrite bigger transno" message.

            gerrit Gerrit Updater added a comment

            Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/13684
            Subject: LU-5939 hsm: make HSM modification requests replayable
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 726362d38cb95233acabacb3fe98ed484f0bff4e

            tappro Mikhail Pershin added a comment

            Thanks for the help. I had identified the first two as well, but not MDS_HSM_REQUEST. Is it a specific action that updates data on disk, or does it do that on any request?

            adegremont Aurelien Degremont (Inactive) added a comment

            On my side, I've identified 3 RPCs which can update data on the server:

            • MDS_HSM_PROGRESS
            • MDS_HSM_STATE_SET
            • MDS_HSM_REQUEST

            Serializing those requests can slow the request ingestion rate, but I think that is acceptable.

            If you're working on a patch, please also add the MUTABOR flag to the MDS_HSM_REQUEST RPC.
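
A rough sketch of what serializing the three modifying RPCs could look like, and why it costs ingestion rate: every modifying request takes the same lock, so they are handled one at a time. The opcode names come from the list above; `ToyDispatcher`, the `MUTATING` set and `READ_ONLY_OP` are hypothetical and only stand in for the real request handling.

```python
import threading

# Opcode names are taken from the comment above; the set itself is a
# hypothetical stand-in for a per-handler "modifies data" flag.
MUTATING = {"MDS_HSM_PROGRESS", "MDS_HSM_STATE_SET", "MDS_HSM_REQUEST"}


class ToyDispatcher:
    """Toy dispatcher that serializes requests which modify server data."""

    def __init__(self):
        self._lock = threading.Lock()
        self.transno = 0
        self.order = []  # transnos in the order they were assigned

    def handle(self, opcode):
        if opcode not in MUTATING:
            return None  # non-modifying requests get no transno
        # Only one modifying request runs at a time, so transnos are
        # assigned and recorded in strictly increasing order.
        with self._lock:
            self.transno += 1
            self.order.append(self.transno)
            return self.transno


dispatcher = ToyDispatcher()
threads = [
    threading.Thread(target=dispatcher.handle, args=("MDS_HSM_REQUEST",))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because assignment and recording happen under the same lock, the recorded order never goes backwards, at the price of handling modifying requests one by one.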

            People

              tappro Mikhail Pershin
              jamesanunez James Nunez (Inactive)