Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-2640

deactivate OSD_EXEC_OP() operation accounting if operation is being undone

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.4.0
    • Lustre 2.4.0
    • 4
    • 6174

    Description

      In http://review.whamcloud.com/5046 there was a change to OSD_EXEC_OP() to address problems in MDD operations that fail part-way through, where the undo of earlier operations causes internal osd-ldiskfs declare/execute accounting to fail. For example, in mdd_create()::cleanup, the failed create calls __mdd_index_delete(), mdo_ref_del(), mdo_destroy() in order to clean up the newly created object.

      One proposal is to have some kind of OSD API call/flag/method to mark the transaction handle as being used for rollback, and to disable the ot_declare_op LASSERT() checking in OSD_EXEC_OP() for this case.

      Attachments

        Issue Links

          Activity

            [LU-2640] deactivate OSD_EXEC_OP() operation accounting if operation is being undone

            That depends on how "need" to be defined. From the view of transaction tracing and debugging, it is useful. We may implement such purposes through other ways, but the "thandle::th_rollback" is simple and clean. In fact, the "thandle::th_local" is also used for notification among layers, but not the functional requirement of ldiskfs or DUM. So...

            yong.fan nasf (Inactive) added a comment - That depends on how "need" to be defined. From the view of transaction tracing and debugging, it is useful. We may implement such purposes through other ways, but the "thandle::th_rollback" is simple and clean. In fact, the "thandle::th_local" is also used for notification among layers, but not the functional requirement of ldiskfs or DUM. So...

            the thing is that neither ldiskfs nor dmu need this..

            bzzz Alex Zhuravlev added a comment - the thing is that neither ldiskfs nor dmu need this..

            One idea about the "rollback" notification: introduce a new member thandle::th_rollback, which can be set by MDD and checked by OSD.

            How do you think?

            yong.fan nasf (Inactive) added a comment - One idea about the "rollback" notification: introduce a new member thandle::th_rollback, which can be set by MDD and checked by OSD. How do you think?

            if I remember DMU code correctly, it's fine to declare some range to be modified and then modify it few times - we should be passing internal sanity checks.
            the same applies for ldiskfs for sure.
            given so I think that we don't actually need to declare "rollback" updates as they are working on the same range/hash always.
            notice we probably don't need this mechanism with DMU as it does this internally (when debugging is enabled).

            bzzz Alex Zhuravlev added a comment - if I remember DMU code correctly, it's fine to declare some range to be modified and then modify it few times - we should be passing internal sanity checks. the same applies for ldiskfs for sure. given so I think that we don't actually need to declare "rollback" updates as they are working on the same range/hash always. notice we probably don't need this mechanism with DMU as it does this internally (when debugging is enabled).

            Alex, the 5138 patch was landed as-is to allow later LFSCK patch landings to move forward.

            Oleg wants a better comment explaining what the transaction declaration accounting is doing, so that it can be more easily understood by others.

            Other open questions that need to be discussed:

            • should we implement an OSD API to notify the transaction that there is a rollback in progress?
            • having more accurate transaction accounting is useful for both reducing the transaction credits, and improving the debugging. We do some limited aggregation of declarations for quotas, so that the common case of multiple updates for the same UID/GID do not need too many credits. We could change this to be a more generic mechanism (e.g. identified by FID?) that tracks a few different kinds of updates (e.g. setattr/dirty object, write/append/alloc/punch/truncate, insert/delete item).
            adilger Andreas Dilger added a comment - Alex, the 5138 patch was landed as-is to allow later LFSCK patch landings to move forward. Oleg wants a better comment explaining what the transaction declaration accounting is doing, so that it can be more easily understood by others. Other open questions that need to be discussed: should we implement an OSD API to notify the transaction that there is a rollback in progress? having more accurate transaction accounting is useful for both reducing the transaction credits, and improving the debugging. We do some limited aggregation of declarations for quotas, so that the common case of multiple updates for the same UID/GID do not need too many credits. We could change this to be a more generic mechanism (e.g. identified by FID?) that tracks a few different kinds of updates (e.g. setattr/dirty object, write/append/alloc/punch/truncate, insert/delete item).

            With patch set 4 test looks fine.

            aboyko Alexander Boyko added a comment - With patch set 4 test looks fine.

            Alexander, would you please to try this one? Thanks.

            http://review.whamcloud.com/#change,5138,set4

            yong.fan nasf (Inactive) added a comment - Alexander, would you please to try this one? Thanks. http://review.whamcloud.com/#change,5138,set4

            of course, it's better. though I was actually hoping for a breakthrough solution to improve checks and credits calculation.

            bzzz Alex Zhuravlev added a comment - of course, it's better. though I was actually hoping for a breakthrough solution to improve checks and credits calculation.

            Yes, it's hack. Then what do you prefer to resolve it?

            We can hack in OSD, and let osd_thandle::ot_declare_ops_rb[OSD_OT_REF_ADD] increase when osd_trans_exec_op(OSD_OT_CREATE) called, can you accept it?

            yong.fan nasf (Inactive) added a comment - Yes, it's hack. Then what do you prefer to resolve it? We can hack in OSD, and let osd_thandle::ot_declare_ops_rb [OSD_OT_REF_ADD] increase when osd_trans_exec_op(OSD_OT_CREATE) called, can you accept it?

            well, this is not quite correct. osd_object_destroy() does its own hack because of limits on nlink in ldiskfs. we should not bring this assumption into the API.

            bzzz Alex Zhuravlev added a comment - well, this is not quite correct. osd_object_destroy() does its own hack because of limits on nlink in ldiskfs. we should not bring this assumption into the API.

            Seems we do not need two ref_del() when rollback for mkdir, the osd_object_destroy() can handle the last nlink for dir object.

            Alex, how do you think?

            yong.fan nasf (Inactive) added a comment - Seems we do not need two ref_del() when rollback for mkdir, the osd_object_destroy() can handle the last nlink for dir object. Alex, how do you think?

            People

              yong.fan nasf (Inactive)
              adilger Andreas Dilger
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: