Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-4701 LFSCK phase II technical debts
  3. LU-3469

OSP dt_sync() operation should flush pending destroys and other updates

Details

    • Technical task
    • Resolution: Fixed
    • Blocker
    • Lustre 2.6.0
    • Lustre 2.6.0
    • None
    • 8679

    Description

      lod_sync() should be changed to allow flushing the pending OST object destroys so that LFSCK can ensure that the OST state is consistent after pass 1 is completed:

      • call local/child OSD dt_sync() first, so that unlink name commit callbacks will be run and schedule all of the OST object destroys
      • OSP dt_sync() will wait until all of the pending object destroys are at least sent to the OSTs (accessing their respective OST objects and removing them from the orphan list). They do not necessarily need to be committed.

      Attachments

        Issue Links

          Activity

            [LU-3469] OSP dt_sync() operation should flush pending destroys and other updates
            pjones Peter Jones added a comment -

            Landed for 2.6

            pjones Peter Jones added a comment - Landed for 2.6

            Assigned to Fan Yong as he has updated the patch

            jlevi Jodi Levi (Inactive) added a comment - Assigned to Fan Yong as he has updated the patch

            Generally, the patch looks good, but we still cannot land the patch because of Maloo test failure.

            yong.fan nasf (Inactive) added a comment - Generally, the patch looks good, but we still cannot land the patch because of Maloo test failure.

            Alex, Fan Yong, what is needed next for this patch? Are further changes needed in the LFSCK code to start using this new functionality?

            adilger Andreas Dilger added a comment - Alex, Fan Yong, what is needed next for this patch? Are further changes needed in the LFSCK code to start using this new functionality?

            http://review.whamcloud.com/10046 - this patch implements osp_sync() which is synchronous. give the discussion above I tend to think it'd be better to call into all OSPs/OSDs directly from LFSCK. in the long term, something like an empty (or almost empty async transactions with commit callbacks registered would serve better being concurrent.

            bzzz Alex Zhuravlev added a comment - http://review.whamcloud.com/10046 - this patch implements osp_sync() which is synchronous. give the discussion above I tend to think it'd be better to call into all OSPs/OSDs directly from LFSCK. in the long term, something like an empty (or almost empty async transactions with commit callbacks registered would serve better being concurrent.

            If osp_sync() is failed to flush all pending destroy to OSTs, and if the LFSCK goes ahead for orphan handling, then the LFSCK may found some OST-object)(s) which parent MDT-object have been destroyed on the MDT(s) by unlink, but the LFSCK cannot be aware of the race unlink, so it will re-create the MDT-object(s), that is unexpected.

            One possible solution is that, if some osp_sync() failed, then skip orphan handling on related OST(s).

            yong.fan nasf (Inactive) added a comment - If osp_sync() is failed to flush all pending destroy to OSTs, and if the LFSCK goes ahead for orphan handling, then the LFSCK may found some OST-object)(s) which parent MDT-object have been destroyed on the MDT(s) by unlink, but the LFSCK cannot be aware of the race unlink, so it will re-create the MDT-object(s), that is unexpected. One possible solution is that, if some osp_sync() failed, then skip orphan handling on related OST(s).

            also, at the moment I'm not sure just a single dt_sync() is good enough. failed osp_sync() isn't any different from failed transaction? if we can't complete one phase on a specific OST we shouldn't continue with that OST in this LFSCK run? IOW, we'd prefer to track per-OST status rather than combined one which doesn't let us learn which OSTs are safe to proceed?

            bzzz Alex Zhuravlev added a comment - also, at the moment I'm not sure just a single dt_sync() is good enough. failed osp_sync() isn't any different from failed transaction? if we can't complete one phase on a specific OST we shouldn't continue with that OST in this LFSCK run? IOW, we'd prefer to track per-OST status rather than combined one which doesn't let us learn which OSTs are safe to proceed?

            like said before, osp_sync() is empty yet, I've been working on this.

            bzzz Alex Zhuravlev added a comment - like said before, osp_sync() is empty yet, I've been working on this.

            Alex, Fan Yong, is there any reason not to just convert the previous comment into a patch? Is there more work to be done?

            adilger Andreas Dilger added a comment - Alex, Fan Yong, is there any reason not to just convert the previous comment into a patch? Is there more work to be done?

            Alex, the current LFSCK does not call it since osp_sync() is not implemented. But it is easy to add it as the following patch:

            diff --git a/lustre/lfsck/lfsck_layout.c b/lustre/lfsck/lfsck_layout.c
            index de96726..7be2346 100644
            --- a/lustre/lfsck/lfsck_layout.c
            +++ b/lustre/lfsck/lfsck_layout.c
            @@ -3435,6 +3435,9 @@ static int lfsck_layout_assistant(void *args)
                                            com->lc_time_last_checkpoint +
                                            cfs_time_seconds(LFSCK_CHECKPOINT_INTERVAL);
             
            +                       /* flush all async updating before handling orphan. */
            +                       dt_sync(env, lfsck->li_next);
            +
                                    while (llmd->llmd_in_double_scan) {
                                            struct lfsck_tgt_descs  *ltds =
                                                                    &lfsck->li_ost_descs;
            
            yong.fan nasf (Inactive) added a comment - Alex, the current LFSCK does not call it since osp_sync() is not implemented. But it is easy to add it as the following patch: diff --git a/lustre/lfsck/lfsck_layout.c b/lustre/lfsck/lfsck_layout.c index de96726..7be2346 100644 --- a/lustre/lfsck/lfsck_layout.c +++ b/lustre/lfsck/lfsck_layout.c @@ -3435,6 +3435,9 @@ static int lfsck_layout_assistant(void *args) com->lc_time_last_checkpoint + cfs_time_seconds(LFSCK_CHECKPOINT_INTERVAL); + /* flush all async updating before handling orphan. */ + dt_sync(env, lfsck->li_next); + while (llmd->llmd_in_double_scan) { struct lfsck_tgt_descs *ltds = &lfsck->li_ost_descs;

            People

              yong.fan nasf (Inactive)
              adilger Andreas Dilger
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: