precreate should cleanup orphans upon error

Details

    Description

      Sometimes an OST object precreate succeeds, but the reply is lost or times out. In this case the MDT does not update its known last_id and simply retries. Subsequent precreates then fail because the last_id known on the MDT is less than that on the OST. Instead, upon each failure, the MDT should restart as it would on a reconnect, clean up orphans first, and then continue with subsequent precreates.
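The failure and the proposed handling can be illustrated with a toy model. This is only a sketch, not Lustre code: `ToyOST`, `mdt_precreate`, and `cleanup_orphans` are hypothetical names, and the rule that the OST rejects a request carrying a stale last_id is taken from the description above rather than from the real implementation.

```python
class ToyOST:
    """Hypothetical stand-in for an OST; tracks the highest object ID created."""

    def __init__(self):
        self.last_id = 0

    def precreate(self, mdt_last_id, count):
        # Assumption from the description: the request fails when the
        # MDT's view of last_id is behind the OST's.
        if mdt_last_id < self.last_id:
            raise RuntimeError("MDT last_id is behind OST last_id")
        self.last_id = mdt_last_id + count
        return self.last_id

    def cleanup_orphans(self, mdt_last_id):
        # Destroy the objects the MDT never learned about.
        destroyed = list(range(mdt_last_id + 1, self.last_id + 1))
        self.last_id = mdt_last_id
        return destroyed


def mdt_precreate(ost, mdt_last_id, count, reply_lost=False):
    """Return the MDT's new last_id; a lost reply leaves it unchanged."""
    new_last_id = ost.precreate(mdt_last_id, count)
    return mdt_last_id if reply_lost else new_last_id


ost = ToyOST()

# The OST creates objects 1..32, but the reply never reaches the MDT.
mdt_last_id = mdt_precreate(ost, 0, 32, reply_lost=True)

# A plain retry fails because the MDT still believes last_id is 0.
try:
    mdt_precreate(ost, mdt_last_id, 32)
    retry_failed = False
except RuntimeError:
    retry_failed = True

# Proposed handling: treat the failure like a reconnect, clean up
# orphans first, then continue precreating.
orphans = ost.cleanup_orphans(mdt_last_id)
mdt_last_id = mdt_precreate(ost, mdt_last_id, 32)
```

In this model the retry only succeeds once both sides agree on last_id again, which is the point of running orphan cleanup before the next precreate.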

          Activity

            [LU-15009] precreate should cleanup orphans upon error

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45930/
            Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 62d4e1d5f1be33abd3ee9f58cc09d471e22ee7ad

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45930
            Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: ac5034013c97f0e458252b7a337d3f184bd065ba

            pjones Peter Jones added a comment -

            Landed for 2.15

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44984/
            Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 1711e26ae861c28829870c2433caf7ee232909cf

            "Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44984
            Subject: LU-15009 ofd: continue precreate if LAST_ID is less on MDT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bb30ada9d8f086bc3c2f97450df0fbb68d35f746

            laisiyao Lai Siyao added a comment -

            The description is updated. The issue is that objects are created on the OST, but the MDT gets an error. IMO this is no different from a reconnect and can be handled the same way.

            adilger Andreas Dilger added a comment -

            In this case, the MDT should handle this by skipping OST objects below the LAST_ID value, rather than deleting OST objects back to the MDT last_id value. That would leave these objects available for recovery via LFSCK if the problem was on the MDT (e.g. it was restored from backup).
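The difference between the two policies in this comment can be sketched with a toy model. This is an illustration under stated assumptions, not Lustre code: `resync` and its parameters are hypothetical, and OST objects are modelled as a plain set of integer IDs.

```python
def resync(mdt_last_id, ost_objects, delete_orphans):
    """Toy model of the two recovery policies (hypothetical helper).

    ost_objects is the set of object IDs present on the OST.
    Returns (new MDT last_id, OST objects that survive).
    """
    ost_last_id = max(ost_objects, default=0)
    if mdt_last_id >= ost_last_id:
        # Nothing to reconcile: the MDT is not behind the OST.
        return mdt_last_id, set(ost_objects)
    if delete_orphans:
        # Delete OST objects back to the MDT's last_id.
        survivors = {oid for oid in ost_objects if oid <= mdt_last_id}
        return mdt_last_id, survivors
    # Skip ahead: adopt the OST's LAST_ID and keep every object.
    return ost_last_id, set(ost_objects)


ost_objects = set(range(1, 65))   # OST holds objects 1..64
deleted_view = resync(32, ost_objects, delete_orphans=True)
skipped_view = resync(32, ost_objects, delete_orphans=False)
```

With skipping, objects 33..64 remain on the OST, so a later consistency check could still find them if the MDT's last_id was the stale side.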

            People

              laisiyao Lai Siyao
              laisiyao Lai Siyao