Lustre / LU-9959

hsm: cannot schedule two different requests on the same fid

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • None
    • 3

    Description

      mdt_hsm_add_actions() assumes that hsm_find_compatible() will set the hai_cookie field of an incoming request to something other than 0 only if:

      • the incoming request is a CANCEL request, and there is a request to cancel;
      • the incoming request is a NONE request, and there is a request scheduled for the given fid;
      • the incoming request is an ARCHIVE or a RESTORE or a REMOVE, and the same request is already scheduled.

      The third assumption is used to detect duplicate requests, but hsm_find_compatible() does not work exactly like this. If the incoming request is an ARCHIVE/RESTORE/REMOVE, hsm_find_compatible() will set the hai_cookie field if it finds any request that applies to the same fid, whatever its action.
      This means that an ARCHIVE/RESTORE/REMOVE request on a given fid cannot be scheduled on the coordinator while any other request is still scheduled for that same fid.
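The mismatch described above can be illustrated with a minimal, self-contained sketch. This is not the Lustre code: the types and the function names prefixed with sketch_ are hypothetical, the fid is reduced to a single integer, and only ARCHIVE/RESTORE/REMOVE are modelled. It only contrasts the behaviour the ticket reports (any request on the same fid sets the cookie) with the behaviour the caller is said to assume (only an identical request sets it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, NOT the real Lustre definitions. */
enum sketch_action { SK_ARCHIVE, SK_RESTORE, SK_REMOVE };

struct sketch_request {
    uint64_t fid;              /* simplified file identifier */
    uint64_t cookie;           /* 0 means "no compatible request found" */
    enum sketch_action action;
};

/*
 * Behaviour reported in the ticket: the cookie is set as soon as ANY
 * scheduled request targets the same fid, whatever its action.
 */
static void sketch_find_any_on_fid(const struct sketch_request *queue,
                                   size_t count, struct sketch_request *in)
{
    assert(queue != NULL || count == 0);
    assert(in != NULL);

    for (size_t i = 0; i < count; i++) {
        if (queue[i].fid == in->fid) {
            in->cookie = queue[i].cookie;
            return;
        }
    }
}

/*
 * Behaviour the caller is described as assuming: the cookie is set only
 * when the SAME action is already scheduled on the same fid.
 */
static void sketch_find_same_action(const struct sketch_request *queue,
                                    size_t count, struct sketch_request *in)
{
    assert(queue != NULL || count == 0);
    assert(in != NULL);

    for (size_t i = 0; i < count; i++) {
        if (queue[i].fid == in->fid && queue[i].action == in->action) {
            in->cookie = queue[i].cookie;
            return;
        }
    }
}

/* Caller-side test: a non-zero cookie is treated as "duplicate request". */
static bool sketch_is_duplicate(const struct sketch_request *in)
{
    return in->cookie != 0;
}
```

With a REMOVE already queued on a fid, an incoming ARCHIVE on that fid is flagged as a duplicate by the first lookup and accepted by the second, which is the gap between the two behaviours.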

      This makes sense in some cases:

      • one cannot run both an ARCHIVE and a RESTORE on the same file (either the archive is not needed, or the file is not released)
      • one cannot run both a RESTORE and a REMOVE on the same file (either the file is not released, or its archive in the backend cannot be purged)

      But not in others:

      • one may want to ARCHIVE the new version of a file while the previous version is being REMOVEd from the hsm backend.
      • one may want to re-run an {ARCHIVE,RESTORE,REMOVE} request while another is being canceled.
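The cases listed above could be expressed as a small conflict table. The sketch below is only one possible reading of those cases, not a proposed patch: all names prefixed with demo_ are hypothetical, the table is assumed to be symmetric (the ticket only mentions an incoming ARCHIVE while a REMOVE is in progress, not the reverse), and identical actions are treated as duplicates.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative action set, NOT the real Lustre enum. */
enum demo_action { DEMO_ARCHIVE, DEMO_RESTORE, DEMO_REMOVE, DEMO_ACTION_COUNT };

/*
 * demo_conflict[incoming][scheduled] is true when the incoming request
 * should be refused while the scheduled one is still pending.
 * Symmetry and the "true" diagonal (duplicates) are assumptions.
 */
static const bool demo_conflict[DEMO_ACTION_COUNT][DEMO_ACTION_COUNT] = {
    /*                  ARCHIVE RESTORE REMOVE */
    [DEMO_ARCHIVE] = {  true,   true,   false },
    [DEMO_RESTORE] = {  true,   true,   true  },
    [DEMO_REMOVE]  = {  false,  true,   true  },
};

/*
 * Returns true when the incoming request must be rejected. A scheduled
 * request that is already being canceled never blocks a new one.
 */
static bool demo_must_reject(enum demo_action incoming,
                             enum demo_action scheduled,
                             bool scheduled_is_being_canceled)
{
    assert(incoming < DEMO_ACTION_COUNT);
    assert(scheduled < DEMO_ACTION_COUNT);

    if (scheduled_is_being_canceled)
        return false;

    return demo_conflict[incoming][scheduled];
}
```

A table keeps the policy in one place, so allowing or forbidding a given pair of actions is a one-cell change rather than a change to the lookup logic.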

          Activity

            pjones Peter Jones added a comment -

            In that case I think that it makes sense for you to own this ticket for the time being. I also added a link to LU-7988.

            bougetq Quentin Bouget (Inactive) added a comment -

            Hi Peter,

            A little of both. Ben Evans is working on the LU-7988 patch series and the last patch of the series is about calling hsm_find_compatible() less often. I am hoping we can fix both issues at once. If not, I will eventually submit a patch, but probably not until October or November.

            Quentin

            pjones Peter Jones added a comment -

            Quentin

            Is this issue something that you are planning to work on or something you are documenting in the hope that someone else will work on it?

            Peter

            People

              bougetq Quentin Bouget (Inactive)
              cealustre CEA