[LU-15092] Fix logic for unaligned transfer with o2iblnd

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.15.0

    Description

      A transfer can begin at a non-zero offset within its first page.
      However, the o2iblnd code that handles this case has two bugs.

      The first is that this use-case will require LNET_MAX_IOV + 1 local
      RDMA fragments, but we do not specify the correct corresponding values
      for the max page list to ib_alloc_fast_reg_page_list(),
      ib_alloc_fast_reg_mr(), etc.
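
The fragment count behind the first bug can be checked with a little arithmetic. The following is a minimal user-space sketch, not code from o2iblnd: the 4096-byte page size, the 256-page maximum, and the names SKETCH_PAGE_SIZE, SKETCH_MAX_IOV and frags_needed() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; not taken from the Lustre headers. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_IOV   256u

/*
 * Number of page-sized fragments touched by a transfer of nob bytes that
 * starts first_page_offset bytes into its first page.
 */
static size_t frags_needed(size_t first_page_offset, size_t nob)
{
	assert(first_page_offset < SKETCH_PAGE_SIZE);

	if (nob == 0)
		return 0;

	/* Round the end of the transfer up to the next page boundary. */
	return (first_page_offset + nob + SKETCH_PAGE_SIZE - 1) /
	       SKETCH_PAGE_SIZE;
}
```

Under these assumed values, a maximum-size transfer that starts at offset 0 touches 256 pages, while the same transfer starting 8 bytes into its first page touches 257, that is, one more fragment than the assumed maximum.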

      The second issue is that the logic in kiblnd_setup_rd_iov() and
      kiblnd_setup_rd_kiov() attempts to obtain one more scatterlist entry
      than is actually needed. This causes the transfer to fail with -EFAULT.
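
The second bug can be modeled in user space. The sketch below is a hypothetical model and not the kiblnd_setup_rd_iov() or kiblnd_setup_rd_kiov() code: struct frag, map_fragments() and the eager_next flag are invented for illustration, and a plain array stands in for the scatterlist. With eager_next set, the loop demands a further entry after every fragment, so it fails with -EFAULT even when the list is exactly large enough.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct frag {
	size_t offset;	/* byte offset within the page */
	size_t len;	/* bytes mapped from this page */
};

/*
 * Split nob bytes, starting first_page_offset bytes into the first page,
 * into per-page fragments. Returns the number of fragments used, or
 * -EFAULT if the list runs out of entries.
 */
static int map_fragments(struct frag *frags, size_t capacity,
			 size_t first_page_offset, size_t nob,
			 size_t page_size, int eager_next)
{
	size_t used = 0;
	size_t offset = first_page_offset;

	assert(page_size > 0 && first_page_offset < page_size);

	while (nob > 0) {
		size_t fragnob;

		/* Ask for an entry only while there is data left to map. */
		if (used == capacity)
			return -EFAULT;

		fragnob = page_size - offset;
		if (fragnob > nob)
			fragnob = nob;

		frags[used].offset = offset;
		frags[used].len = fragnob;
		used++;

		nob -= fragnob;
		offset = 0;	/* only the first page carries an offset */

		/* Modeled bug: require one more entry than is needed. */
		if (eager_next && used == capacity)
			return -EFAULT;
	}

	return (int)used;
}
```

For example, with a three-entry list, 4096-byte pages, a first-page offset of 100 and 8192 bytes to map, the call with eager_next set to 0 uses all three entries and succeeds, while the call with eager_next set to 1 returns -EFAULT after mapping the last fragment.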

          Activity

            paf0186 Patrick Farrell added a comment -

            ashehata, just FYI, Andreas' most recent comment is a much clearer statement of my question.

            paf0186 Patrick Farrell added a comment -

            I actually asked the same question - not stated as clearly - on https://jira.whamcloud.com/browse/LU-13805

            adilger Andreas Dilger added a comment -

            Chris, Sereguei,
            we were having a discussion related to LU-13802 (improving buffered read/write efficiency) about whether it is possible to have LNet do RDMA from "very" unaligned buffers on the client (i.e. page-relative memory offset does not match file-relative offset) into page+block-aligned buffers on the server?

            For example, if an application allocates a 1MB buffer in userspace today with glibc malloc(), it is only guaranteed to be aligned on the word size (i.e. 8 bytes). If the client tries to write this unaligned 1MB buffer to a 1MB file-aligned offset, the kernel has to copy all of the data into aligned kernel page cache and then send those page cache pages to LNet for RDMA.

            It would be ideal for large read/write operations if the client LNet could RDMA the unaligned userspace buffer directly into aligned server pages with O_DIRECT, but I don't know whether this is a capability that LNet and/or IB/RoCE have, or whether they require the source/target page alignment to be the same. If this isn't possible, that is totally fine, and we are looking into other solutions to improve performance here, but when I saw this patch recently I just wanted to make sure that there isn't some easy "of course the data does not need to be page aligned" solution that we are missing.
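
The alignment condition raised in this comment (page-relative memory offset versus file-relative offset) can be expressed as a small check. This is an illustrative user-space sketch only: page_offset() and same_page_alignment() are hypothetical helpers, the page size is assumed to be a power of two, and nothing here reflects what LNet or the IB/RoCE stack actually requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of value within its page; page_size must be a power of two. */
static size_t page_offset(uint64_t value, size_t page_size)
{
	assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

	return (size_t)(value & (uint64_t)(page_size - 1));
}

/*
 * Returns non-zero when a buffer's page-relative memory offset matches the
 * page-relative offset of the file position it is written to.
 */
static int same_page_alignment(const void *buf, uint64_t file_offset,
			       size_t page_size)
{
	return page_offset((uint64_t)(uintptr_t)buf, page_size) ==
	       page_offset(file_offset, page_size);
}
```

A buffer whose address ends in 0x010 written to a 1MB-aligned file offset fails this check with 4096-byte pages, which is the "very" unaligned case described above.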
            pjones Peter Jones added a comment -

            Landed for 2.15

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45216/
            Subject: LU-15092 o2iblnd: Fix logic for unaligned transfer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 23a2c92f203ff2f39bcc083e6b6220968c17b475

            gerrit Gerrit Updater added a comment -

            "Chris Horn <chris.horn@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45216
            Subject: LU-15092 o2iblnd: Fix logic for unaligned transfer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 40786a123e9d6f6142934c1135ad32015b36368b

            People

              hornc Chris Horn
              hornc Chris Horn