Submitting random writes using 4MB RPC

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.7.0, Lustre 2.8.0

    Description

      After running "lctl set_param osc.*.max_pages_per_rpc=4M", we ran a 4k random write test with IOR and easily hit this problem.

      After some testing, we found that the largest msgsize for a bulk write is 16384, while req->rq_reqbuf_len is only 8192.
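
      The mismatch can be checked with rough arithmetic. The sketch below assumes a 4 KiB page size and 16 bytes per remote niobuf descriptor; both constants are assumptions for illustration, not values taken from this ticket, and the other parts of the request message are ignored. Under those assumptions, a 4MB RPC made entirely of discontiguous 4k pages needs 16384 bytes of descriptors, which matches the reported msgsize and exceeds the 8192-byte request buffer, while a 1MB RPC still fits.

```python
PAGE_SIZE = 4096        # assumed client page size in bytes
NIOBUF_DESC_SIZE = 16   # assumed size of one remote niobuf descriptor in bytes
REQBUF_LEN = 8192       # req->rq_reqbuf_len reported in this ticket


def worst_case_niobuf_bytes(rpc_bytes, page_size=PAGE_SIZE,
                            desc_size=NIOBUF_DESC_SIZE):
    """Descriptor bytes needed when every page in the RPC is discontiguous."""
    pages = rpc_bytes // page_size
    return pages * desc_size


for rpc_mib in (1, 4):
    need = worst_case_niobuf_bytes(rpc_mib * 1024 * 1024)
    verdict = "fits" if need <= REQBUF_LEN else "overflows"
    print(f"{rpc_mib} MiB RPC: {need} descriptor bytes, {verdict} {REQBUF_LEN}")
```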

          Activity

            [LU-7181] Submitting random writes using 4MB RPC

            adilger Andreas Dilger added a comment - Landed as commit v2_8_58_0-8-g7f2aae8.

            adilger Andreas Dilger added a comment - This was fixed as part of patch http://review.whamcloud.com/22369 "LU-8135 osc: limits the number of chunks in write RPC".

            adilger Andreas Dilger added a comment - The patch from LU-4755 increased OST_MAXREQSIZE to accommodate very large numbers of niobufs in a single request (up to 1024 with a 4MB RPC). However, per my last comments in LU-4755, it isn't clear whether there is an advantage to having so many small IOs in a single large RPC vs. having multiple separate RPCs in parallel.

            It is worthwhile to ask whether there is any performance improvement from sending 4096 random pages in one RPC compared to 16 x 256 random pages in separate RPCs. It might even be faster to send parallel RPCs, because the checksums would run on separate cores and the RPCs would be handled in parallel on the OST. If there is no improvement from many random pages in one RPC, it is better to just limit the number of niobufs that the client sends in one RPC.

            It would be useful to test a random write workload with 1MB, 4MB, and other RPC sizes to see if there is an improvement from sending multiple 1MB RPCs in parallel vs. larger single RPCs.
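
            To make the comparison concrete, here is a small sketch of how the same 4096 random pages would be divided at different RPC sizes. It is only bookkeeping under an assumed 4 KiB page size, not Lustre client code, and the function name is hypothetical. It says nothing about which layout is faster; that still has to be measured.

```python
PAGE_SIZE = 4096  # assumed client page size in bytes


def split_into_rpcs(total_pages, rpc_bytes, page_size=PAGE_SIZE):
    """Return the page count of each RPC needed to carry total_pages."""
    pages_per_rpc = rpc_bytes // page_size
    full, rest = divmod(total_pages, pages_per_rpc)
    return [pages_per_rpc] * full + ([rest] if rest else [])


MIB = 1024 * 1024
for rpc_mib in (1, 4, 16):
    rpcs = split_into_rpcs(4096, rpc_mib * MIB)
    print(f"{rpc_mib:>2} MiB RPCs: {len(rpcs)} RPC(s) of up to {max(rpcs)} pages")
```

            With these assumptions, 1 MiB RPCs give the 16 x 256 case from the comment, and a single RPC holding all 4096 pages corresponds to a 16 MiB RPC.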

            People

              wc-triage WC Triage
              adilger Andreas Dilger