Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.3.0
    • None
    • None
    • 4572

    Description

      Lustre uses crypto hash algorithms in two places: PTLRPC (ptlrpc/sec_bulk.c) and OST (ost/ost_handler.c)/OSC (osc/osc_request.c).
      OST and OSC use crc32, crc32c, and adler for checksumming (the compute_checksum() function).
      PTLRPC uses crc32, adler, md5, and sha1-512 (through the kernel crypto API for all except crc32 and adler).
      All of these subsystems walk the bulk pages and update the checksum.
      To resolve conflicts between the different checksumming implementations, a new crypto hash interface is needed in libcfs. It should use the kernel crypto API for hash calculation in kernel modules, and the Lustre implementation in user mode. The previous checksum calculation should be changed to use the new libcfs crypto hash API. Adding a new hash algorithm would then be a simple task.
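As an illustration only, the idea of one shared interface that walks the bulk pages and updates a running checksum could be sketched in user-space Python as follows. The names `HASH_ALGOS` and `checksum_pages` are hypothetical and are not libcfs functions; `zlib.adler32` and `zlib.crc32` stand in for the kernel crypto API, and crc32c is left out because the Python standard library does not provide it.

```python
import zlib

# Hypothetical registry: algorithm name -> (initial value, update function).
# Each update function takes (data, running_value) and returns the new value.
HASH_ALGOS = {
    "adler": (1, zlib.adler32),
    "crc32": (0, zlib.crc32),
}


def checksum_pages(algo, pages):
    """Walk the pages once, feeding each one into the running checksum."""
    init, update = HASH_ALGOS[algo]
    value = init
    for page in pages:
        value = update(page, value)
    return value & 0xFFFFFFFF


pages = [b"a" * 4096, b"b" * 4096, b"c" * 100]
for name in HASH_ALGOS:
    print(name, hex(checksum_pages(name, pages)))
```

With a registry like this, adding an algorithm means adding one table entry, and every caller goes through the same page loop.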

          Activity

            [LU-1201] Lustre crypto hash cleanup
            aboyko Alexander Boyko added a comment - - edited

            Can you attach your config.h? It looks like an invalid configure; I can't reproduce the issue on the same kernel.


            Hit a kernel panic with these patches on a Sandy Bridge server. I just filed it as LU-1379.

            ihara Shuichi Ihara (Inactive) added a comment

            I ran a couple of benchmarks on 1.8 and 2.2 for comparison, but not yet with these patches. I plan to run the same benchmarks on 1.8 and 2.2 with these patches in a couple of days.
            By the way, we are seeing single-client performance regressions on 2.x due to LU-744. So, for now, we might need to run with a file size smaller than the client's memory size for a fairer comparison. However, when the file size is larger than the client's memory size, we still see lower client performance regardless of whether checksum is on or off.

            ihara Shuichi Ihara (Inactive) added a comment

            Alexander, Ihara,
            have you done any performance testing with checksums enabled on 1.8 for comparison?

            adilger Andreas Dilger added a comment

            I tested on RHEL5 with kernel version 2.6.18-238.19.1. It supports crc32c hw (kernel compile) and has better crc32c performance than the 2.6.32-... kernels.

            aboyko Alexander Boyko added a comment

            Ihara, while it is true that there may be some minor degradation in the case of RHEL 5 clients, based on the test results you posted this performance loss will be minor (or still an improvement over earlier versions of Lustre) due to the multi-threaded ptlrpcd speedups.

            Also, users who want to stick with RHEL5 for stability are not very likely to move to a new development version of Lustre.

            adilger Andreas Dilger added a comment

            A quick question.
            Do we keep the RHEL5 patchless client on lustre-2.3?
            If so, RHEL5 clients may see some performance drop when they upgrade from 2.2 to 2.3.
            Because crc32c is implemented in Lustre itself and crc32c-intel is not available on that kernel, adler will be selected on 2.3.

            ihara Shuichi Ihara (Inactive) added a comment
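The fallback described in this comment (a fast algorithm is unavailable, so a different one gets selected) can be illustrated with a rough sketch. It assumes that selection is driven by measured speed, which this thread suggests but does not spell out. `measure_speed` and `pick_fastest` are hypothetical helper names, and the Python `zlib` functions merely stand in for the real kernel implementations.

```python
import time
import zlib


def measure_speed(update, init, buf, rounds=50):
    """Return a rough throughput in MB/s for one checksum function."""
    start = time.perf_counter()
    value = init
    for _ in range(rounds):
        value = update(buf, value)
    elapsed = time.perf_counter() - start
    return (len(buf) * rounds) / (1024 * 1024) / max(elapsed, 1e-9)


def pick_fastest(speeds, supported):
    """Pick the fastest algorithm among those this node actually supports."""
    candidates = {name: s for name, s in speeds.items() if name in supported}
    return max(candidates, key=candidates.get)


buf = bytes(1024 * 1024)
speeds = {
    "adler": measure_speed(zlib.adler32, 1, buf),
    "crc32": measure_speed(zlib.crc32, 0, buf),
}
print(pick_fastest(speeds, supported={"adler", "crc32"}))
```

If a node's table lists crc32c as fastest but the node cannot use it, `pick_fastest` falls back to the next-fastest supported entry.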
            nrutman Nathan Rutman added a comment - - edited

            That sounds like a good solution:
            1. Client sends all available local algos
            2. server drops non-client-supported from list, then replies with remaining algos that perform at 50% or better of the fastest algo remaining on server
            3. client chooses locally fastest of what's left as the default.
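The three steps above can be sketched as a small function. This is only one reading of the proposal, not code from the patch: `negotiate` is a hypothetical name, the speed tables are made-up MB/s figures, and the 50% cutoff is passed in as `threshold`.

```python
def negotiate(client_speeds, server_speeds, threshold=0.5):
    """Return the algorithm the client would use by default, or None."""
    # Step 1: the client offers every algorithm it has locally.
    offered = set(client_speeds)
    # Step 2: the server keeps only algorithms both sides support...
    common = {a: s for a, s in server_speeds.items() if a in offered}
    if not common:
        return None
    # ...and drops those slower than `threshold` times its fastest one.
    best = max(common.values())
    allowed = [a for a, s in common.items() if s >= threshold * best]
    # Step 3: the client picks its locally fastest algorithm from the reply.
    return max(allowed, key=lambda a: client_speeds[a])


# Hypothetical speeds in MB/s: the client has crc32c in hardware,
# the server only has a slow software crc32c.
client = {"adler": 1500, "crc32": 400, "crc32c": 5000}
server_sw = {"adler": 1400, "crc32": 350, "crc32c": 500}
server_hw = {"adler": 1400, "crc32": 350, "crc32c": 5000}

print(negotiate(client, server_sw))  # adler
print(negotiate(client, server_hw))  # crc32c
```

In the first call the server filters out crc32c before the client gets to choose, so the client's locally fast algorithm cannot overload the server.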

            adilger Andreas Dilger added a comment - - edited

            If the concern is the client eliminating the checksum algorithm choices prematurely, then the client could send all of the known algorithms to the OST. Only the OST would drop the algorithms that are totally underperforming (e.g. crc32 or crc32c in s/w). The final performance/config selection would be in the hands of the client/administrator. That would ensure that the client doesn't pick a locally fast algorithm (e.g. crc32c h/w) that would significantly impact the OST CPU usage (e.g. crc32c s/w), without reducing the (sensible) choices available to the user.


            I guess the caveat is that if the algorithm is excluded because it is much slower than the others, then it may not be available for selection. That is true of the code today also - if crc32c is not in hardware on either the client or server, then it is excluded and couldn't be selected by the administrator either.

            Exactly my concern. If there are tons of clients and few servers, it may make sense to use the fastest server algorithm regardless of the client algo speed to optimize the overall system rate. That's the reason we wanted to be able to use the SW crc-32c algo (on the clients) in the first place.

            nrutman Nathan Rutman added a comment
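The many-clients, few-servers argument can be made concrete with a deliberately simplified model: the aggregate checksum rate is capped by whichever side has less total capacity. The function name `system_rate` and every number below are hypothetical and only illustrate the shape of the trade-off; network and disk limits are ignored.

```python
def system_rate(n_clients, client_rate, n_servers, server_rate):
    """Aggregate checksum throughput, limited by the slower side overall."""
    return min(n_clients * client_rate, n_servers * server_rate)


# Hypothetical per-node rates in MB/s for 1000 clients and 4 servers.
# Option A: crc32c, slow in software on clients, fast in hardware on servers.
rate_a = system_rate(1000, 500, 4, 5000)
# Option B: adler, moderately fast on both sides.
rate_b = system_rate(1000, 1500, 4, 1400)

print(rate_a, rate_b)  # 20000 5600
```

Under these assumed numbers the servers are the bottleneck in both cases, so the algorithm that is fastest on the server gives the higher overall rate even though it is the slower choice on each client.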

            I've never tested these patches on 2.2. All results were on the vanilla 2.2 branch.
            Here is the single-client performance ratio (tested with 12 IOzone threads). This was tested on real disks instead of a ramdisk. Overall, the results seem reasonable.

            Performance ratio(Write)
                        v2.2
            nochecksum  1.00
            adler       0.88
            crc32       0.68
            crc32c      0.93
            
            Performance ratio(Read)
                        v2.2
            nochecksum  1.00
            adler       0.88
            crc32       0.71
            crc32c      0.91
            
            ihara Shuichi Ihara (Inactive) added a comment
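To read the ratios above as checksum overhead, subtract each one from the no-checksum baseline. The short sketch below does that arithmetic on the posted v2.2 numbers; `overhead_percent` is a hypothetical helper name.

```python
# Ratios copied from the v2.2 tables above (1.00 = no checksum).
write_ratio = {"nochecksum": 1.00, "adler": 0.88, "crc32": 0.68, "crc32c": 0.93}
read_ratio = {"nochecksum": 1.00, "adler": 0.88, "crc32": 0.71, "crc32c": 0.91}


def overhead_percent(ratio):
    """Throughput lost relative to running without checksums, in percent."""
    return round((1.0 - ratio) * 100)


for algo in write_ratio:
    print(algo,
          "write:", overhead_percent(write_ratio[algo]),
          "read:", overhead_percent(read_ratio[algo]))
```

On these figures crc32 costs roughly 29 to 32 percent, adler 12 percent, and crc32c 7 to 9 percent of single-client throughput.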

            People

              wc-triage WC Triage
              aboyko Alexander Boyko