[LU-1201] Lustre crypto hash cleanup - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.3.0
Affects Version/s: None
Labels: None

Rank (Obsolete): 4572

Description

Lustre uses crypto hash algorithms in two places: PTLRPC (ptlrpc/sec_bulk.c) and OST (ost/ost_handler.c)/OSC (osc/osc_request.c).
OST and OSC use crc32, crc32c, and adler for checksumming (the compute_checksum() function).
PTLRPC uses crc32, adler, md5, and sha1-512 (the kernel crypto API for all of them except crc32 and adler).
All of these subsystems walk the bulk pages and update a checksum.
To resolve the conflicts between these different checksumming implementations, a new crypto hash interface is needed in libcfs. It should use the kernel crypto API for hash calculation in kernel modules, and a Lustre implementation in user mode. The previous checksum calculations should be changed to use the new libcfs crypto hash API. Adding a new hash algorithm would then be a simple task.
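
As a rough illustration of the kind of interface described above, here is a minimal user-space sketch in C. It is not the libcfs API: every name in it (`demo_hash_type`, `demo_hash_find`, `demo_page`, `demo_checksum_pages`) is hypothetical, and only two software algorithms (Adler-32 and a bitwise CRC-32C) are included. It shows one table of algorithms behind a single "walk the pages and update the checksum" routine, so that adding an algorithm means adding one table entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical algorithm descriptor; NOT the real libcfs structure. */
struct demo_hash_type {
    const char *name;
    uint32_t    init;       /* initial running state */
    uint32_t  (*update)(uint32_t state, const void *buf, size_t len);
    uint32_t    final_xor;  /* XORed into the state to produce the digest */
};

/* Adler-32: low 16 bits hold sum A, high 16 bits hold sum B. */
static uint32_t demo_adler32_update(uint32_t state, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t a = state & 0xffffu;
    uint32_t b = state >> 16;

    for (size_t i = 0; i < len; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Bitwise (slow, software-only) CRC-32C, reflected polynomial 0x82F63B78. */
static uint32_t demo_crc32c_update(uint32_t state, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++) {
        state ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            state = (state & 1u) ? (state >> 1) ^ 0x82F63B78u : state >> 1;
    }
    return state;
}

/* Adding a new algorithm means adding one row here. */
static const struct demo_hash_type demo_hash_types[] = {
    { "adler",  1u,          demo_adler32_update, 0u },
    { "crc32c", 0xFFFFFFFFu, demo_crc32c_update,  0xFFFFFFFFu },
};

/* Look up an algorithm by name; returns NULL if it is not in the table. */
const struct demo_hash_type *demo_hash_find(const char *name)
{
    size_t n = sizeof(demo_hash_types) / sizeof(demo_hash_types[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(demo_hash_types[i].name, name) == 0)
            return &demo_hash_types[i];
    return NULL;
}

/* Stand-in for a bulk page: a pointer and the number of valid bytes. */
struct demo_page {
    const void *data;
    size_t      len;
};

/* Walk the pages in order and fold each one into the running checksum. */
uint32_t demo_checksum_pages(const struct demo_hash_type *type,
                             const struct demo_page *pages, size_t npages)
{
    assert(type != NULL);

    uint32_t state = type->init;

    for (size_t i = 0; i < npages; i++)
        state = type->update(state, pages[i].data, pages[i].len);
    return state ^ type->final_xor;
}
```

A caller would pick an algorithm with `demo_hash_find("crc32c")` and pass its page array to `demo_checksum_pages()`. In a kernel module, the `update` callbacks would presumably be backed by the kernel crypto API rather than by these software loops.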

Attachments


- lustre-LU1201.xlsx (48 kB, 13/May/12 12:13 AM)
- lustre-singleclient-comparison.xlsx (55 kB, 07/May/12 1:39 AM)

Issue Links

is related to

LU-744 Single client's performance degradation on 2.1

Resolved

Activity


Alexander Boyko added a comment - 05/May/12 5:19 PM - edited

Can you attach your config.h? It looks like an invalid configure; I can't reproduce the issue on the same kernel.


Shuichi Ihara (Inactive) added a comment - 05/May/12 11:12 AM

Hit a kernel panic with these patches on a Sandy Bridge server. I just filed it as LU-1379.


Shuichi Ihara (Inactive) added a comment - 01/May/12 4:39 AM

I have a couple of benchmarks on 1.8 and 2.2 for comparison, but not yet with these patches. I plan to run the same benchmarks on 1.8 and 2.2 with these patches in a couple of days.
By the way, we are seeing single-client performance regressions on 2.x due to LU-744. So, for now, we might need to run with file size < client memory size for a fairer comparison. However, when file size > client memory size, we still see lower client performance regardless of checksum=on/off.


Andreas Dilger added a comment - 30/Apr/12 4:34 PM

Alexander, Ihara,
have you done any performance comparison with checksums enabled on 1.8 for comparison?


Alexander Boyko added a comment - 22/Apr/12 5:40 AM

I test on RH5 with kernel version 2.6.18-238.19.1; it supports hardware crc32c (kernel compile) and has better crc32c performance than the 2.6.32-... kernels.


Andreas Dilger added a comment - 20/Apr/12 11:11 PM

Ihara, while it is true that there may be some minor degradation in the case of RHEL 5 clients, based on the test results you posted this performance loss will be minor (or still an improvement over earlier versions of Lustre) due to the multi-threaded ptlrpcd speedups.

Also, it is not very likely that users would want to stick with RHEL5 for stability while moving to a new development version of Lustre.


Shuichi Ihara (Inactive) added a comment - 20/Apr/12 10:28 PM

A quick question.
Do we keep the RHEL5 patchless client in lustre-2.3?
If so, RHEL5 clients may see some performance drop when they upgrade from 2.2 to 2.3,
because crc32c is implemented in Lustre itself and crc32c-intel is not available on that kernel, so adler will be selected on 2.3.


Nathan Rutman added a comment - 04/Apr/12 7:27 PM - edited

That sounds like a good solution:
1. Client sends all available local algos
2. server drops non-client-supported from list, then replies with remaining algos that perform at 50% or better of the fastest algo remaining on server
3. client chooses locally fastest of what's left as the default.
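
The three steps above can be sketched in C as follows. This is only an illustration of the proposed negotiation, not code from the patch: the enum, the function names, and the speed arrays are all hypothetical, and the speeds are assumed to be locally measured throughput figures in arbitrary units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm indices; each one maps to a bit in a mask. */
enum demo_algo { DEMO_CRC32, DEMO_ADLER, DEMO_CRC32C, DEMO_ALGO_COUNT };

/*
 * Step 2: the server drops algorithms the client did not offer, then keeps
 * only those that run at 50% or better of its fastest remaining algorithm.
 * client_mask is what the client sent in step 1.
 */
uint32_t demo_server_reply(uint32_t client_mask, uint32_t server_mask,
                           const unsigned server_speed[DEMO_ALGO_COUNT])
{
    assert(server_speed != NULL);

    uint32_t common = client_mask & server_mask;
    unsigned fastest = 0;
    uint32_t reply = 0;

    for (int i = 0; i < DEMO_ALGO_COUNT; i++)
        if ((common & (1u << i)) && server_speed[i] > fastest)
            fastest = server_speed[i];

    for (int i = 0; i < DEMO_ALGO_COUNT; i++)
        if ((common & (1u << i)) && 2u * server_speed[i] >= fastest)
            reply |= 1u << i;

    return reply;
}

/*
 * Step 3: the client picks its locally fastest algorithm among those the
 * server left in the reply. Returns -1 if nothing is left.
 */
int demo_client_choose(uint32_t reply_mask,
                       const unsigned client_speed[DEMO_ALGO_COUNT])
{
    assert(client_speed != NULL);

    int best = -1;

    for (int i = 0; i < DEMO_ALGO_COUNT; i++)
        if ((reply_mask & (1u << i)) &&
            (best < 0 || client_speed[i] > client_speed[best]))
            best = i;

    return best;
}
```

With made-up server speeds of 300, 1500, and 2500 for crc32, adler, and crc32c, the server would reply with adler and crc32c; a client whose crc32c runs in software would then likely default to adler, while the administrator could still select crc32c because it stayed in the list.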


Andreas Dilger added a comment - 04/Apr/12 2:38 PM - edited

If the concern is the client eliminating the checksum algorithm choices prematurely, then the client could send all of the known algorithms to the OST. Only the OST would drop the algorithms that are totally underperforming (e.g. crc32 or crc32c in s/w). The final performance/config selection would be in the hands of the client/administrator. That would ensure that the client doesn't pick a locally fast algorithm (e.g. crc32c h/w) that would significantly impact the OST CPU usage (e.g. crc32c s/w), without reducing the (sensible) choices available to the user.


Nathan Rutman added a comment - 04/Apr/12 2:00 PM

"I guess the caveat is that if the algorithm is excluded because it is much slower than the others, then it may not be available for selection. That is true of the code today also - if crc32c is not in hardware on either the client or server, then it is excluded and couldn't be selected by the administrator either."

Exactly my concern. If there are tons of clients and few servers, it may make sense to use the fastest server algorithm regardless of the client algo speed to optimize the overall system rate. That's the reason we wanted to be able to use the SW crc-32c algo (on the clients) in the first place.


Shuichi Ihara (Inactive) added a comment - 04/Apr/12 11:36 AM

I've never tested these patches on 2.2. All results were on the vanilla 2.2 branch.
Here is the single-client performance ratio (tested with 12 IOzone threads). This was tested on real disks instead of a ramdisk. Overall, the results seem reasonable.

Performance ratio(Write)
            v2.2
nochecksum  1.00
adler       0.88
crc32       0.68
crc32c      0.93

Performance ratio(Read)
            v2.2
nochecksum  1.00
adler       0.88
crc32       0.71
crc32c      0.91


People

Assignee: WC Triage

Reporter: Alexander Boyko

Votes: 0

Watchers: 6

Dates

Created: 09/Mar/12 10:48 AM

Updated: 21/Nov/12 7:30 PM

Resolved: 01/Jul/12 3:00 AM