[LU-241] support crc32c with hardware accelerated instruction as one of lustre checksums Created: 26/Apr/11  Updated: 27/Jan/12  Resolved: 13/Oct/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.2.0

Type: Improvement Priority: Minor
Reporter: Shuichi Ihara (Inactive) Assignee: Peter Jones
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File b-23549-allow-crc32c-hardware-support-in-lustre-2.patch    
Issue Links:
Related
is related to LU-1025 Lustre crc32c implementation not use ... Resolved
Bugzilla ID: 23549
Rank (Obsolete): 4894

 Description   

The current Lustre code limits single-client performance (buffered I/O) when checksums are turned on. My understanding is that buffered I/O is handled by ptlrpcd on the client, and that the Lustre checksum is also calculated in this thread when the client reads data from the OSSs. Because ptlrpcd is not multithreaded in the current code, the checksum calculation consumes CPU resources and hurts Lustre performance.

On the other hand, the latest Intel Nehalem/Westmere CPUs have a hardware-accelerated crc32c instruction implemented on the CPU itself. We can use this fast crc32c instruction at no additional cost if the server is running on Intel CPUs.
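For reference, here is a minimal software sketch of the checksum that the hardware instruction accelerates. It assumes the standard CRC32C (Castagnoli) parameters: reflected polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF. The function name and the zlib-style chaining argument are illustrative only and are not taken from the Lustre patch.

```python
# Bitwise CRC32C (Castagnoli) reference. Slow, but handy for checking
# that a hardware-accelerated path produces the same values.
CASTAGNOLI_REFLECTED = 0x82F63B78  # bit-reversed form of polynomial 0x1EDC6F41


def crc32c(data: bytes, crc: int = 0) -> int:
    """Return the CRC32C of data; pass a previous result as crc to continue it."""
    crc ^= 0xFFFFFFFF  # apply the initial inversion (or undo the previous final one)
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (CASTAGNOLI_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF  # final inversion


if __name__ == "__main__":
    print(hex(crc32c(b"123456789")))  # standard check value: 0xe3069283
```

Any accelerated implementation can be compared against this on the same buffers; the two should agree bit for bit.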

Lustre currently supports crc and adler as checksum algorithms. So I would suggest adding crc32c as an additional checksum algorithm and enable

See bz#23549 on Bugzilla. An initial patch is available and simple testing has been done.

Performance improved considerably, e.g.:
single-client read performance: 60% improvement
1GB/sec (adler) vs 1.6GB/sec (crc32c with hardware instruction)

single-client write performance: up to 30% improvement
1.3GB/sec (adler) vs 1.7GB/sec (crc32c with hardware instruction)
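The percentages follow directly from the throughput figures. A small sketch of the arithmetic (the helper name is made up for illustration):

```python
def improvement_percent(baseline: float, improved: float) -> float:
    """Relative throughput gain of improved over baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


# Throughput figures in GB/sec, as reported above.
read_gain = improvement_percent(1.0, 1.6)
write_gain = improvement_percent(1.3, 1.7)

print(f"read:  {read_gain:.1f}%")   # read:  60.0%
print(f"write: {write_gain:.1f}%")  # write: 30.8%
```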

For more detail, see: https://bugzilla.lustre.org/attachment.cgi?id=31604

I also saw that this patch reduces CPU usage.

However, see also bz#23771: we sometimes saw "checksum protocol errors". They do not always occur, and we still don't know when (under what timing) this error shows up.

Could someone have a look at the code in the patch and give me some suggestions for fixing it, or for how to track down bug 23771?

I wonder whether we can land this patch in the Lustre mainline once we have fixed 23771.



 Comments   
Comment by Peter Jones [ 26/Apr/11 ]

Thanks for opening a ticket for this, Ihara. We do not want to lose this useful work in progress.

Comment by Shuichi Ihara (Inactive) [ 08/Jun/11 ]

fixed patch against bug 23549.

Comment by Peter Jones [ 15/Jun/11 ]

Ihara, do you mean that you will be uploading a patch to Gerrit?

Comment by Shuichi Ihara (Inactive) [ 06/Jul/11 ]

I have submitted patch sets for the master and b1_8 branches, and review is in progress.

for master branch
http://review.whamcloud.com/#change,1009

for b1_8 branch
http://review.whamcloud.com/#change,960

Comment by Shuichi Ihara (Inactive) [ 31/Aug/11 ]

Maloo always fails because node provisioning times out, so it can't start sanity testing for the new code.

Can someone have a look at these Maloo errors?

https://maloo.whamcloud.com/test_sets/4763d8b4-d394-11e0-8d02-52540025f9af

Comment by Jian Yu [ 31/Aug/11 ]

Hello Ihara,

https://maloo.whamcloud.com/test_sets/4763d8b4-d394-11e0-8d02-52540025f9af

From the console log of client node fat-intel-1vm1, we could see:

┌──────────────────────────┤ Missing Package ├───────────────────────────┐
You have specified that the package 'kernel-2.6.32-131.2.1.el6.x86_64' should be installed.
This package does not exist. Would you like to continue or abort this installation?
Abort │ Ignore All │ Continue

For master branch, the kernel version for RHEL6 has been updated to '2.6.32-131.6.1.el6.x86_64'. For b1_8 branch, it's still '2.6.32-131.2.1.el6.x86_64'.

Since the above node provisioning failure occurred while verifying the patches for master branch, could you please rebase your patches against the latest master branch and re-submit them to Gerrit to trigger a new build and avoid the above provision failure?

For b1_8 branch, the issue still exists. I think Chris and Mike would find a way to fix that.

Comment by Shuichi Ihara (Inactive) [ 31/Aug/11 ]

Hi Yu Jian,

Can you manually kick off a rebuild of new RPMs for the re-submitted patches? I can't start an RPM rebuild manually, but you can probably do that.

Thanks
Ihara

Comment by Jian Yu [ 31/Aug/11 ]

Hi Ihara,

Can you manually kick off a rebuild of new RPMs for the re-submitted patches? I can't start an RPM rebuild manually, but you can probably do that.

After you re-submit the patches, the build will be triggered automatically, and after the new RPMs are built successfully, the autotest system will also be triggered automatically to verify the patches.

Comment by Shuichi Ihara (Inactive) [ 31/Aug/11 ]

Yes, I know, but please see below: adilger kicked off a rebuild and autotesting without re-submitting the patches.

http://build.whamcloud.com/job/lustre-reviews/1916/

I don't think we really need to re-submit the patches in order to kick off new RPM builds in Jenkins manually. Once Jenkins finishes building the RPMs, Maloo will start autotesting. That is my understanding.

Comment by Jian Yu [ 31/Aug/11 ]

Hi Ihara,

I meant that you need to rebase the patches against the latest master branch first and then re-submit them to Gerrit.

From http://review.whamcloud.com/#change,1009 we could see the patch set 6 was based on commit a832ab57bda8658457193cc670b72a9995f10ff0 (LU-432), which was landed on 2011-06-24. However, after that, the default kernel version for RHEL6 on master branch was updated to '2.6.32-131.6.1.el6.x86_64' on 2011-07-29. So, if I just re-trigger the build on the current patch set 6, the kernel version for RHEL6 would still be the old one '2.6.32-131.2.1.el6.x86_64'.

Comment by Shuichi Ihara (Inactive) [ 31/Aug/11 ]

Hi Yu Jian,

Many thanks!
I understand what you mean now. I have just rebased my work branch and re-submitted the patches to Gerrit.
Jenkins has started building new RPMs. I hope Maloo will run autotesting with the new RPMs.

Thanks
Ihara

Comment by Jian Yu [ 01/Sep/11 ]

You're welcome, Ihara!
Currently, there are about 8 patch sets in the autotest queue. It would be scheduled soon, I think.

Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/utils/wiretest.c
  • lustre/ptlrpc/wiretest.c
  • lustre/include/obd_cksum.h
  • lustre/llite/llite_lib.c
  • lustre/ptlrpc/import.c
  • lustre/tests/sanity.sh
  • lustre/mds/mds_lov.c
  • lustre/utils/wirecheck.c
  • lustre/include/linux/obd_support.h
  • lustre/osc/osc_request.c
  • lustre/ost/ost_handler.c
  • lustre/obdfilter/filter.c
  • lustre/include/lustre/lustre_idl.h
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/osc/osc_request.c
  • lustre/ptlrpc/wiretest.c
  • lustre/ost/ost_handler.c
  • lustre/obdfilter/filter.c
  • lustre/include/linux/obd_support.h
  • lustre/include/lustre/lustre_idl.h
  • lustre/llite/llite_lib.c
  • lustre/utils/wirecheck.c
  • lustre/tests/sanity.sh
  • lustre/ptlrpc/import.c
  • lustre/utils/wiretest.c
  • lustre/include/obd_cksum.h
  • lustre/mds/mds_lov.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/ost/ost_handler.c
  • lustre/osc/osc_request.c
  • lustre/tests/sanity.sh
  • lustre/utils/wirecheck.c
  • lustre/include/obd_cksum.h
  • lustre/llite/llite_lib.c
  • lustre/ptlrpc/import.c
  • lustre/include/linux/obd_support.h
  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/wiretest.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/mds/mds_lov.c
  • lustre/utils/wiretest.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/utils/wiretest.c
  • lustre/include/linux/obd_support.h
  • lustre/llite/llite_lib.c
  • lustre/osc/osc_request.c
  • lustre/ptlrpc/wiretest.c
  • lustre/ost/ost_handler.c
  • lustre/include/obd_cksum.h
  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/import.c
  • lustre/tests/sanity.sh
  • lustre/include/lustre/lustre_idl.h
  • lustre/utils/wirecheck.c
  • lustre/mds/mds_lov.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/ptlrpc/import.c
  • lustre/include/linux/obd_support.h
  • lustre/include/obd_cksum.h
  • lustre/include/lustre/lustre_idl.h
  • lustre/mds/mds_lov.c
  • lustre/ost/ost_handler.c
  • lustre/llite/llite_lib.c
  • lustre/obdfilter/filter.c
  • lustre/utils/wirecheck.c
  • lustre/ptlrpc/wiretest.c
  • lustre/utils/wiretest.c
  • lustre/tests/sanity.sh
  • lustre/osc/osc_request.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/mds/mds_lov.c
  • lustre/include/obd_cksum.h
  • lustre/ptlrpc/wiretest.c
  • lustre/utils/wiretest.c
  • lustre/ptlrpc/import.c
  • lustre/ost/ost_handler.c
  • lustre/include/linux/obd_support.h
  • lustre/llite/llite_lib.c
  • lustre/utils/wirecheck.c
  • lustre/obdfilter/filter.c
  • lustre/osc/osc_request.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/ptlrpc/wiretest.c
  • lustre/tests/sanity.sh
  • lustre/osc/osc_request.c
  • lustre/obdfilter/filter.c
  • lustre/llite/llite_lib.c
  • lustre/utils/wiretest.c
  • lustre/mds/mds_lov.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/utils/wirecheck.c
  • lustre/include/linux/obd_support.h
  • lustre/ptlrpc/import.c
  • lustre/ost/ost_handler.c
  • lustre/include/obd_cksum.h
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/llite/llite_lib.c
  • lustre/tests/sanity.sh
  • lustre/ptlrpc/wiretest.c
  • lustre/include/linux/obd_support.h
  • lustre/osc/osc_request.c
  • lustre/ost/ost_handler.c
  • lustre/mds/mds_lov.c
  • lustre/obdfilter/filter.c
  • lustre/utils/wiretest.c
  • lustre/utils/wirecheck.c
  • lustre/ptlrpc/import.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/include/obd_cksum.h
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/llite/llite_lib.c
  • lustre/tests/sanity.sh
  • lustre/ptlrpc/wiretest.c
  • lustre/ptlrpc/import.c
  • lustre/obdfilter/filter.c
  • lustre/mds/mds_lov.c
  • lustre/osc/osc_request.c
  • lustre/include/linux/obd_support.h
  • lustre/ost/ost_handler.c
  • lustre/include/obd_cksum.h
  • lustre/include/lustre/lustre_idl.h
  • lustre/utils/wirecheck.c
  • lustre/utils/wiretest.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/llite/llite_lib.c
  • lustre/osc/osc_request.c
  • lustre/ost/ost_handler.c
  • lustre/utils/wirecheck.c
  • lustre/ptlrpc/wiretest.c
  • lustre/utils/wiretest.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/include/obd_cksum.h
  • lustre/mds/mds_lov.c
  • lustre/include/linux/obd_support.h
  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/import.c
  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/include/lustre/lustre_idl.h
  • lustre/utils/wiretest.c
  • lustre/ost/ost_handler.c
  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/wiretest.c
  • lustre/llite/llite_lib.c
  • lustre/osc/osc_request.c
  • lustre/tests/sanity.sh
  • lustre/include/linux/obd_support.h
  • lustre/ptlrpc/import.c
  • lustre/utils/wirecheck.c
  • lustre/include/obd_cksum.h
  • lustre/mds/mds_lov.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,client,el5,ofa #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/mds/mds_lov.c
  • lustre/include/linux/obd_support.h
  • lustre/ptlrpc/import.c
  • lustre/utils/wiretest.c
  • lustre/ptlrpc/wiretest.c
  • lustre/utils/wirecheck.c
  • lustre/include/obd_cksum.h
  • lustre/tests/sanity.sh
  • lustre/ost/ost_handler.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/llite/llite_lib.c
  • lustre/osc/osc_request.c
  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/mds/mds_lov.c
  • lustre/utils/wirecheck.c
  • lustre/utils/wiretest.c
  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/wiretest.c
  • lustre/tests/sanity.sh
  • lustre/include/obd_cksum.h
  • lustre/ptlrpc/import.c
  • lustre/include/linux/obd_support.h
  • lustre/ost/ost_handler.c
  • lustre/osc/osc_request.c
  • lustre/include/lustre/lustre_idl.h
  • lustre/llite/llite_lib.c
Comment by Build Master (Inactive) [ 05/Oct/11 ]

Integrated in lustre-master » i686,server,el5,ofa #287
LU-241 Support crc32c with hardware accelerated instruction as one of lustre checksums

Oleg Drokin : 0517160dd68ac026513ad1b8e3e6f7abd4acfdef
Files :

  • lustre/obdfilter/filter.c
  • lustre/utils/wirecheck.c
  • lustre/llite/llite_lib.c
  • lustre/ptlrpc/wiretest.c
  • lustre/utils/wiretest.c
  • lustre/mds/mds_lov.c
  • lustre/ptlrpc/import.c
  • lustre/include/linux/obd_support.h
  • lustre/osc/osc_request.c
  • lustre/ost/ost_handler.c
  • lustre/include/obd_cksum.h
  • lustre/include/lustre/lustre_idl.h
  • lustre/tests/sanity.sh
Comment by Peter Jones [ 13/Oct/11 ]

Landed for 2.2. Unlikely to consider landing this to 1.8.x at this point.

Comment by Andreas Dilger [ 27/Jan/12 ]

LU-1025 has a fix for the CRC32C algorithm that was introduced by this change. This will change the wire protocol.

Are there any users of this feature in production? Since the CRC32C patch was landed after 2.1.0 was released, we are planning to land the LU-1025 change for 2.2.0 since it will not change the protocol for any released version, but if anyone has pulled this patch into another release their systems will break if they do not upgrade both the client and server at the same time.
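To illustrate why a change to the checksum algorithm is a wire protocol change, here is a minimal sketch. This ticket does not say what LU-1025 actually changed, so the two seed values below are purely hypothetical stand-ins for "old" and "new" variants of the same checksum, and the function names are invented for the example.

```python
def crc32c_variant(data: bytes, seed: int) -> int:
    """CRC32C-style checksum whose starting value (seed) is configurable."""
    crc = seed
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


def verify(payload: bytes, sender_seed: int, receiver_seed: int) -> bool:
    """Sender attaches a checksum; receiver recomputes it and compares."""
    sent_checksum = crc32c_variant(payload, sender_seed)
    return sent_checksum == crc32c_variant(payload, receiver_seed)


OLD_SEED = 0x00000000  # hypothetical "before the fix" variant
NEW_SEED = 0xFFFFFFFF  # hypothetical "after the fix" variant

payload = b"bulk I/O page data"
print(verify(payload, NEW_SEED, NEW_SEED))  # True: both sides upgraded
print(verify(payload, OLD_SEED, NEW_SEED))  # False: only one side upgraded
```

Intact data is rejected whenever the two sides disagree on the variant, which is why a client and server carrying different versions of the algorithm would need to be upgraded together.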
