[LU-2488] crc t10 dif pclmulqdq implementation Created: 13/Dec/12  Updated: 26/Sep/13  Resolved: 26/Sep/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Alexander Boyko Assignee: Keith Mannthey (Inactive)
Resolution: Incomplete Votes: 0
Labels: patch

Issue Links:
Related
is related to LU-2584 End-to-End Data Integrity with T10 Resolved
Rank (Obsolete): 5841

 Comments   
Comment by Alexander Boyko [ 13/Dec/12 ]

This patch adds a PCLMULQDQ-based CRC T10 DIF implementation to libcfs. Results from the t10 unit test:

Lustre: t10crc UT
Lustre: Checking block size 512 loops 4096 ...
Lustre: Speed: libcfs 1351MB/s	 kernel 259MB/s
Lustre: PASS
Lustre: Checking block size 4096 loops 4096 ...
Lustre: Speed: libcfs 2111MB/s	 kernel 251MB/s
Lustre: PASS

kernel - Linux kernel table-based implementation.
libcfs - PCLMULQDQ implementation.
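
For reference, a table-driven CRC T10 DIF of the kind the "kernel" figures refer to can be sketched as below. This is a minimal illustration, not the kernel's or the patch's actual code: the names `t10dif_init_table` and `t10dif_crc_table` are hypothetical, and the table is built at run time rather than stored as a constant array. The polynomial 0x8BB7 with an initial value of 0 is the standard T10 DIF definition.

```c
#include <stddef.h>
#include <stdint.h>

/* T10 DIF generator polynomial:
 * x^16 + x^15 + x^11 + x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 */
#define T10DIF_POLY 0x8BB7u

/* Hypothetical name: 256-entry lookup table, one entry per byte value. */
static uint16_t t10dif_table[256];

/* Fill the table by running each byte value through 8 shift/xor steps. */
static void t10dif_init_table(void)
{
    for (unsigned int i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);

        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ T10DIF_POLY)
                                  : (uint16_t)(crc << 1);
        t10dif_table[i] = crc;
    }
}

/* Process the buffer one byte at a time, MSB first, no reflection. */
static uint16_t t10dif_crc_table(uint16_t crc, const unsigned char *buf,
                                 size_t len)
{
    while (len--)
        crc = (uint16_t)((crc << 8) ^
                         t10dif_table[((crc >> 8) ^ *buf++) & 0xff]);
    return crc;
}
```

Each input byte costs one dependent table lookup, which is the serial bottleneck a carry-less-multiply (PCLMULQDQ) implementation avoids by folding many bytes per instruction.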

http://review.whamcloud.com/4822

Comment by Andreas Dilger [ 13/Dec/12 ]

Presumably the intent of this is to add support for a t10 DIF mode for bulk RPCs? It would be good to include some background in the bug about how this code is planned to be used.

Comment by Alexander Boyko [ 13/Dec/12 ]

This code will be used by the client to calculate the CRC T10 DIF and provide this checksum to a server with T10 DIF/DIX-capable storage.

Comment by Andreas Dilger [ 21/Dec/12 ]

So presumably you are going to add a new T10 bulk data checksum mode to the BRW protocol? I recall seeing some discussion to this effect, but I don't recall whether there was a conclusion about which method would be used to send all of the T10 information in the RPC. This would increase the size of the BRW RPC by 256 * 8 bytes = 2kB for a 1MB RPC, and by 8kB for a 4MB RPC, which is pretty significant.

Do you have any kind of HLD for the T10 DIF changes to the RPC?

Comment by Keith Mannthey (Inactive) [ 04/Jan/13 ]

A good number of kernels have T10 support at this point. Is the kernel's T10 implementation not usable for your needs?

Comment by Alexander Boyko [ 05/Jan/13 ]

Keith, if you are talking about CRC T10, the kernel implementation is very slow and takes a lot of CPU resources.

Andreas, you are right: the size of the BRW RPC will be increased by sizeof(crc) * number of sectors for the bulk.
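
The size arithmetic discussed here can be sketched as a small helper. The function name `t10_rpc_overhead` and its parameters are hypothetical, not part of Lustre. The 2kB and 8kB figures quoted earlier correspond to 8 bytes per 4096-byte sector, which is the size of a full T10 DIF tuple; a bare 16-bit CRC would be 2 bytes per sector.

```c
#include <stddef.h>

/*
 * Hypothetical helper: extra bytes carried in one bulk RPC when one
 * integrity entry of entry_size bytes is sent per sector.
 * Assumes bulk_bytes is a multiple of sector_size.
 */
static size_t t10_rpc_overhead(size_t bulk_bytes, size_t sector_size,
                               size_t entry_size)
{
    return (bulk_bytes / sector_size) * entry_size;
}
```

For example, `t10_rpc_overhead(1 << 20, 4096, 8)` gives 2048 bytes for a 1MB RPC and `t10_rpc_overhead(4 << 20, 4096, 8)` gives 8192 bytes for a 4MB RPC, while 512-byte sectors with a 2-byte CRC give 4096 bytes per 1MB.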

Comment by Nathan Rutman [ 07/Jan/13 ]

Xyratex MRP-511

Comment by Nathan Rutman [ 07/Jan/13 ]

While this particular patch provides hardware support for the T10-DIF CRC algorithm for general use in the libcfs library, and is not concerned with any actual T10-DIF/DIX usage, I understand the reluctance to use apparently unnecessary code.
This is the first patch in a series designed to provide end-to-end T10-DIX support for Lustre clients; I will open another, high-level ticket describing it.

Comment by Nathan Rutman [ 07/Jan/13 ]

LU-2584 is parent ticket.

Comment by Keith Mannthey (Inactive) [ 26/Apr/13 ]

Is it possible to accelerate the kernel's T10 rather than re-implement the T10 framework in Lustre?

Do we know why the T10 is not ASM accelerated in the kernel or the status of mainline in this regard?

Comment by Alexander Boyko [ 26/Apr/13 ]

Keith, it will take time. The current kernel version is 3.9, but Lustre is based on 2.6.32-279. Maybe I need to create the T10 code and push it to the kernel in parallel. T10 for Lustre has specific restrictions: only 512- and 4096-byte sector sizes and aligned data. Because of this, the code is more efficient than the general implementation.
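
The restrictions mentioned above can be illustrated with a per-sector guard tag loop. This is a sketch only: `crc_t10dif_bitwise` and `t10_guard_tags` are hypothetical names, the CRC is a slow bitwise reference rather than the PCLMULQDQ code from the patch, and "aligned data" is assumed here to mean that the buffer length is a whole number of sectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC T10 DIF (polynomial 0x8BB7, initial value 0). */
static uint16_t crc_t10dif_bitwise(const unsigned char *buf, size_t len)
{
    uint16_t crc = 0;

    while (len--) {
        crc ^= (uint16_t)(*buf++ << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x8BB7u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/*
 * Hypothetical helper: compute one guard tag per sector.
 * Returns the number of tags written, or -1 if the sector size is not
 * 512 or 4096, the length is not a multiple of the sector size, or the
 * output array is too small.
 */
static long t10_guard_tags(const unsigned char *data, size_t len,
                           size_t sector_size, uint16_t *tags,
                           size_t max_tags)
{
    size_t nsect;

    if (sector_size != 512 && sector_size != 4096)
        return -1;
    if (len % sector_size != 0)
        return -1;

    nsect = len / sector_size;
    if (nsect > max_tags)
        return -1;

    for (size_t i = 0; i < nsect; i++)
        tags[i] = crc_t10dif_bitwise(data + i * sector_size, sector_size);

    return (long)nsect;
}
```

Fixing the block length to one of two known values is what lets a specialized implementation drop the generic handling of short or odd-length tails.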

Comment by Keith Mannthey (Inactive) [ 21/Jun/13 ]

It seems the feature freeze for 2.5 is the end of July. It would be good to refresh this patch if the T10 code is targeted for the 2.5 release.

Comment by Alexander Boyko [ 28/Aug/13 ]

Please close this issue. This feature is not needed without the main T10 code. I don't see any progress on T10, and I will not spend time updating this patch.

Comment by Andreas Dilger [ 26/Sep/13 ]

Close bug per request.

Generated at Sat Feb 10 01:25:38 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.