[PATCH] Using Intel CRC32 instruction to accelerate CRC32c algorithm by new crypto API -V3.

To: <herbert@...>, <bunk@...>
Cc: <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 1:23 am

Revised per review comments:
Added 'static' to limit namespace pollution;
Resent to fix folded lines, after adjusting the Evolution configuration.

This patch accelerates the CRC32c algorithm with the new CRC32 instruction in the
SSE 4.2 instruction set. It detects whether the feature is available and chooses
the most suitable way to calculate the CRC32c checksum.
The instruction is emitted as raw byte codes for compiler compatibility.
No MMX / XMM registers are involved in the implementation.
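
As an illustration of the "detect the feature, then choose an implementation" idea described above, here is a minimal userspace sketch. It is not the code from this patch: the function names are hypothetical, the hardware path uses the `crc32b` assembler mnemonic rather than the patch's raw byte encoding, and the feature test reads CPUID leaf 1, ECX bit 20 (SSE4.2) through GCC/Clang's `<cpuid.h>`. The pre- and post-inversion of the CRC is the conventional CRC32c framing and is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#define HAVE_X86 1
#endif

/* Reflected (little-endian) CRC32C polynomial, as quoted in the patch header. */
#define CRC32C_POLY_LE 0x82F63B78u

/* Portable fallback: one bit at a time, reflected bit order. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *data, size_t len)
{
    while (len--) {
        crc ^= *data++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_LE : 0u);
    }
    return crc;
}

#ifdef HAVE_X86
/* Hypothetical helper: CPUID leaf 1, ECX bit 20 reports SSE4.2. */
static int cpu_has_sse42(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx >> 20) & 1u;
}

/* Hardware path: one CRC32 instruction per input byte. */
static uint32_t crc32c_hw(uint32_t crc, const unsigned char *data, size_t len)
{
    while (len--) {
        unsigned char byte = *data++;
        __asm__("crc32b %1, %0" : "+r"(crc) : "qm"(byte));
    }
    return crc;
}
#endif

/* Pick the hardware path when the CPU has it, else fall back to software. */
uint32_t crc32c(const unsigned char *data, size_t len)
{
    uint32_t crc = ~0u;

    assert(data != NULL || len == 0);
#ifdef HAVE_X86
    if (cpu_has_sse42())
        return ~crc32c_hw(crc, data, len);
#endif
    return ~crc32c_sw(crc, data, len);
}
```

Either path should produce the standard CRC32c check value 0xE3069283 for the nine ASCII bytes "123456789", which makes a convenient sanity test when comparing implementations.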

Signed-off-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Kent Liu <kent.liu@intel.com>
---
arch/x86/crypto/Makefile | 2
arch/x86/crypto/crc32c-intel.c | 190 +++++++++++++++++++++++++++++++++++++++++
crypto/Kconfig | 12 ++
include/asm-x86/cpufeature.h | 2
4 files changed, 206 insertions(+)

diff -Naurp linux-2.6/arch/x86/crypto/crc32c-intel.c linux-2.6-patch/arch/x86/crypto/crc32c-intel.c
--- linux-2.6/arch/x86/crypto/crc32c-intel.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-patch/arch/x86/crypto/crc32c-intel.c 2008-08-05 21:57:37.000000000 -0400
@@ -0,0 +1,190 @@
+/*
+ * Using hardware provided CRC32 instruction to accelerate the CRC32 disposal.
+ * CRC32C polynomial:0x1EDC6F41(BE)/0x82F63B78(LE)
+ * CRC32 is a new instruction in Intel SSE4.2, the reference can be found at:
+ * http://www.intel.com/products/processor/manuals/
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+ * Volume 2A: Instruction Set Reference, A-M
+ */
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/kernel.h>
+#include <crypto/internal/hash.h>
+
+#include <asm/cpufeature.h>
+
+#define CHKSUM_BLOCK_SIZE 1
+#define CHKSUM_DIGEST_SIZE 4
+
+#ifdef CONFIG_X86_64
+#define REX_PRE "0x48, "
+#define SCALE_F 8
+#else
+#define REX_PRE
+#define SCALE_F 4
+#endif
+
+static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t length)
+{
+ while (length--) {
+ __asm__ __volatile__(
+ ".b...

To: Austin Zhang <austin_zhang@...>
Cc: <herbert@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Thursday, August 7, 2008 - 11:38 pm

On Tue, Aug 5, 2008 at 10:23 PM, Austin Zhang

I think you want to use

#define SCALE_F sizeof(unsigned long)

Since the loop iteration count etc depends on

ptmp++

which depends on the type being unsigned long.
--
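
To make the point about SCALE_F concrete, here is a small userspace sketch of a loop that consumes the buffer one `unsigned long` at a time and then finishes the tail byte by byte. It is an illustration, not the patch's code: the helper names are hypothetical, the word step is a software stand-in for the wide CRC32 instruction, and it assumes a little-endian host (true for x86). Because the stride is the size of `unsigned long`, deriving the chunk count from `sizeof(unsigned long)` keeps the two in agreement without an #ifdef; on Linux x86 that is 4 bytes in 32-bit mode and 8 bytes in 64-bit mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CRC32C_POLY_LE 0x82F63B78u
#define SCALE_F sizeof(unsigned long)   /* stride of one word-wide step */

/* Byte-at-a-time reference, reflected bit order. */
static uint32_t crc32c_bytes(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_LE : 0u);
    }
    return crc;
}

/* Software stand-in for one word-wide CRC step (little-endian host assumed). */
static uint32_t crc32c_word(uint32_t crc, unsigned long word)
{
    unsigned long x = word ^ crc;

    for (size_t bit = 0; bit < 8 * sizeof(word); bit++)
        x = (x >> 1) ^ ((x & 1ul) ? (unsigned long)CRC32C_POLY_LE : 0ul);
    return (uint32_t)x;
}

/* Whole words first, then the leftover bytes. */
static uint32_t crc32c_scaled(uint32_t crc, const unsigned char *p, size_t len)
{
    size_t words = len / SCALE_F;   /* number of full unsigned long chunks */
    size_t tail = len % SCALE_F;    /* bytes left after the last full chunk */

    assert(p != NULL || len == 0);
    while (words--) {
        unsigned long word;

        memcpy(&word, p, sizeof(word));  /* avoids unaligned pointer casts */
        crc = crc32c_word(crc, word);
        p += sizeof(word);
    }
    if (tail)
        crc = crc32c_bytes(crc, p, tail);
    return crc;
}
```

If the chunk count were computed with a constant that did not match the stride, the loop would either stop short or run past the end of the buffer, which is why tying both to the same type is the safer choice.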

To: Ulrich Drepper <drepper@...>
Cc: <austin_zhang@...>, <herbert@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Friday, August 8, 2008 - 9:35 am

Yeah in general that's what we should do. However, this driver
is specific to Intel x86 CPUs.

However if someone were to post a patch to do this I would happily
apply it.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

To: Herbert Xu <herbert@...>
Cc: Ulrich Drepper <drepper@...>, <austin_zhang@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Monday, August 11, 2008 - 12:10 pm

I thought we supported Intel x86 CPUs in both 32-bit and 64-bit modes...?

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: <herbert@...>, <drepper@...>, <austin_zhang@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Tuesday, August 12, 2008 - 9:14 pm

Yes we do, but the original patch had ugly ifdefs that did the
same thing.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

To: Austin Zhang <austin_zhang@...>
Cc: <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 6:09 am

Applied to cryptodev-2.6. Thanks a lot Austin!
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

To: Austin Zhang <austin_zhang@...>
Cc: <herbert@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 5:42 am

To: Pavel Machek <pavel@...>
Cc: <herbert@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 7:05 am

Pavel,

Thanks for your comments.

Are you suggesting "ENODEV"? What is missing is a feature of the device, but the
device itself is present here.
Also, for the crc32c algorithm, several implementations may register
themselves with the crypto layer, and the user does not care which
implementation serves him, even when the hardware-accelerated
implementation does not exist on this processor.

--
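
The point about several implementations registering under one algorithm name can be sketched as a lookup that returns the highest-priority entry for that name. This is a simplified userspace model, not the kernel crypto API: the struct, the function, and the table contents (including the priority numbers) are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a registered algorithm implementation. */
struct alg_entry {
    const char *name;     /* generic algorithm name the user asks for */
    const char *driver;   /* specific implementation behind it */
    int priority;         /* higher wins when several share a name */
};

/* Illustrative tables: a machine with the accelerated driver, and one without. */
static const struct alg_entry with_hw[] = {
    { "crc32c", "crc32c-generic", 100 },
    { "crc32c", "crc32c-intel",   200 },
};

static const struct alg_entry sw_only[] = {
    { "crc32c", "crc32c-generic", 100 },
};

/* Return the best implementation of `name`, or NULL if none is registered. */
static const struct alg_entry *find_best(const struct alg_entry *tbl, size_t n,
                                         const char *name)
{
    const struct alg_entry *best = NULL;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) != 0)
            continue;
        if (best == NULL || tbl[i].priority > best->priority)
            best = &tbl[i];
    }
    return best;
}
```

In this model a caller asking for "crc32c" gets the accelerated entry where it is registered and the generic one elsewhere, without having to know which is which.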

To: Austin Zhang <austin_zhang@...>
Cc: <herbert@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 8:14 am

Well, it should normally go in a comment at the beginning of the file.

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: Austin Zhang <austin_zhang@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 10:00 pm

I've made the following change.

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/arch/x86/crypto/crc32c-intel.c b/arch/x86/crypto/crc32c-intel.c
index 6cd20c5..c0e1e6b 100644
--- a/arch/x86/crypto/crc32c-intel.c
+++ b/arch/x86/crypto/crc32c-intel.c
@@ -5,6 +5,15 @@
* http://www.intel.com/products/processor/manuals/
* Intel(R) 64 and IA-32 Architectures Software Developer's Manual
* Volume 2A: Instruction Set Reference, A-M
+ *
+ * Copyright (c) 2008 Austin Zhang <austin_zhang@linux.intel.com>
+ * Copyright (c) 2008 Kent Liu <kent.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
*/
#include <linux/init.h>
#include <linux/module.h>
--

To: Austin Zhang <austin_zhang@...>
Cc: Pavel Machek <pavel@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 7:17 am

Yes I think this should be ENODEV to be consistent with the
existing drivers such as padlock-aes.c.

I'll make that change in cryptodev.

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

To: Austin Zhang <austin_zhang@...>
Cc: Pavel Machek <pavel@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 7:22 am

In fact I'm going to kill that printk altogether since the fact
that this feature doesn't exist isn't an error.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
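
The two changes discussed above (return ENODEV when the CPU lacks the feature, and do so without logging an error) can be sketched as a userspace mock of a module init function. This is not the driver's actual code: the function names and the boolean parameter are hypothetical stand-ins, and the registration call is a stub.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stub standing in for registering the algorithm. */
static int register_crc32c_alg(void)
{
    return 0;
}

/*
 * Mock init routine. A missing CPU feature is not an error condition,
 * so nothing is printed; the caller just sees -ENODEV and moves on.
 */
static int crc32c_intel_init(bool cpu_has_crc32)
{
    int ret;

    if (!cpu_has_crc32)
        return -ENODEV;

    ret = register_crc32c_alg();
    assert(ret <= 0);   /* convention: 0 on success, negative errno on failure */
    return ret;
}
```

Returning a negative errno rather than a bare -1 lets the caller distinguish "hardware not present" from a genuine failure.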

To: Herbert Xu <herbert@...>
Cc: Pavel Machek <pavel@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 7:20 am

Thank you, Pavel and Herbert.

--

To: Pavel Machek <pavel@...>
Cc: Austin Zhang <austin_zhang@...>, <bunk@...>, <dwmw2@...>, <davem@...>, <randy.dunlap@...>, <linux-kernel@...>, <linux-crypto@...>
Date: Wednesday, August 6, 2008 - 7:03 am

Unfortunately I don't think le32 exists; it'd definitely be nice
to have, though.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
