Re: Kernel oops with 2.6.26, padlock and ipsec: probably problem with fpu state changes

To: Suresh Siddha <suresh.b.siddha@...>
Cc: Herbert Xu <herbert@...>, H. Peter Anvin <hpa@...>, netdev@vger.kernel.org <netdev@...>, linux-kernel@vger.kernel.org <linux-kernel@...>, Ingo Molnar <mingo@...>, viro@ZenIV.linux.org.uk <viro@...>, vegard.nossum@gmail.com <vegard.nossum@...>
Date: Tuesday, August 12, 2008 - 7:43 am

On Monday 11 August 2008, Suresh Siddha wrote:

* Works fine, the machine has been up for 61 minutes.

* Performance:

Routing performance over esp-tunnels seems unchanged here compared to 2.6.25
(this was also the case with the "kernel_fpu_begin" patch).

tcrypt mode=200 shows exactly the same performance penalty compared to 2.6.25
as the "kernel_fpu_begin" patch.

But I think this is the right way to go for 2.6.26 and probably 2.6.27. And I'm
not sure whether tcrypt really shows the whole story for 2.6.25:

a) Does it measure the cost of the unnecessary FXSAVE and FXRSTOR?
b) Does it measure the clts() and stts() calls, which happen anyway, though not
in padlock-*.c itself but in __switch_to() and math_state_restore()?

So shouldn't this patch, on the whole, make performance better compared to
2.6.25 (because it avoids FXSAVE and FXRSTOR for tasks which do not use
FPU/SSE/... in userspace)?


Here are the results for tcrypt mode=200:

===============================
testing speed of ecb(aes) encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 763 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 740 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 860 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1340 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6583 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 1542 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 1614 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 1950 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 3294 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 18214 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 753 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 781 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 949 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1621 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8658 cycles (8192 bytes)

testing speed of ecb(aes) decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 727 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 742 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 862 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1342 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6621 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 1548 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 1614 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 1950 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 3294 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 18251 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 759 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 783 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 951 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1623 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8665 cycles (8192 bytes)

testing speed of cbc(aes) encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 759 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 816 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 1088 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 2144 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12796 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 1571 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 1694 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 2198 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 4214 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 25420 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 791 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 877 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 1235 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2675 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16912 cycles (8192 bytes)

testing speed of cbc(aes) decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 740 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 795 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 1058 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 2114 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12726 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 1548 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 1670 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 2174 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 4190 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 25349 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 763 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 856 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 1214 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2654 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16846 cycles (8192 bytes)

testing speed of lrw(aes) encryption
test 0 (256 bit key, 16 byte blocks): 1 operation in 1402 cycles (16 bytes)
test 1 (256 bit key, 64 byte blocks): 1 operation in 2653 cycles (64 bytes)
test 2 (256 bit key, 256 byte blocks): 1 operation in 7576 cycles (256 bytes)
test 3 (256 bit key, 1024 byte blocks): 1 operation in 26990 cycles (1024 bytes)
test 4 (256 bit key, 8192 byte blocks): 1 operation in 209207 cycles (8192 bytes)
test 5 (320 bit key, 16 byte blocks): 1 operation in 2229 cycles (16 bytes)
test 6 (320 bit key, 64 byte blocks): 1 operation in 3730 cycles (64 bytes)
test 7 (320 bit key, 256 byte blocks): 1 operation in 9179 cycles (256 bytes)
test 8 (320 bit key, 1024 byte blocks): 1 operation in 31493 cycles (1024 bytes)
test 9 (320 bit key, 8192 byte blocks): 1 operation in 239349 cycles (8192 bytes)
test 10 (384 bit key, 16 byte blocks): 1 operation in 1435 cycles (16 bytes)
test 11 (384 bit key, 64 byte blocks): 1 operation in 2809 cycles (64 bytes)
test 12 (384 bit key, 256 byte blocks): 1 operation in 8211 cycles (256 bytes)
test 13 (384 bit key, 1024 byte blocks): 1 operation in 29425 cycles (1024 bytes)
test 14 (384 bit key, 8192 byte blocks): 1 operation in 228659 cycles (8192 bytes)

testing speed of lrw(aes) decryption
test 0 (256 bit key, 16 byte blocks): 1 operation in 1396 cycles (16 bytes)
test 1 (256 bit key, 64 byte blocks): 1 operation in 2654 cycles (64 bytes)
test 2 (256 bit key, 256 byte blocks): 1 operation in 7577 cycles (256 bytes)
test 3 (256 bit key, 1024 byte blocks): 1 operation in 27001 cycles (1024 bytes)
test 4 (256 bit key, 8192 byte blocks): 1 operation in 209225 cycles (8192 bytes)
test 5 (320 bit key, 16 byte blocks): 1 operation in 2232 cycles (16 bytes)
test 6 (320 bit key, 64 byte blocks): 1 operation in 3722 cycles (64 bytes)
test 7 (320 bit key, 256 byte blocks): 1 operation in 9279 cycles (256 bytes)
test 8 (320 bit key, 1024 byte blocks): 1 operation in 31360 cycles (1024 bytes)
test 9 (320 bit key, 8192 byte blocks): 1 operation in 239270 cycles (8192 bytes)
test 10 (384 bit key, 16 byte blocks): 1 operation in 1459 cycles (16 bytes)
test 11 (384 bit key, 64 byte blocks): 1 operation in 2862 cycles (64 bytes)
test 12 (384 bit key, 256 byte blocks): 1 operation in 8162 cycles (256 bytes)
test 13 (384 bit key, 1024 byte blocks): 1 operation in 29382 cycles (1024 bytes)
test 14 (384 bit key, 8192 byte blocks): 1 operation in 228704 cycles (8192 bytes)

testing speed of xts(aes) encryption
test 0 (256 bit key, 16 byte blocks): 1 operation in 1079 cycles (16 bytes)
test 1 (256 bit key, 64 byte blocks): 1 operation in 2075 cycles (64 bytes)
test 2 (256 bit key, 256 byte blocks): 1 operation in 5939 cycles (256 bytes)
test 3 (256 bit key, 1024 byte blocks): 1 operation in 21395 cycles (1024 bytes)
test 4 (256 bit key, 8192 byte blocks): 1 operation in 166475 cycles (8192 bytes)
test 5 (384 bit key, 16 byte blocks): 1 operation in 1155 cycles (16 bytes)
test 6 (384 bit key, 64 byte blocks): 1 operation in 2265 cycles (64 bytes)
test 7 (384 bit key, 256 byte blocks): 1 operation in 6585 cycles (256 bytes)
test 8 (384 bit key, 1024 byte blocks): 1 operation in 23865 cycles (1024 bytes)
test 9 (384 bit key, 8192 byte blocks): 1 operation in 185980 cycles (8192 bytes)
test 10 (512 bit key, 16 byte blocks): 1 operation in 1155 cycles (16 bytes)
test 11 (512 bit key, 64 byte blocks): 1 operation in 2265 cycles (64 bytes)
test 12 (512 bit key, 256 byte blocks): 1 operation in 6585 cycles (256 bytes)
test 13 (512 bit key, 1024 byte blocks): 1 operation in 23865 cycles (1024 bytes)
test 14 (512 bit key, 8192 byte blocks): 1 operation in 185969 cycles (8192 bytes)

testing speed of xts(aes) decryption
test 0 (256 bit key, 16 byte blocks): 1 operation in 1065 cycles (16 bytes)
test 1 (256 bit key, 64 byte blocks): 1 operation in 2063 cycles (64 bytes)
test 2 (256 bit key, 256 byte blocks): 1 operation in 5927 cycles (256 bytes)
test 3 (256 bit key, 1024 byte blocks): 1 operation in 21383 cycles (1024 bytes)
test 4 (256 bit key, 8192 byte blocks): 1 operation in 166463 cycles (8192 bytes)
test 5 (384 bit key, 16 byte blocks): 1 operation in 1141 cycles (16 bytes)
test 6 (384 bit key, 64 byte blocks): 1 operation in 2253 cycles (64 bytes)
test 7 (384 bit key, 256 byte blocks): 1 operation in 6573 cycles (256 bytes)
test 8 (384 bit key, 1024 byte blocks): 1 operation in 23853 cycles (1024 bytes)
test 9 (384 bit key, 8192 byte blocks): 1 operation in 185957 cycles (8192 bytes)
test 10 (512 bit key, 16 byte blocks): 1 operation in 1141 cycles (16 bytes)
test 11 (512 bit key, 64 byte blocks): 1 operation in 2253 cycles (64 bytes)
test 12 (512 bit key, 256 byte blocks): 1 operation in 6573 cycles (256 bytes)
test 13 (512 bit key, 1024 byte blocks): 1 operation in 23853 cycles (1024 bytes)
test 14 (512 bit key, 8192 byte blocks): 1 operation in 185957 cycles (8192 bytes)
===============================
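To make numbers like these easier to compare between kernels, the output can be reduced to cycles per byte. The following is only a rough sketch, not part of tcrypt: the names parse_tcrypt and cycles_per_byte are made up for illustration, and it assumes the line format is exactly the one shown above. The sample text is copied from the first and fifth ecb(aes) and cbc(aes) encryption lines.

```python
import re

# One result line, e.g.
# "test 4 (128 bit key, 8192 byte blocks): 1 operation in 6583 cycles (8192 bytes)"
LINE_RE = re.compile(
    r"test\s+(\d+)\s+\((\d+) bit key, (\d+) byte blocks\): "
    r"1 operation in (\d+) cycles \((\d+) bytes\)"
)
# Section header, e.g. "testing speed of ecb(aes) encryption"
HEADER_RE = re.compile(r"testing speed of (\S+) (encryption|decryption)")


def parse_tcrypt(text):
    """Return one dict per result line, tagged with its mode and direction."""
    rows = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            section = (header.group(1), header.group(2))
            continue
        result = LINE_RE.match(line)
        if result and section:
            test, key_bits, block, cycles, nbytes = map(int, result.groups())
            rows.append({
                "mode": section[0],
                "direction": section[1],
                "test": test,
                "key_bits": key_bits,
                "block_bytes": block,
                "cycles": cycles,
                "bytes": nbytes,
            })
    return rows


def cycles_per_byte(row):
    """Cycles spent per byte processed for one result row."""
    return row["cycles"] / row["bytes"]


# Sample lines copied from the results above (hypothetical variable name).
SAMPLE = """\
testing speed of ecb(aes) encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 763 cycles (16 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6583 cycles (8192 bytes)

testing speed of cbc(aes) encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 759 cycles (16 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12796 cycles (8192 bytes)
"""

if __name__ == "__main__":
    for row in parse_tcrypt(SAMPLE):
        print(
            f'{row["mode"]:10s} {row["direction"]:10s} '
            f'{row["key_bits"]:4d} bit {row["block_bytes"]:5d} B: '
            f'{cycles_per_byte(row):8.2f} cycles/byte'
        )
```

Running the same script over the output from two kernels and comparing the 16-byte rows should show any fixed per-call overhead most clearly, since in those rows it is spread over the fewest bytes.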


Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts