On Friday, 8 August 2008 12:36, Wolfgang Walter wrote:
It has now been running for 8 hours.
So how should we proceed?
Here is a summary:
1) 2.6.25.x works
2) 2.6.26 without padlock works
3a) 2.6.26 using padlock and ipsec crashes.
Always for the same reason:
__switch_to() wants to fxsave (which means that TS_USEDFPU has been set)
but there is no memory allocated for the FPU state.
3b) 2.6.26 with padlock code from 2.6.25: using padlock and ipsec crashes.
4) Reverting the fpu patches (from 2.6.25 to 2.6.26) fixes the problem.
5) Protecting the padlock cmds with kernel_fpu_begin(); kernel_fpu_end();
fixes the problem.
Some thoughts:
4) probably fixes the problem because memory for fxsave is always allocated.
Maybe this is also the reason why 2.6.25 and earlier work.
5) is interesting:
kernel_fpu_begin() will save the fpu state if TS_USEDFPU is set - exactly as
__unlazy_fpu in __switch_to would do. So when the crypto code calls padlock
the world is ok: if TS_USEDFPU is set then memory has been allocated.
This makes me rather sure that there is no memory corruption by the network
code and/or crypto code itself overwriting the thread_info structure and
setting TS_USEDFPU.
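The reasoning in 5) can be modelled in user space. This is only a minimal
sketch, not kernel code: the struct and all model_* names are made up for
illustration, MODEL_TS_USEDFPU merely stands in for TS_USEDFPU, and memset()
stands in for fxsave. It shows that "flag set, no memory" is the failing
state, and that a save path which finds the memory allocated is fine.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_TS_USEDFPU  0x1u  /* stands in for the kernel's TS_USEDFPU */
#define MODEL_FXSAVE_SIZE 512   /* an fxsave area is 512 bytes */

/* Hypothetical, heavily reduced view of the per-task FPU bookkeeping. */
struct model_task {
    unsigned int status;        /* may hold MODEL_TS_USEDFPU */
    unsigned char *fpu_area;    /* NULL until the task first uses the FPU */
};

/* First FPU use: allocate the save area lazily, then mark the FPU as used. */
static int model_first_fpu_use(struct model_task *t)
{
    if (!t->fpu_area) {
        t->fpu_area = calloc(1, MODEL_FXSAVE_SIZE);
        if (!t->fpu_area)
            return -1;
    }
    t->status |= MODEL_TS_USEDFPU;
    return 0;
}

/*
 * Save step as described for both __switch_to() and kernel_fpu_begin():
 * save the state only if the flag is set.
 * Returns 0 if the state was saved or nothing had to be done,
 * -1 for the crash condition (flag set but no save area).
 */
static int model_save_if_used(struct model_task *t)
{
    if (!(t->status & MODEL_TS_USEDFPU))
        return 0;                                  /* nothing to save */
    if (!t->fpu_area)
        return -1;                                 /* would oops in the kernel */
    memset(t->fpu_area, 0xAA, MODEL_FXSAVE_SIZE);  /* pretend fxsave */
    t->status &= ~MODEL_TS_USEDFPU;
    return 0;
}
```

In this model a task that went through model_first_fpu_use() can always be
saved, while a task whose flag was set by some other path returns -1. How the
flag gets set without an allocation in the real kernel is exactly the open
question.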
I don't know exactly why kernel_fpu_begin() fixes the problem, though:
Maybe preemption must be disabled while padlock's RNG, ACE, hash engine etc.
are used. Maybe just some barrier is needed which preempt_disable() provides.
The padlock programming guide states that padlock's RNG, ACE, ... internally
use SSE. In particular, "it temporary enables SSE until no further SSE
instruction or XSTORE, ... is seen". Furthermore, this may overlap with the
execution of non-SSE instructions.
It is probably best to treat the padlock opcodes as SSE instructions.
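Treating the padlock opcodes like SSE instructions would mean bracketing
every padlock operation the way 5) does. Below is a user-space sketch of that
pattern only. The two model_kernel_fpu_* functions are stand-ins that just
count nesting, model_padlock_protected() and model_xcrypt_stub() are
hypothetical names, and no real padlock instruction is issued.

```c
#include <stddef.h>

/*
 * Stand-ins for kernel_fpu_begin()/kernel_fpu_end(). They only track the
 * nesting depth so that correct pairing can be checked in user space.
 */
static int model_fpu_depth;     /* > 0 while inside a protected section */
static int model_fpu_sections;  /* number of sections entered so far */

static void model_kernel_fpu_begin(void)
{
    model_fpu_depth++;
    model_fpu_sections++;
}

static void model_kernel_fpu_end(void)
{
    model_fpu_depth--;
}

/* Hypothetical signature for "one padlock operation". */
typedef int (*model_padlock_op)(void *ctx);

/* Run one padlock operation with the FPU section held around it. */
static int model_padlock_protected(model_padlock_op op, void *ctx)
{
    int ret;

    model_kernel_fpu_begin();
    ret = op(ctx);              /* xcrypt, xstore, ... would go here */
    model_kernel_fpu_end();
    return ret;
}

/* Hypothetical operation: returns the nesting depth it was called at. */
static int model_xcrypt_stub(void *ctx)
{
    (void)ctx;
    return model_fpu_depth;
}
```

Routing every call through one wrapper keeps begin and end paired in a single
place instead of at each call site.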
What should I do now?
Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
--