The additional characteristic of the int3 instruction (compared to the
general case of a single-byte instruction) is that, when executed, it
will trigger a trap, run a trap handler and return to the original code,
typically with iret. This implies that a serializing
instruction is executed before returning to the instructions following
the modification site when the breakpoint is hit.

So I defer to Intel's expertise on the question of whether single-byte
instruction modification is safe or not in the general case. I'm just
pointing out that I can very well imagine an aggressive superscalar
architecture whose pipeline structure would support single-byte int3
patching without any problem due to the implied serialization, but would
not support the general-case single-byte modification due to its lack of
serialization.

As we might have to port this algorithm to Itanium in the near future, I
prefer to stay on the safe side. Intel's "by the book" recommendation is
more or less that a serializing instruction must be executed on all CPUs
before new code is executed, without mention of single-vs-multi byte
instructions. The int3-based bypass follows this requirement, but the
single-byte code patching does not.
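
To make the ordering concrete, here is a minimal user-space sketch of the
kind of int3-based sequence I have in mind. It assumes the usual three-step
scheme (breakpoint in, tail bytes in, first byte in), which this message
does not spell out, and it patches a plain byte array rather than live
code. The names patch_with_int3(), sync_all_cpus() and sync_calls are
hypothetical; sync_all_cpus() only counts calls where a real implementation
would make every CPU execute a serializing instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_OPCODE 0xCC  /* x86 single-byte breakpoint instruction */

/* Hypothetical stand-in: a real implementation would make every CPU
 * execute a serializing instruction here. This sketch only counts calls. */
static int sync_calls;

static void sync_all_cpus(void)
{
    sync_calls++;
}

/* Replace the len bytes at site with insn, never exposing a half-written
 * instruction without a breakpoint in front of it. Assumes a breakpoint
 * handler (not shown) deals with CPUs that hit the int3 meanwhile. */
static void patch_with_int3(unsigned char *site,
                            const unsigned char *insn, size_t len)
{
    assert(len >= 1);

    /* Step 1: put the breakpoint on the first byte, then synchronize. */
    site[0] = INT3_OPCODE;
    sync_all_cpus();

    /* Step 2: write every byte except the first, then synchronize. */
    if (len > 1) {
        memcpy(site + 1, insn + 1, len - 1);
        sync_all_cpus();
    }

    /* Step 3: replace the breakpoint with the new first byte. */
    site[0] = insn[0];
    sync_all_cpus();
}
```

With this ordering, a single-byte instruction goes through the same path as
a multi-byte one (steps 1 and 3 only), which is the "no special case"
option argued for here.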

Unless there is a visible performance gain from special-casing the
single-byte instruction, I would recommend sticking to the safest
solution, which also follows Intel's "official" guidelines.

Thanks,
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68