> * H. Peter Anvin (
hpa@zytor.com) wrote:
>> On 01/12/2010 08:26 AM, Jason Baron wrote:
>>> Add text_poke_fixup() which takes a fixup address to where a processor
>>> jumps if it hits the modifying address while code modifying.
>>> text_poke_fixup() does following steps for this purpose.
>>>
>>> 1. Setup int3 handler for fixup.
>>> 2. Put a breakpoint (int3) on the first byte of modifying region,
>>> and synchronize code on all CPUs.
>>> 3. Modify other bytes of modifying region, and synchronize code on all CPUs.
>>> 4. Modify the first byte of modifying region, and synchronize code
>>> on all CPUs.
>>> 5. Clear int3 handler.
>>>
>>
>> We (Intel OTC) have been able to get an *unofficial* answer as to the
>> validity of this procedure; specifically as it applies to Intel hardware
>> (obviously). We are working on getting an officially approved answer,
>> but as far as we currently know, the procedure as outlined above should
>> work on all Intel hardware. In fact, we believe the synchronization in
>> step 3 is in fact unnecessary (as the synchronization in step 4 provides
>> sufficient guard.)
>
> Hi Peter,
>
> This is great news! Thanks to Intel OTC and yourself for looking into
> this. In the immediate values patches, I am doing the synchronization at
> the end of step (3) to ensure that all remote CPUs issue read memory
> barriers, so the stores to the instruction are done in this order:
>
> spin lock
> store int3 to 1st byte
> smp_wmb()
> sync all cores
> store new instruction in all but 1st byte
> smp_wmb()
> issue smp_rmb() on all cores (a sync all cores has this effect)
> store new instruction to 1st byte
> send IPI to all cores (or call synchronize_sched()) to wait for all
> breakpoint handlers to complete.
> spin unlock
>
> So the question is: are these wmb/rmb pairs actually needed ? As the
> instruction fetch is not performed by instructions per se, I doubt a
> rmb() will have any effect on them. I always prefer to stay on the safe
> side, but it wouldn't hurt to know.
>