Can you guarantee that the pointer dereference cannot be optimised away
on any architecture? Without other restrictions, a sufficiently
intelligent optimiser could notice that the address of v doesn't change
in the loop and the destination is never written within the loop, so the
read could be hoisted out of the loop.
Even now, powerpc (as an example) defines atomic_t as:
typedef struct { volatile int counter; } atomic_t;
That volatile is there precisely to force the compiler to dereference it
every single time.
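
To make the difference concrete, here is a minimal sketch of the two cases. Only the atomic_t typedef comes from the powerpc definition quoted above; plain_counter_t and both function names are hypothetical, made up purely for illustration. The max_spins bound is there only so the loops terminate when run single-threaded.

```c
/* As quoted above: the counter itself is volatile. */
typedef struct { volatile int counter; } atomic_t;

/* Hypothetical non-volatile variant, for comparison only. */
typedef struct { int counter; } plain_counter_t;

/*
 * v->counter is volatile, so every evaluation of the loop condition is
 * an access the compiler has to perform; it cannot reuse an earlier load.
 */
int spin_until_nonzero(atomic_t *v, int max_spins)
{
	int spins = 0;

	while (v->counter == 0 && spins < max_spins)
		spins++;
	return spins;
}

/*
 * Nothing in this loop writes p->counter and p does not change, so an
 * optimiser is allowed to load p->counter once before the loop and test
 * the cached value. A store from another CPU might then never be seen.
 */
int spin_until_nonzero_plain(plain_counter_t *p, int max_spins)
{
	int spins = 0;

	while (p->counter == 0 && spins < max_spins)
		spins++;
	return spins;
}
```

Run single-threaded, the two functions return the same result; the difference only shows up in the generated code, and in behaviour once something else modifies the counter while the loop is running.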
Chris