No
Yes.
Right, you were correct to say my barrier() suggestion was wrong.
IOW: David is right. You need a cpu-barrier one way or the other. We
can either allow ->release() to imply one (and probably document it that
way, like we did for slow-work), or we can be explicit. I chose to be
explicit since it is kind of self-documenting, and there is no need to
be worried about performance since the release is slow-path.
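To illustrate what I mean by "explicit", here is a rough userspace sketch using C11 atomics rather than the actual kernel primitives. The names (struct obj, obj_put, obj_release) are hypothetical and are not from the patch; the seq_cst fence only stands in for what would be a full barrier such as smp_mb() in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not the structure from the patch. */
struct obj {
    atomic_int refs;
    int payload;
    void (*release)(struct obj *);
};

/* Records what release() observed, so the effect is visible to a caller. */
static int released_payload = -1;

static void obj_release(struct obj *o)
{
    released_payload = o->payload;
}

/* Drop one reference; returns true if this call invoked ->release(). */
static bool obj_put(struct obj *o)
{
    /* Release ordering publishes our prior writes to the final putter. */
    if (atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release) != 1)
        return false;

    /*
     * Explicit full barrier before ->release(), instead of relying on
     * the callback to imply one. This is the slow path, so the cost
     * should not matter.
     */
    atomic_thread_fence(memory_order_seq_cst);
    o->release(o);
    return true;
}
```

With the fence in obj_put(), the ordering is visible at the call site and a release() implementation does not have to remember to provide it.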
OTOH: If you feel strongly about it, we can take it out, knowing that
most anything that properly invalidates the memory will likely include an
implicit barrier of some kind.
Kind Regards,
-Greg