"all writes done to it before it's exposed".
You make tons of assumptions.
You assume that
(a) unlocked accesses are the normal case and are something the
allocator should prioritize/care about, and
(b) if you have a ctor, it's the only thing the allocator will do.
I don't think either of those assumptions is at all relevant or
interesting. Quite the reverse - I'd expect them to be in a very small
minority.
Now, obviously, on pretty much all machines out there (i.e. x86[-64] and UP
ARM), smp_wmb() is a no-op, so in that sense we could certainly say that
"sure, this is a total special case, but we can add a smp_wmb() anyway
since it won't cost us anything".
On the other hand, on the machines where it doesn't cost us anything, it
obviously doesn't _do_ anything either, so that argument is pretty
dubious.
And on machines where the memory ordering _can_ matter, it's going to add
cost to the wrong point.
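
To make "the wrong point" concrete, here is a minimal userspace sketch. It
uses C11 atomics as a stand-in for the kernel primitives (a release store
playing roughly the role of smp_wmb() followed by the pointer store), and
every name in it (struct obj, obj_alloc, obj_create_and_publish,
obj_read_sum, published) is made up for illustration, not taken from any
real allocator. The assumption it illustrates is that the caller usually
does more initialization after the ctor has run, so the only place that
knows the object is complete is the place that publishes it:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object type, for illustration only. */
struct obj {
	int a;
	int b;
};

/* Slot through which lockless readers find the object (starts NULL). */
static _Atomic(struct obj *) published;

/*
 * Stand-in for an allocator with a ctor. Note: no barrier here. The
 * allocator cannot know whether, or when, the object gets exposed.
 */
static struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o) {
		o->a = 0;	/* "ctor" initialization */
		o->b = 0;
	}
	return o;
}

/* The caller finishes initialization, then publishes. */
static int obj_create_and_publish(int a, int b)
{
	struct obj *o = obj_alloc();

	if (!o)
		return -1;
	o->a = a;		/* writes done after the ctor ran */
	o->b = b;
	/* Ordering is paid for here, once, covering all writes above. */
	atomic_store_explicit(&published, o, memory_order_release);
	return 0;
}

/* Lockless reader: the acquire load pairs with the release store. */
static int obj_read_sum(int *sum)
{
	struct obj *o;

	assert(sum != NULL);
	o = atomic_load_explicit(&published, memory_order_acquire);
	if (!o)
		return -1;
	*sum = o->a + o->b;
	return 0;
}
```

In this sketch a barrier inside obj_alloc() would not order the later
writes to o->a and o->b against the pointer store, so the publisher would
need its own ordering anyway, and every allocation that is only ever
touched under a lock would pay for a barrier it never needed.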
Linus