As you say, this is only a liveness issue. The atomic_inc_not_zero
guarantees that we don't take a new reference after the last one is gone.
The test on SYSFS_FLAG_REMOVED is only there to ensure that the count does
eventually get to zero.
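
To make that concrete, here is a rough userspace sketch of the pattern in
C11 atomics. It is not the actual patch: demo_dirent, DEMO_FLAG_REMOVED,
demo_inc_not_zero, demo_get_active and demo_put_active are made-up names
standing in for the sysfs structure, SYSFS_FLAG_REMOVED, atomic_inc_not_zero
and the get/put helpers, and the memory ordering chosen is only an
assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for SYSFS_FLAG_REMOVED. */
#define DEMO_FLAG_REMOVED 0x1u

/* Hypothetical stand-in for the sysfs dirent: only the two fields discussed. */
struct demo_dirent {
	atomic_int  active;	/* plays the role of s_active */
	atomic_uint flags;	/* plays the role of s_flags */
};

/* Increment *v unless it is zero. Returns true if the increment happened. */
static bool demo_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Try to take an active reference. */
static bool demo_get_active(struct demo_dirent *sd)
{
	/*
	 * Liveness only: once removal has been flagged, stop handing out
	 * new references so that the count can drain to zero. A stale read
	 * here merely delays the drain.
	 */
	if (atomic_load_explicit(&sd->flags, memory_order_relaxed) &
	    DEMO_FLAG_REMOVED)
		return false;

	/* Safety: never resurrect a count that has already reached zero. */
	return demo_inc_not_zero(&sd->active);
}

/* Drop an active reference. */
static void demo_put_active(struct demo_dirent *sd)
{
	atomic_fetch_sub(&sd->active, 1);
}
```

The point of the split is that correctness rests entirely on the
compare-and-swap loop; the flag test could be removed without ever letting a
reference be taken after the last one is gone.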
There could only be a problem here if the change to s_flags was not
propagated to all CPUs in some reasonably small time.
I'm no expert on these things, but it was my understanding that interesting
cache architectures could arbitrarily re-order accesses, but would not delay
them indefinitely.
Inserting barriers could possibly make this more predictable, but that would
just delay certain loads/stores until a known state was reached - it would
not make the data visible at another CPU any faster (or would it?).
So unless there is no cache-coherency protocol at all, I think that
SYSFS_FLAG_REMOVED will be seen promptly and that s_active will drop to zero
as quickly as it does today.
I'm pleased you find it attractive - I certainly think the
"atomic_inc_not_zero" is much more readable than the code it replaces.
Hopefully if there are really problems (maybe I've fundamentally
misunderstood caches) they can be easily resolved (a couple of memory
barriers at worst?).
Thanks for the review,
NeilBrown