A user space spinlock plays into this??? That is irrelevant to the kernel.
And we are discussing "your" placement of the invalidate_range not mine.
This is the scenario that I described before. You just need two threads.
One thread is in do_wp_page and the other is writing through the spte.
We are in do_wp_page, meaning the page is not writable. The writer will
have to take a fault, which will properly serialize access. It is a bug if
the spte would allow a write.
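
To make that argument concrete, here is a toy user-space model in C. It is
not kernel code: every name in it (toy_pte, toy_wp_fault, toy_write) is made
up for illustration, and the "lock" is only a flag standing in for whatever
lock the real fault path holds. It shows just the one point: while the entry
is write-protected, a store cannot land without first going through the
fault path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, nothing here is a kernel API. */
struct toy_pte {
    bool writable;   /* models the write bit in the pte/spte */
    bool lock_held;  /* flag standing in for the lock taken by the fault path */
    int  data;       /* models the page contents */
    int  faults;     /* number of write faults taken so far */
};

static void toy_pte_init(struct toy_pte *pte)
{
    pte->writable = false;   /* start write-protected, as in do_wp_page */
    pte->lock_held = false;
    pte->data = 0;
    pte->faults = 0;
}

/* Models the write-protect fault: the only place the write bit is set. */
static void toy_wp_fault(struct toy_pte *pte)
{
    assert(!pte->lock_held); /* a second entry would have to wait here */
    pte->lock_held = true;
    pte->faults++;
    pte->writable = true;    /* fault resolved, write access granted */
    pte->lock_held = false;
}

/* Models a store through the mapping. */
static void toy_write(struct toy_pte *pte, int value)
{
    if (!pte->writable)
        toy_wp_fault(pte);   /* write-protected: must fault first */
    assert(pte->writable);   /* a store on a read-only entry would be the bug */
    pte->data = value;
}
```

The first toy_write() on a freshly initialized entry takes exactly one fault
before the store; later writes go straight through.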
--