This is why it shouldn't belong in the sched or kthread code; the
discrepancy that you point out between p->cpus_allowed and
task_cs(p)->cpus_allowed is one created by cpusets.
So, to avoid having a task whose cpus_allowed mask is not a subset of
its cpuset's set of allowable cpus, the solution would probably be to add
a flavor of cpuset_update_task_memory_state() for a cpus generation value.
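
To make the idea concrete, here is a rough userspace model of what such a
generation check might look like. It is only a sketch, not kernel code:
every name in it (model_cpuset, model_task, cpus_generation,
cpuset_cpus_generation, model_cpuset_set_cpus,
model_cpuset_update_task_cpus_state) is hypothetical, a plain 64-bit mask
stands in for cpumask_t, and all locking is left out. Falling back to the
cpuset's full mask when the intersection is empty is likewise an assumption
of the sketch.

```c
#include <stdint.h>

/* Userspace model only; all names here are hypothetical, not kernel API. */
struct model_cpuset {
	uint64_t cpus_allowed;		/* one bit per cpu */
	unsigned int cpus_generation;	/* bumped whenever cpus_allowed changes */
};

struct model_task {
	uint64_t cpus_allowed;			/* models p->cpus_allowed */
	unsigned int cpuset_cpus_generation;	/* last generation this task saw */
	struct model_cpuset *cs;		/* models task_cs(p) */
};

/* Change the cpuset's cpus and advance its generation. */
static void model_cpuset_set_cpus(struct model_cpuset *cs, uint64_t mask)
{
	cs->cpus_allowed = mask;
	cs->cpus_generation++;
}

/*
 * If the cpuset's cpus changed since this task last looked, restrict the
 * task's mask to a subset of the cpuset's mask and record the generation.
 */
static void model_cpuset_update_task_cpus_state(struct model_task *p)
{
	uint64_t mask;

	if (p->cpuset_cpus_generation == p->cs->cpus_generation)
		return;		/* nothing changed, cheap exit */

	mask = p->cpus_allowed & p->cs->cpus_allowed;
	if (!mask)		/* no overlap: assume the cpuset's full mask */
		mask = p->cs->cpus_allowed;

	p->cpus_allowed = mask;
	p->cpuset_cpus_generation = p->cs->cpus_generation;
}
```

The point of the generation value is that the common case is a single
integer compare; the mask is only recomputed after the cpuset's cpus have
actually changed.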