Just an idea: with the atomic_ops page_cgroup patch applied, you could fold page_cgroup->lock
into page_cgroup->flags and use bit_spin_lock() on it, I think.
(My new patch set uses bit_spin_lock() on page_cgroup->flags to avoid some races.)
This would save an extra 4 bytes.
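
To make the idea concrete, here is a rough userspace sketch in C11. It is not the kernel code and not taken from any patch: the struct name, the flag names, and the pc_*() helpers are all made up for illustration, and C11 atomics stand in for the kernel's bit_spin_lock()/bit_spin_unlock() and atomic bitops. It only shows how one bit of the flags word can serve as the lock so that no separate lock field is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag layout: bit 0 doubles as the lock, the rest are state. */
enum {
	PCG_LOCK  = 0,
	PCG_CACHE = 1,
	PCG_USED  = 2,
};

/* Illustrative stand-in for struct page_cgroup: no separate lock member. */
struct page_cgroup_sketch {
	atomic_ulong flags;	/* state bits and the lock bit share one word */
};

/* Try to take the lock bit; returns true if we set it first. */
static inline bool pc_trylock(struct page_cgroup_sketch *pc)
{
	unsigned long old = atomic_fetch_or_explicit(&pc->flags,
			1UL << PCG_LOCK, memory_order_acquire);
	return !(old & (1UL << PCG_LOCK));
}

/* Spin until the lock bit is ours (userspace analogue of bit_spin_lock()). */
static inline void pc_lock(struct page_cgroup_sketch *pc)
{
	while (!pc_trylock(pc)) {
		/* Wait with plain loads so we do not hammer the cache line. */
		while (atomic_load_explicit(&pc->flags, memory_order_relaxed)
				& (1UL << PCG_LOCK))
			;
	}
}

/* Clear the lock bit with release ordering; the lock must be held. */
static inline void pc_unlock(struct page_cgroup_sketch *pc)
{
	unsigned long old = atomic_fetch_and_explicit(&pc->flags,
			~(1UL << PCG_LOCK), memory_order_release);
	assert(old & (1UL << PCG_LOCK));
	(void)old;
}

/*
 * The other flags must be updated atomically as well: a plain
 * read-modify-write on the same word could overwrite the lock bit.
 */
static inline void pc_set_flag(struct page_cgroup_sketch *pc, int bit)
{
	atomic_fetch_or_explicit(&pc->flags, 1UL << bit, memory_order_relaxed);
}

static inline bool pc_test_flag(struct page_cgroup_sketch *pc, int bit)
{
	return atomic_load_explicit(&pc->flags, memory_order_relaxed)
			& (1UL << bit);
}
```

A caller would bracket its critical section with pc_lock(pc) and pc_unlock(pc). Since the lock bit and the state bits live in the same word, every flag update has to be atomic, which is presumably why this only becomes possible on top of the atomic_ops patch.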
Thanks,
-Kame