* This patch replaces the dangerous lvalue version of cpumask_of_cpu
with new cpumask_of_cpu_ptr macros. These are patterned after the
node_to_cpumask_ptr macros.In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map
is provided when there is a large NR_CPUS count, reducing
greatly the amount of code generated and stack space used for
cpumask_of_cpu(). The pointer to the cpumask_t value is needed for
calling set_cpus_allowed_ptr() to reduce the amount of stack space
needed to pass the cpumask_t value.If there isn't a cpumask_of_cpu_map[], then a temporary variable is
declared and filled in with value from cpumask_of_cpu(cpu) as well as
a pointer variable pointing to this temporary variable. Afterwards,
the pointer is used to reference the cpumask value. The compiler
will optimize out the extra dereference through the pointer as well
as the stack space used for the pointer, resulting in identical code.A good example of the orthogonal usages is in net/sunrpc/svc.c:
case SVC_POOL_PERCPU:
{
unsigned int cpu = m->pool_to[pidx];
cpumask_of_cpu_ptr(cpumask, cpu);*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, cpumask);
return 1;
}
case SVC_POOL_PERNODE:
{
unsigned int node = m->pool_to[pidx];
node_to_cpumask_ptr(nodecpumask, node);*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, nodecpumask);
return 1;
}Based on linux-2.6.tip/master at the following commit:
commit 0a91813e16ebd5c2d9b5c2acd5b7c91742112c4f
Merge: 9a635fa... 724dce0...
Author: Ingo Molnar <mingo@elte.hu>
Date: Tue Jul 15 14:55:17 2008 +0200Signed-off-by: Mike Travis <travis@sgi.com>
* ACPI
Cc: Len Brown <len.brown@intel.com>* CPUFREQ
Cc: Dave Jones <davej@codemonkey.org.uk>* CPUMASK
Cc: Paul Jackson <pj@sgi.com>...
Hi Mike,
Should we just put cpumask_of_cpu_map[] in generic code and then have
cpumask_of_cpu() always return a cpumask_t pointer? These macros which
declare things which may be one of two types is a real penalty for code
readability.Thanks,
Rusty.
--
Hi,
I wouldn't mind it at all, and since it's almost always calling a function
that requires a cpumask_t pointer (like the cpu_* ops or set_cpus_allowed_ptr)
then there shouldn't be too many "pointer dereference" penalties. I'm just
always a bit hesitant to make too many generic changes since I have only x86
and ia64 machines to test with.But there's a few of these new "fake" pointer macros (well, at least two... ;-),
so we'll either need more of these types of macros, or we have to consider using
pointers for almost all cpumask_t args. The next jump to 16k cpus will use
2k bytes of stack space for each cpumask_t arg, instead of the current "measly"
512 bytes.Another thought I had is perhaps cpumask.h should define something that indicates
a "huge NR_CPUS count" that is used globally to trigger things like kmalloc of
cpumask variables, instead of declaring them on the stack...? Or (as has been
discussed in the past), maybe a new cpumask_t type will be needed?Thanks,
Mike--
The simple version is just a static array of [NR_CPUS] cpumask_t's. Do that,
with an override for smarter archs?AFAICT the final answer has to be a get_cpu_mask()/put_cpu_mask(), which
sleeps and doesn't nest (so we can use a pool allocator). Of course, that
kind of analysis is non-trivial, so I suggest that's not for this merge
window...Want me to try something and see if it boots?
Rusty.
--
Rusty Russell wrote:
Hi Rusty,
There are a number of occasions where a function declares a temporary cpumask_t
variable on the stack to hold (say) current->cpus_allowed. I tried a couple of
options early on to a.) reserve one or two cpumask_t variables in the task
struct; and b.) reserve one or two cpumask_t variables per cpu. Both had weird
consequences in some usages and since 4096 is *only* 512 bytes, it didn't seem
worth the effort. Our next iteration will have NR_CPUS=16384 and therefore
removing all stack declared cpumask_t variables is highly desirable.Your idea of a pool allocator is very interesting... ;-)
Thanks,
Mike
--
a fresh commit in -git has exposed the topology.h mess - see the hack
below. We now have diverging versions of topology_core_siblings()
semantics - that sure cannot be right. Mike?Ingo
------->
commit 695a6b456307455a10059512208e8ed0d376ecd3
Author: Ingo Molnar <mingo@elte.hu>
Date: Wed Jul 23 13:19:44 2008 +0200topology: work around topology_core_siblings() breakage
work around:
drivers/net/sfc/efx.c: In function ‘efx_probe_interrupts':
drivers/net/sfc/efx.c:845: error: lvalue required as unary ‘&' operandthe topology API is a mess right now ...
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
drivers/net/sfc/efx.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 45c72ee..1ababfa 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -842,8 +842,10 @@ static void efx_probe_interrupts(struct efx_nic *efx)
for_each_online_cpu(cpu) {
if (!cpu_isset(cpu, core_mask)) {
++efx->rss_queues;
+#if 0
cpus_or(core_mask, core_mask,
topology_core_siblings(cpu));
+#endif
}
}
} else {--
Ahh, yes, I see it now. If you don't define topology_core_siblings then you get:
#ifndef topology_core_siblings
#define topology_core_siblings(cpu) cpumask_of_cpu(cpu)
#endifAnd of course this is no longer an lvalue.
Rusty - if you don't think there'll be objections from other arches I can put in
a generic cpumask_of_cpu_map[].Thanks,
Mike--
also i had to do the net/sunrpc/svc.c fixup below.
Ingo
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 5a32cb7..835d274 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -310,7 +310,8 @@ svc_pool_map_set_cpumask(struct task_struct *task, unsigned int pidx)
switch (m->mode) {
case SVC_POOL_PERCPU:
{
- set_cpus_allowed_ptr(task, &cpumask_of_cpu(node));
+ cpumask_of_cpu_ptr(cpumask, node);
+ set_cpus_allowed_ptr(task, cpumask);
break;
}
case SVC_POOL_PERNODE:
--
I had sent in a patch that has this change.
Standby, the "generic" cpumask_of_cpu_map patch is coming... just testing now.
THanks,
Mike
--
Thanks Mike, that's going to be a huge improvement in clarity.
Cheers,
Rusty.
--
| Bart Van Assche | Re: Integration of SCST in the mainstream Linux kernel |
| Greg Kroah-Hartman | [PATCH 005/196] Chinese: add translation of SubmittingDrivers |
| Pavel Roskin | ndiswrapper and GPL-only symbols redux |
| Linus Torvalds | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
git: | |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Paweł Staszewski | rib_trie / Fix inflate_threshold_root. Now=15 size=11 bits |
| David Miller | Re: [PATCH 3/3] Convert the UDP hash lock to RCU |
