Hello,

Is there any way to set up IRQ masks from within a driver? myri10ge currently relies on an external script (writing to /proc/irq/*/smp_affinity) to bind each queue/MSI-X vector to a different processor. By default, Linux will either:
* round-robin the interrupts (killing the benefit of DCA, for instance)
* put all IRQs on the same CPU (killing much of the benefit of multislices)

With more and more drivers using multiqueues, I think we need a nice way to bind MSI-X vectors from within the drivers. I am not sure what's best; the attached (untested) patch would just export the existing irq_set_affinity() and add irq_get_affinity().

Comments?

thanks,
Brice
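For readers unfamiliar with the external-script approach mentioned above, here is a minimal user-space sketch in Python of what such a script might do. It is not the actual myri10ge script: the function names and the IRQ numbers in the usage note are hypothetical, and it assumes one MSI-X vector per queue with one CPU per vector. The only kernel interface it relies on is /proc/irq/<n>/smp_affinity, which takes a hexadecimal CPU bitmask written as comma-separated 32-bit groups.

```python
def cpu_mask(cpus):
    """Build the hex bitmask string for a set of CPU numbers.

    The most significant 32-bit group comes first; lower groups are
    zero-padded to eight hex digits.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()
    return ",".join(["%x" % groups[0]] + ["%08x" % g for g in groups[1:]])


def plan_queue_affinity(irqs, cpus):
    """Assign each IRQ to a single CPU, wrapping around if IRQs outnumber CPUs."""
    return {irq: cpu_mask([cpus[i % len(cpus)]]) for i, irq in enumerate(irqs)}


def apply_affinity(plan, proc_root="/proc/irq"):
    """Write the planned masks; needs root and is not called automatically."""
    for irq, mask in plan.items():
        with open("%s/%d/smp_affinity" % (proc_root, irq), "w") as f:
            f.write(mask)
```

With hypothetical IRQ numbers 90 to 92 and two CPUs, plan_queue_affinity([90, 91, 92], [0, 1]) maps IRQ 90 to CPU 0, IRQ 91 to CPU 1, and wraps IRQ 92 back to CPU 0. Planning is kept separate from writing so the mapping can be inspected before anything touches /proc.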
From: Brice Goglin <Brice.Goglin@inria.fr>

I think we should rather have some kind of generic thing in the IRQ layer that allows specifying the usage model of the device's interrupts, so that the IRQ layer can choose default affinities.

I never notice any of this complete insanity on sparc64 because we flat spread out all of the interrupts across the machine.

What we don't want is drivers choosing IRQ affinity settings. They have no idea about NUMA topology, what NUMA node the PCI controller sits behind, what CPUs are there, etc., and without that kind of knowledge you cannot possibly make affinity decisions properly.
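To make the topology argument concrete, here is a rough user-space sketch in Python of the kind of information an affinity decision needs: which NUMA node the device sits on, and which CPUs belong to that node. It is an illustration only, not a proposal for the IRQ layer. The helper names are hypothetical, and the inputs are assumed to come from sysfs files such as /sys/bus/pci/devices/<address>/numa_node and /sys/devices/system/node/node<N>/cpulist.

```python
def parse_cpulist(text):
    """Parse a sysfs-style CPU list such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def local_cpus(device_node, node_cpulists):
    """Return the CPUs on the device's NUMA node.

    node_cpulists maps node number -> cpulist string. If the device node is
    unknown (sysfs reports -1) or not in the map, fall back to every CPU.
    """
    if device_node in node_cpulists:
        return parse_cpulist(node_cpulists[device_node])
    all_cpus = []
    for text in node_cpulists.values():
        all_cpus.extend(parse_cpulist(text))
    return sorted(all_cpus)
```

For a hypothetical two-node machine described by {0: "0-3", 1: "4-7"}, a device on node 1 gets CPUs 4 to 7, while a device whose node is reported as -1 falls back to all eight CPUs.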
As long as we get something better than the current behavior, I am fine with it :)

Brice
On Thu, 28 Aug 2008 22:21:53 +0200

* do the right thing with the userspace irq balancer

--
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
It probably also needs to be hooked up to sched_mc_power_savings. When the switch is on, the interrupts shouldn't be spread out over that many sockets. Does it need callbacks to change the interrupts when that variable changes?

Also, I suspect handling SMT explicitly is a good idea. E.g., I would always set the affinity to all thread siblings in a core, not just a single one, because context switches between them are very cheap.

-Andi

--
ak@linux.intel.com
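As an illustration of the "all thread siblings in a core" idea, here is a small user-space sketch in Python. It is only a sketch under stated assumptions: the function names and IRQ numbers are hypothetical, and the sibling map is assumed to have been read beforehand from /sys/devices/system/cpu/cpu<N>/topology/thread_siblings_list. Each IRQ is assigned to a whole core (all of its hardware threads) rather than to one logical CPU.

```python
def unique_cores(thread_siblings):
    """Collapse a cpu -> sibling-list map into a list of distinct cores.

    Each core is returned as a sorted tuple of its logical CPUs, in order
    of the lowest-numbered CPU that belongs to it.
    """
    seen = set()
    cores = []
    for cpu in sorted(thread_siblings):
        group = tuple(sorted(thread_siblings[cpu]))
        if group not in seen:
            seen.add(group)
            cores.append(group)
    return cores


def plan_core_affinity(irqs, thread_siblings):
    """Assign each IRQ to every thread of one core, round-robin over cores."""
    cores = unique_cores(thread_siblings)
    return {irq: cores[i % len(cores)] for i, irq in enumerate(irqs)}
```

On a hypothetical two-core SMT machine where CPUs 0 and 4 share one core and CPUs 1 and 5 share the other, three IRQs would land on (0, 4), (1, 5), and (0, 4) again. The resulting CPU tuples would still have to be converted to a hex mask before being written to /proc/irq/<n>/smp_affinity.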
On Fri, 29 Aug 2008 18:48:12 +0200

That is what irqbalance already does today, at least for what it considers somewhat slower IRQs. For networking it still sucks, because the packet reordering logic is per logical CPU, so you still don't want to receive packets from the same "stream" over multiple logical CPUs.

--
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
That is true, but don't they also "compete" for pipeline resources?

rick jones
