In keeping with the spirit of adding new useful features when they help
solve real world problems, I've added this ability to collectl as a
direct result of a problem we were recently having when doing some
network performance testing. It turned out all the interrupts were
being processed by cpu0 (this was on an 8-core system) but all collectl
told us was the aggregate number! Once we moved to a NIC/driver that
supported multiple queues that could distribute interrupts to multiple
CPUs it only made sense to enhance collectl to let us verify that this
was indeed happening - I personally find examining /proc/interrupts for
changes more trouble than it's worth.
In any event, the following is an example of how collectl can present
this data, first summarized by CPU:
#Time Cpu0 Cpu1 Cpu2 Cpu3 Cpu4 Cpu5 Cpu6 Cpu7
12:49:55 4828 16K 1000 16K 18 16K 16K 0
12:49:56 4804 16K 1000 16K 0 16K 16K 0
12:49:57 4811 16K 1000 16K 18 16K 16K 0
12:49:58 4789 16K 1000 16K 0 17K 16K 44
and here by the type of the interrupt itself:
# INTERRUPT DETAILS
# Int Cpu0 Cpu1 Cpu2 Cpu3 Cpu4 Cpu5 Cpu6 Cpu7
Type Device(s)
12:48:50 082 0 0 0 7731 0 0 0
0 PCI-MSI-X eth2 (queue 0)
12:48:50 098 0 0 0 0 2037 0 0
0 PCI-MSI-X eth2 (queue 2)
12:48:50 122 0 0 2240 0 0 0 0
0 PCI-MSI-X eth2 (queue 5)
12:48:50 138 0 7084 0 0 0 0 0
0 PCI-MSI-X eth2 (queue 7)
12:48:50 154 0 0 0 0 0 7723 0
0 PCI-MSI-X eth3 (queue 0)
12:48:50 162 9082 0 0 0 0 0 0
0 PCI-MSI-X eth3 (queue 1)
12:48:50 178 0 0 0 0 0 0 8253
0 PCI-MSI-X eth3 (queue 3)
12:48:50 210 0 0 0 0 0 0 ...