Yes, fixing that behavior will be tough. Just consider a standard page
cache I/O that gets merged with other I/O. You would need to "split" the
interrupt time for a block I/O among the processes that benefit from it.
An added twist is that multiple processes can require the same page. Do
you split the time even further among the different requesters of a
page? Then the order in which the requests come in suddenly becomes
important. Or consider the IP packets in a network buffer: do you split
the interrupt time among the recipients?
The list goes on and on; my guess is that it will be next to impossible
to do it right. If the current situation is wrong because the irq
and softirq system time gets misaccounted, and the "correct" solution is
impossible, the only thing left to do is to stop accounting irq and
softirq time to processes.
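
To make the splitting problem a bit more concrete, here is a minimal
user-space sketch, not kernel code. The struct and both function names
(requester, split_even, charge_first) are made up for illustration, and
the two policies are just assumptions about what a "fair" split could
look like. The point is only that the same interrupt produces different
per-process numbers depending on the policy and on who asked first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping: one entry per process waiting on the I/O. */
struct requester {
	int pid;
	uint64_t irq_ns;	/* interrupt time charged so far */
};

/*
 * Policy A (assumption): split the interrupt time evenly among all
 * requesters. The remainder goes to the earliest requesters, so even
 * here the arrival order leaks into the result.
 */
static void split_even(uint64_t irq_ns, struct requester *req, size_t n)
{
	uint64_t share, rest;
	size_t i;

	if (n == 0)
		return;
	share = irq_ns / n;
	rest = irq_ns % n;
	for (i = 0; i < n; i++)
		req[i].irq_ns += share + (i < rest ? 1 : 0);
}

/*
 * Policy B (assumption): charge everything to the process that
 * triggered the I/O first; later requesters of the same page get
 * the data for free.
 */
static void charge_first(uint64_t irq_ns, struct requester *req, size_t n)
{
	if (n == 0)
		return;
	req[0].irq_ns += irq_ns;
}
```

With 10 ns of interrupt time and three requesters, policy A charges
4/3/3 while policy B charges 10/0/0. Neither number says much about
which process actually caused the work.
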
That makes sense to me; with a working TSC the overhead should be
small. But you will need to do a performance analysis to prove it.
Well, the task and cgroup information is there but what does it really
tell me? As long as the irq & softirq time can be caused by any other
process I don't see the value of this incorrect data point.
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.