Re: [RFC patch 0/5] genirq: add infrastructure for threaded interrupt handlers

Previous thread: [RFC patch 4/5] genirq: add a helper to check whether the irq thread should run by Thomas Gleixner on Wednesday, October 1, 2008 - 4:02 pm. (1 message)

Next thread: [RFC patch 1/5] genirq: make irqreturn_t an enum by Thomas Gleixner on Wednesday, October 1, 2008 - 4:02 pm. (1 message)
From: Thomas Gleixner
Date: Wednesday, October 1, 2008 - 4:02 pm

This patch series implements support for threaded irq handlers for the
generic IRQ layer.

Threaded interrupt handlers are not only interesting in the preempt-rt
context. Threaded interrupt handlers can help to address common
problems in the interrupt handling code:

 - move long running handlers out of the hard interrupt context

 - avoid complex hardirq -> tasklet/softirq interaction and locking
   problems by integration of this functionality into the threaded
   handler code

 - improved debugability of the kernel: faulty handlers do not take
   down the system.

 - allows prioritizing of the handlers which share an interrupt line

The implementation provides an opt-in mechanism to convert drivers to
the threaded interrupt handler model contrary to the preempt-rt patch
where the threaded handlers are enabled by a brute force switch. The
brute force switch is suboptimal as it does not change the interrupt
handler -> tasklet/softirq interaction problems, but was the only way
which was possible for the limited man power of the preempt-rt
developers.

Converting an interrupt to threaded makes only sense when the handler
code takes advantage of it by integrating tasklet/softirq
functionality and simplifying the locking.

When a driver wants to use threaded interrupt handlers it needs to
provide a quick check handler function, which checks whether the
interrupt was originated from the device or not.

In case it was originated from the device the quick check handler must
disable the interrupt at the device level and return
IRQ_WAKE_THREAD. The generic interrupt handling core then sets the
IRQF_RUNTHREAD in the irqaction of the handler and wakes the
associated thread.

The irqaction is referenced in the threads task_struct so handlers can
check for newly arrived interrupts in action flags. Aside of that the
reference is used to prevent further referencing of the thread in the
interrupt code in the case of segfault to make sure that the system
(minus the now dead ...
From: Andrew Morton
Date: Wednesday, October 1, 2008 - 4:23 pm

On Wed, 01 Oct 2008 23:02:08 -0000

That would be nice ;)


I'm a bit surprised to see that there is no facility for per-cpu
interrupt threads?

--

From: Arjan van de Ven
Date: Wednesday, October 1, 2008 - 4:29 pm

On Wed, 1 Oct 2008 16:23:33 -0700

per handler is the right approach (that way, if one dies, all other
interrupts will likely keep working)

now.. normally an interrupt only goes to one cpu, so effectively it is
per cpu already anyway

we should however make the irq threads follow the affinity masks of the
irq... that'd be an easy add-on and probably worthwhile.


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Andrew Morton
Date: Wednesday, October 1, 2008 - 4:40 pm

On Wed, 1 Oct 2008 16:29:50 -0700

Yes, if a) the thread was asleep when it was woken up and b) if the
scheduler does the right thing and wakes the thread on the CPU which
called wake_up().

The ongoing sagas of tbench/mysql/volanomark regressions make me think
that any behaviour which we "expect" of the scheduler should be

--

From: Thomas Gleixner
Date: Wednesday, October 1, 2008 - 4:58 pm

Yup. I missed that detail when I dusted off the moldy patches. 

Of course we need to pin the thread to the affinity mask of the
hardware interrupt.

/me goes back to do home work :)

Thanks,

	tglx
--

From: Jon Masters
Date: Wednesday, October 1, 2008 - 5:40 pm

Sven and I started poking at the various USB host drivers on the flight
back from Plumbers. I'll see if I can convert over a few here on the
systems that I have and send over some patches.

Jon.


--

From: Greg KH
Date: Thursday, October 2, 2008 - 3:07 pm

There's only 3, how hard could it be?  :)

/me runs away quickly
--

From: Ingo Oeser
Date: Wednesday, October 8, 2008 - 3:18 pm

Hi there,

[CC'ed some distribution specific experts]


Heh! That feature is the only thing, that keeps my crappy combination
of hardware and Ubuntu Hardy together :-)

With the non-RT Ubuntu kernel flavors, it just hangs in that situation. 
It looks like this at the moment with a Ubuntu RT-Kernel, in case someone 
likes to debug that:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Linux version 2.6.24-21-rt (buildd@palmer) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP PREEMPT RT Mon Aug 25 19:24:40 UTC 2008 (Ubuntu 2.6.24-4.6-generic)
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[    0.000000]  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000cfff0000 (usable)
[    0.000000]  BIOS-e820: 00000000cfff0000 - 00000000cfff3000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000cfff3000 - 00000000d0000000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000d0000000 - 00000000e0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000e8000000 - 00000000f0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
[    0.000000] Warning only 4GB will be used.
[    0.000000] Use a HIGHMEM64G enabled kernel.
[    0.000000] 3200MB HIGHMEM available.
[    0.000000] 896MB LOWMEM available.
[    0.000000] found SMP MP-table at 000f4b40
[    0.000000] Entering add_active_range(0, 0, 1048576) 0 entries of 256 used
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA             0 ->     4096
[    0.000000]   Normal       4096 ->   229376
[    0.000000]   HighMem    229376 ->  1048576
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0:        0 ->  1048576
[    0.000000] On node 0 totalpages: ...
From: Daniel Walker
Date: Wednesday, October 1, 2008 - 6:53 pm

I'm not clear on your direction here.. I don't have a problem with a
mass driver audit, which I think is what your suggesting with this patch
set .. However, a mass audit like that would push a fully real time
system out for quite some time..

I also don't see a clear connection between these changes and ultimately
removing spinlock level latency in the kernel. I realize you don't
address that in your comments, but this is part of the initiative to
remove spinlock level latency..

So with this set of changes and in terms of real time, I'm wonder your
going with this ?

Daniel

--

From: Steven Rostedt
Date: Thursday, October 2, 2008 - 8:02 am

This helps with latencies and locking. With the current scheme of hardirq, 
softirq/tasklets, there are a lot of craziness with spin_locks and 
spin_lock_irqs and mutexes.

By creating an interrupt thread, we can skip the softirq/tasklet 
altogether, and this simplifies locking.

There are other cases where threaded interrupt handlers also improve 
performance. But we will get to those in due time.

-- Steve

--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 8:48 am

Clearly threading irq handlers does have something to do with real time,

It's all connected to the removal of latency .. One part depending on

How does this simplify locking ?

Daniel

--

From: Thomas Gleixner
Date: Thursday, October 2, 2008 - 11:42 am

Clearly you have neither clue about real time nor about operating
systems in general. 

Solaris, some BSDs and MacOSX use interrupt threads. Where exactly is
the relation to realtime?

The concept of interrupt threads is nothing which is in any way
related to real time. It is a well known and pretty old concept in
operating system design.

The fact that real time operating systems benefit from interrupt
threads is a totally different topic.

	tglx
	
--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 12:04 pm

Here we go again Thomas.. You think you can have a conversation without

The very fact that you mention it in your release notes .. You mention

The fact that a direct relationship exists means that any threaded
interrupt system needs to take into account the inevitable connection to
real time since it will be used in that system as a core component.. If
you can't effectively achieve real time with your system , than that's a
problem that needs to be addressed. 

Daniel

--

From: Steven Rostedt
Date: Thursday, October 2, 2008 - 12:23 pm

Daniel, what kind of logic is this? I was already accused of being on 
crack today (but was just too much coffee). Perhaps you might be the one 
that's on crack.

I build a pipe. There exists a relationship between a pipe and crap 
running through it from my toilet. Does this mean that every time I need a 
pipe, that I need to take into account the inevitable connection to crap 
to run through it?

God, I can see the problems with my gas lines.

-- Steve

--

From: Ingo Molnar
Date: Thursday, October 2, 2008 - 12:28 pm

Well, that's clearly wrong: threaded IRQ handlers are not tied to 
real-time in any way. Yes, they can be used for RT too but as far as the 
upstream kernel is involved that's at most an afterthought.

and the "unless this patch isn't actually threading anything" bit does 
not parse at all. The patches execute hard-IRQ handlers from special 

what Thomas said was a strong but fair reaction to your obviously 
incorrect statement. What reaction did you expect?

	Ingo
--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 1:09 pm

You contradict yourself .. I said "Clearly threading irq handlers does
have something to do with real time" then you say "they can be used for 
RT too" .. So my comments are clearly correct , they have "something" to
do with real time. There exists a relationship of some kind or type.


We need less insults on this list not more.. Maybe I could understand
his reaction had I insulted him, but I didn't.. "Fair reaction" doesn't
fit in this case ..

Daniel

--

From: Steven Rostedt
Date: Thursday, October 2, 2008 - 1:14 pm

What Ingo is telling you is:

  - RT needs threaded interrupts.

  - Threaded interrupts do not need RT

My dog is an Italian Greyhound.

Italian Greyhound is a dog, but
a dog is not an Italian Greyhound.

-- Steve

--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 1:48 pm

My comments are basically bidirectional , so what your saying doesn't
make any sense .. I said basically, that dogs and "Italian Greyhounds"
have _some_ connection .. Why are we even debating this.

Daniel

--

From: Steven Rostedt
Date: Thursday, October 2, 2008 - 2:05 pm

Why are you bringing up real time in this thread?? The thread has
absolutely nothing to do with real time. This thread is about a better

Again, this thread has nothing to do with removing spinlock level latency.

You brought in this relationship with real time, just because real time 
uses threaded interrupts. This thread has nothing to do with real time. 
That is what Ingo, Thomas and myself are trying to ge through to you.

The strong reaction from Thomas is that you just brought up something that 
is completely off topic.

Basically, drop the real time topic from this thread. It's not related. 
Yes real time addresses threaded interrupts, but just because we are 
talking about threaded interrupts does not mean we are talking about real 
time.

Actually my analogy with dogs and IG's is wrong. The pipe and crap analogy 
is better. Crap uses pipes to get out of the house. But if I'm talking 
about pipes, I don't want to hear about crap.

We are debating this because YOU brought it up!

-- Steve

--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 2:30 pm

I'm concerned about the connection between the two, which is what I'm

If they are connected (which I think we established) , then it's not out
of line for me to discuss the direction of these changes as related to

You know Steven, often times you start a conversation and you have no
idea where it will end up.. You can't always control which direction it

We already debated this fact Steven. real time and this type of
threading are connected. It's not off topic to discuss connected
components.

If the intent here is to totally disconnect these threading patches from
any type of real time in the future, then that's a good answer to my
original question .. That these changes have no future what so ever in
regards to real time.

If they will be used in the future for real time then we should discuss

I don't see why you are so concerned with this.. Real time is taboo now?

Daniel

--

From: Steven Rostedt
Date: Thursday, October 2, 2008 - 3:28 pm

Well, please take that up separately. Do you see these patches going
into the -rt tree?  No, they are going in mainline. We will deal with

You are bringing up concerns about mainline changes with something that
is maintained outside the mainline tree.  Changes to mainline have never

Yes Daniel, I know. But this is not a conversation. This is a email thread
that is talking about changes to mainline. The mainline kernel developers
really don't care about any issues that these changes will do to the
real time project. The real time project is a niche, and is currently
outside the mainline tree. Hence, lets stop bothering mainline 

No Daniel, it is off topic.  The thread is not about real time issues.
This thread is about mainline. If you have an issue that these changes
will make to the current mainline tree, then please, by all means, bring

No the intent here is to handle mainline issues. The real time issues you
consistantly bring up are not important to most kernel developers. If
you have real time issues with this change, bring that up on a real 
time forum. Not in this thread.  The changes in this thread are dealing
with mainline interrupt handlers. There have been several kernel device 
driver writers who asked us to get interrupt threads in mainline. This was 
not about real time, this was about helping out mainline kernel 

Not at all, Daniel, but this thread is not the appropriate place to 
discuss your real time concerns. You are asking about what this patch has 
to do with the future real time direction. Who on this thread cares?
(besides you)

The topic for mainline patch threads needs to stay focused on mainline. 
Not on out of tree branches. When the rest of real time is in mainline, 
then sure, you can discuss those concerns then. But until that happens, 
keep to the topic of the thread. Which for now is not real time.

-- Steve

--

From: Daniel Walker
Date: Thursday, October 2, 2008 - 4:24 pm

Your speaking for a lot of developers.. It's an RFC, it's coming from
real time developers, it's real time connected, and this is the real
time development list ..


The issues I've brought up are specifically design comments/concerns
related to future directions.. I was not at all speaking to your real

Real time forum ? That's what this list is .. If you want this thread to

People that don't care about real time related comments, they can stop
reading the thread.. That's the nature of this list ..

Daniel

--

From: Thomas Gleixner
Date: Thursday, October 2, 2008 - 5:26 pm

This RFC patch is from a mainline developer/ maintainer who happens to
be a real time developer as well.

Welcome to my real time incoherent nonsense filter

	tglx
--

From: Daniel Walker
Date: Friday, October 3, 2008 - 7:44 am

You flagged this as an RFC, typically that means it's a work in



You ask for comments, than get upset when you get some .. If anyone
needs to be filtered it's you.

Daniel

--

From: Andi Kleen
Date: Thursday, October 2, 2008 - 7:46 am

I'm not sure I'm really looking forward to this brave new world
of very long running interrupt handlers. e.g. what do you

I had an old patch to handle this without threaded interrupts. 

What normally happens is when a interrupt oopses it tries to kill the
idle process which panics.  My fix was to just restart another idle
process instead of panicing.

But back then it was rejected by Linus with the argument that
a crashing interrupt handler will typically hold some lock
and the next time the interrupt happens it will deadlock on that
lock. 

Has that changed with your threaded interrupts? 

If it has changed I suspect the restart idle change could

To be honest my opinion is that it will encourage badly written interrupt
code longer term.

-Andi

-- 
ak@linux.intel.com
--

From: Greg KH
Date: Thursday, October 2, 2008 - 2:31 pm

We have this issue today with some irqs (USB is known for issue here...)

So I don't think this is a big issue, and in the end, a better idea as
it might force us to confront some of the big abusers and fix them.

thanks,

greg k-h
--

From: Arjan van de Ven
Date: Thursday, October 2, 2008 - 3:33 pm

On Thu, 2 Oct 2008 14:31:46 -0700

one of the things irq threads gives you is that 'top' will show you
which ones are eating cpu ;-)


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Andi Kleen
Date: Thursday, October 2, 2008 - 8:25 pm

oprofile does that job fine already and is imho any time preferable
for detailed analysis.

-andi

-- 
ak@linux.intel.com
--

From: Arjan van de Ven
Date: Thursday, October 2, 2008 - 8:48 pm

On Fri, 3 Oct 2008 05:25:04 +0200

while I don't disagree that oprofile will give you more detailed
results, I think there's a HUGE difference between asking a bugreporter
"can you paste a screen of 'top'" and "can you configure and run
oprofile".
CHances are good that the user already thought of top him/herself and
just reports "interrupt X is eating CPU" rather than "something seems
to be eating CPU".

I'm not going argue that this alone is enough justification for
irqthreads, but you can't deny it's an advantage.


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Andi Kleen
Date: Thursday, October 2, 2008 - 9:35 pm

Perhaps the better fix would be to make oprofile easier to configure.
I never quite understood for example why it doesn't get the symbol
table simply from the kallsyms. Or simply unpacks gzip'ed vmlinux

Do we have that many cases of runaway irqs? The only common
one I can think of is ACPI, but that is a separate thread already.

Or for networking high performance goes into polling mode
and most of the work is outside hardirq so it wouldn't be visible
in the thread statistics either. But even there livelocks are not
very common.

Also that would assume that the proposed opt in irq threads are used
for all interrupts.

-Andi
--

From: Andi Kleen
Date: Thursday, October 2, 2008 - 8:23 pm

Not sure I follow? You're saying you want to first allow very long running
IRQ handlers and then after the fact somehow fix them?  Sounds like a weird
proposal to be honest.

The problem of course is that if you have such a very slow hardirq (let's
say one that runs for tens of ms) that what happens when another
interrupt comes in? How deep is your hardware queue?  Or will you 
just lose events in that case?

I suspect in the end you'll need another "fast interrupt" anyways
if the hardware queue is not deep enough to buffer at the software
side. Sounds a bit like the existing "interrupts" vs "softirq" 
doesn't it?

So I'm just not sure what the "slow interrupt handler" will give
you longer term. It just sounds like a redundant concept to me.

The other issue is that if you allow sleeping locks in the interrupt
handler what guarantee is that it won't block for even longer than
tens of milliseconds? And lose even more events.

-Andi

-- 
ak@linux.intel.com
--

From: Sven-Thorsten Dietrich
Date: Monday, October 20, 2008 - 6:32 pm

Subject: genirq threading for ehci, ohci and uhci USB hosts.

This is a functional POC for partitioning of the USB interrupt handlers into
a quickcheck function and a threaded main-handler function.

The quickcheck performs the following actions only: 

        1. read status and stores.
        2. check if the device is asserting, disable device assert
        3. wake irq thread or return IRQ_NONE (shared IRQ not this USB)

The handler thread performs all other operations, and is conditionally
invoked via return code from quickcheck. 

A function pointer *quickcheck() has been added, as well as a status
cache, which remembers the status reported by the chip. The status must
be passed to the threaded handler, since it is cleared by the quickcheck
function.

Currently, non-reentrancy is assumed, i.e. the chip will not assert another
IRQ until the thread has completed running.

In contrast with that, I have tightened up the locking between the 
quickcheck handler and the main thread, this may be overly conservative. 

The WARN_ONCE notes when status reported by the chip does not match 
cached status, although this is expected. Its to be removed.

Note, that locking is not explicitly used for ohci as it is 
in uhci and ehci.

The ieee1394/ohci1394.c is a bit more problematic, since it does not use
struct usb_hcd, and therefore another place for the status cache must be found,
in struct ti_ohci presumably. 
I will send a separate patch for ieee1394/ohci1394.c after looking at it more.

The patch increases the code size by 100 lines, but there are some cleanups 
that account for maybe a dozen lines, which can be broken-out.

Infrastructure:

This hack in kernel/irq/handle.c needs to be looked at, set_bit failed to work
properly on my AMD x86_64 machine, not further investigated and the code path
is currently not triggered by this patch.
+			//set_bit(IRQF_WARNED_THREADED, &action->flags);
+			action->flags |= IRQF_WARNED_THREADED;

Testing:

I have copied ...
From: Jonathan Woithe
Date: Monday, October 20, 2008 - 7:07 pm

Thanks for sending this through; I will endeavour to test it over the coming
days.  I'll also see if I can test it for the case where USB and ieee1394
share IRQs.  Even though ieee1394 isn't done yet these changes should
prevent the USB-related problems we've seen in FFADO when USB shares the
firewire IRQ.

Having said that, the forthcoming ohci1394.c patch will also be interesting
since that is the other area which will greatly benefit FFADO.

Regards
  jonathan
--

From: Sven-Thorsten Dietrich
Date: Tuesday, October 21, 2008 - 3:29 am

Hi, 

I have re-released the patch set with support for ohci1394. 

I have not tested, for lack of hardware :: it compiles.

The location of status_cache is now within the ti_ohci struct.

That organization not entirely consistent with the USB side, predicated
by other variation between the drivers.

The updated patch queue is available here.

http://www.thebigcorporation.com/Sven/genirq-usb/genirq-usb-v2.tar.bz2

Bug fix:
The declaration of status_cache has been made consistent u32. 
The former declaration of unsigned short (copied from uhci-hcd.c) it was
probably too ambiguous for some architectures.

Thanks,


--

Previous thread: [RFC patch 4/5] genirq: add a helper to check whether the irq thread should run by Thomas Gleixner on Wednesday, October 1, 2008 - 4:02 pm. (1 message)

Next thread: [RFC patch 1/5] genirq: make irqreturn_t an enum by Thomas Gleixner on Wednesday, October 1, 2008 - 4:02 pm. (1 message)