Doesn't fly for -rt anymore: pic_irq_update runs under this raw lock and calls kvm_vcpu_kick which tries to wake_up some thread -> scheduling while atomic. This used to work up to 956f97cf. -rt for 2.6.31 is not yet affected, but 2.6.33 should be broken (haven't checked, using kvm-kmod over 2.6.31 ATM). I can provide a patch that restores the deferred kicking if it's acceptable for upstream. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
Well, at least is would be nice to have one for -rt. Thanks, tglx --
Here we go. Haven't run kvm.git long enough over -rt yet to say that it
was the only remaining issue, but at least it doesn't complain instantly
anymore when starting a VM.
Jan
---------->
This restores the deferred VCPU kicking before 956f97cf. We need this
over -rt as wake_up* requires non-atomic context in this configuration.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
arch/x86/kvm/i8259.c | 53 ++++++++++++++++++++++++++++++++++++--------------
arch/x86/kvm/irq.h | 1 +
2 files changed, 39 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 07771da..ca426bd 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -32,6 +32,29 @@
#include <linux/kvm_host.h>
#include "trace.h"
+static void pic_lock(struct kvm_pic *s)
+ __acquires(&s->lock)
+{
+ raw_spin_lock(&s->lock);
+}
+
+static void pic_unlock(struct kvm_pic *s)
+ __releases(&s->lock)
+{
+ bool wakeup = s->wakeup_needed;
+ struct kvm_vcpu *vcpu;
+
+ s->wakeup_needed = false;
+
+ raw_spin_unlock(&s->lock);
+
+ if (wakeup) {
+ vcpu = s->kvm->bsp_vcpu;
+ if (vcpu)
+ kvm_vcpu_kick(vcpu);
+ }
+}
+
static void pic_clear_isr(struct kvm_kpic_state *s, int irq)
{
s->isr &= ~(1 << irq);
@@ -44,19 +67,19 @@ static void pic_clear_isr(struct kvm_kpic_state *s, int irq)
* Other interrupt may be delivered to PIC while lock is dropped but
* it should be safe since PIC state is already updated at this stage.
*/
- raw_spin_unlock(&s->pics_state->lock);
+ pic_unlock(s->pics_state);
kvm_notify_acked_irq(s->pics_state->kvm, SELECT_PIC(irq), irq);
- raw_spin_lock(&s->pics_state->lock);
+ pic_lock(s->pics_state);
}
void kvm_pic_clear_isr_ack(struct kvm *kvm)
{
struct kvm_pic *s = pic_irqchip(kvm);
- raw_spin_lock(&s->lock);
+ pic_lock(s);
s->pics[0].isr_ack = 0xff;
s->pics[1].isr_ack = 0xff;
- raw_spin_unlock(&s->lock);
+ pic_unlock(s);
}
/*
@@ -157,9 +180,9 @@ static void ...Seems sane, will apply once I understand why the current code fails. -- error compiling committee.c: too many arguments to function --
Yep. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
I see. Won't we hit the same issue when we call pic functions from atomic context during the guest entry sequence? -- error compiling committee.c: too many arguments to function --
If there are such call paths, for sure. What concrete path(s) do you have in mind? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
vcpu_enter_guest() -> inject_pending_event() ->
kvm_cpu_{has,get}_interrupt() -> various pic functions if you're unlucky.
--
error compiling committee.c: too many arguments to function
--
But do they kick anyone or just check/pull information? Never saw any warnings during my tests last year (granted: with older -rt and kvm versions). Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
Probably not, kicking should be a side effect (or rather the main Well, most guests kill the pic early on. Will apply the patch. -- error compiling committee.c: too many arguments to function --
I think it needs some extension: pic_irq_request should only schedule a wake up on a rising edge of the PIC output. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
Mmh, they could if there is > 1 IRQ pending. Guess this just never triggered in real life due to typical APIC use and low IRQ load during PIC times in my tests. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
We could just ignore the wakeup in this path. It's called in vcpu context, so obviously the vcpu is awake and kicking it will only hurt your feet. Longer term, we should clear up this mess. One possible path is to move the pic/lapic/injection stuff out of the the critical section, and use vcpu->requests to close the race between querying the pic/lapic and entering the guest. -- error compiling committee.c: too many arguments to function --
Looking at kvm_vcpu_kick, this already happens: The wake queue is checked for pending waiters (ie. non if waking ourself), and no IPI is sent if we run on the same CPU like the VCPU is on. That explains why Sounds worthwhile as well. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux --
