During corner case testing, we noticed that some versions of ehca
do not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI,
if eqes are pending.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
---
This patch replaces my previous patch-set.
As Paul suggested, this version of the patch calls H_EOI directly and doesn't need
any ibmebus changes.
drivers/infiniband/hw/ehca/ehca_main.c | 11 +++++++++--
drivers/infiniband/hw/ehca/hcp_if.c | 11 +++++++++++
drivers/infiniband/hw/ehca/hcp_if.h | 1 +
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 482103e..add4ff4 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -937,6 +937,7 @@ static struct of_platform_driver ehca_driver = {
void ehca_poll_eqs(unsigned long data)
{
struct ehca_shca *shca;
+ u64 ret;
spin_lock(&shca_list_lock);
list_for_each_entry(shca, &shca_list, shca_list) {
@@ -955,8 +956,14 @@ void ehca_poll_eqs(unsigned long data)
spin_unlock_irqrestore(&eq->spinlock, flags);
max--;
} while (q_ofs == q_ofs2 && max > 0);
- if (q_ofs == q_ofs2)
- ehca_process_eq(shca, 0);
+ if (q_ofs == q_ofs2) {
+ ret = hipz_h_eoi(eq->ist);
+ if (ret != H_SUCCESS)
+ ehca_err(&shca->ib_device,
+ "bad return code EOI -"
+ "rc = %ld\n", ret);
+ tasklet_hi_schedule(&shca->eq.interrupt_task);
+ }
}
}
mod_timer(&poll_eqs_timer, round_jiffies(jiffies + HZ));
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index 5245e13..7084efd 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -933,3 +933,14 @@ u64 hipz_h_error_data(const struct ipz_adapter_handle adapter_handle,
r_cb,
0, 0, 0, 0);
}
+
+u64 ...
> During corner case testing, we noticed that some versions of ehca
> do not properly transition to interrupt done in special load situations.
> This can be resolved by periodically triggering EOI through H_EOI,
> if eqes are pending.
So just to be clear: this is a workaround for a hardware/firmware bug?
- R.
--
> > So just to be clear: this is a workaround for a hardware/firmware bug?
> Yes it is.
OK, so paulus et al... does it seem like a good approach to call H_EOI
from driver code (given that this driver makes tons of other hcalls)?
How critical is this? Since you said "corner case testing" I suspect we
can defer this to 2.6.27 and maybe get it into -stable later?
Also, out of curiosity:
> +u64 hipz_h_eoi(int irq)
> +{
> + int value;
> + unsigned long xirr;
> +
> + iosync();
what is the iosync() required for here?
> + value = (0xff << 24) | irq;
> + xirr = value & 0xffffffff;
given that irq and value are ints, is there any possible way value could
have bits outside of the low 32 set? If you're worried about sign
extension isn't it simpler to just make value unsigned?
> + return plpar_hcall_norets(H_EOI, xirr);
> +}
ie why not:
u64 hipz_h_eoi(int irq)
{
	unsigned xirr;

	iosync();
	xirr = (0xff << 24) | irq;

	return plpar_hcall_norets(H_EOI, xirr);
}
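The sign-extension question above can be checked in user space. The following is a minimal sketch, not driver code: the three helper names are hypothetical, it assumes a 32-bit int with two's-complement wraparound (as with gcc and clang), and it uses unsigned constants to avoid shifting into the sign bit. It compares a plain widening of the signed value, the masked form from the posted patch, and the unsigned form suggested above.

```c
#include <stdint.h>

/* Illustrative helpers only; these names do not exist in the ehca driver. */

/* Hazard: widening a negative 32-bit int to 64 bits copies the sign bit
 * into the upper 32 bits. */
static inline uint64_t xirr_sign_extended(int irq)
{
	/* Assumes 32-bit int and two's-complement wraparound on the cast. */
	int value = (int)((0xffu << 24) | (uint32_t)irq);

	return (uint64_t)(int64_t)value;
}

/* Posted patch: the unsigned mask converts value to unsigned before the
 * widening, so the upper 32 bits stay clear. */
static inline uint64_t xirr_masked(int irq)
{
	int value = (int)((0xffu << 24) | (uint32_t)irq);

	return value & 0xffffffffu;
}

/* Suggested form: keep the value unsigned from the start, so no mask
 * is needed. */
static inline uint64_t xirr_unsigned(int irq)
{
	uint32_t xirr = (0xffu << 24) | (uint32_t)irq;

	return xirr;
}
```

For irq = 0x123, the masked and unsigned forms both give 0xff000123, while the plain signed widening gives 0xffffffffff000123. Under the stated assumptions the mask in the posted patch is therefore not redundant with respect to a signed intermediate, but it becomes unnecessary once the intermediate is unsigned.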
--
If this workaround is approved by the relevant hardware & firmware folks,
and demonstrably fixes the bug, then I am happy with it.
Paul.
--
Hi Roland,
Yeah, you are right. I will change that in the final patch.
I will send the final patch soon.
regards
Stefan
--
This patch is fine with me as long as the FW/HW people can confirm that
issuing spurious EOIs like that will not affect other interrupts.
The side effect of writing 0xff to the xirr should be irrelevant as long as
this is not done from within a HW interrupt handler (timer interrupts or
softirqs are fine).
Due to the already incestuous relationship between HCA and the hypervisor,
I don't mind having the H call directly in the driver.
So as long as the FW/HW people are ok with that workaround, then it has my
ack as well.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
--
If this is OKed by the hypervisor team, then you can add:
Acked-by: Paul Mackerras <paulus@samba.org>
Paul.
--
