Re: [PATCH 1/2] pci/irq: restore mask_bits in msi shutdown -v3

Previous thread: GOOD DAY by SENATE HOUSE on Tuesday, April 22, 2008 - 8:34 pm. (1 message)

Next thread: [PATCH] mutex-debug: Check mutex magic before owner by Jeremy Kerr on Tuesday, April 22, 2008 - 10:38 pm. (1 message)
From: Yinghai Lu
Date: Tuesday, April 22, 2008 - 9:48 pm

this change

| commit 23a274c8a5adafc74a66f16988776fc7dd6f6e51
| Author: Prakash, Sathya <sathya.prakash@lsi.com>
| Date:   Fri Mar 7 15:53:21 2008 +0530
|
|     [SCSI] mpt fusion: Enable MSI by default for SAS controllers
|
|     This patch modifies the driver to enable MSI by default for all SAS chips.
|
|     Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
|     Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
cause kexec RHEL 5.1 kernel fail.

root casue: the rhel 5.1 kernel still use INTx emulation.
and mptscsih_shutdown doesn't call pci_disable_msi to reenable INTx on kexec path

so try to call pci_disable_msi in shutdown patch

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -360,6 +360,8 @@ static void pci_device_shutdown(struct d
 
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
+	pci_disable_msi_simple(pci_dev);
+	pci_disable_msix_simple(pci_dev);
 }
 
 /**
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -722,9 +722,11 @@ static inline void pci_restore_msi_state
 { }
 #else
 extern int pci_enable_msi(struct pci_dev *dev);
+extern void pci_disable_msi_simple(struct pci_dev *dev);
 extern void pci_disable_msi(struct pci_dev *dev);
 extern int pci_enable_msix(struct pci_dev *dev,
 	struct msix_entry *entries, int nvec);
+extern void pci_disable_msix_simple(struct pci_dev *dev);
 extern void pci_disable_msix(struct pci_dev *dev);
 extern void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 extern void pci_restore_msi_state(struct pci_dev *dev);
Index: linux-2.6/drivers/pci/msi.c
===================================================================
--- ...
From: Michael Ellerman
Date: Tuesday, April 22, 2008 - 10:24 pm

How is kdump going to work? Your shutdown routine won't be called and
you'll have the same problem in the 2nd kernel, won't you?

cheers

--=20
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
From: Yinghai Lu
Date: Tuesday, April 22, 2008 - 11:08 pm

On Tue, Apr 22, 2008 at 10:24 PM, Michael Ellerman

for kdump, the first kernel and second kernel should be same version.

this one is for using kexec to load other kernel that may not have msi
support etc.

YH
--

From: Eric W. Biederman
Date: Wednesday, April 23, 2008 - 6:08 am

Taking a quick look our current msi initialization appears robust in
not assuming the state of the msi config bits.

So the only remaining problem is running older software that 
assumes the msi config bits are in the state they should be in
out of reset.

YH on that score it appears I goofed a little when I gave you
my suggestion on how to fix this in pci_disable_msi.

If we have crazy hardware that supports multi irqs in with
a plain msi capability.  During initialization we mask
all of the irqs.

from msi_capability_init:
        if (entry->msi_attrib.maskbit) {
                unsigned int maskbits, temp;
                /* All MSIs are unmasked by default, Mask them all */
                pci_read_config_dword(dev,
                        msi_mask_bits_reg(pos, is_64bit_address(control)),
                        &maskbits);
                temp = (1 << multi_msi_capable(control));
                temp = ((temp - 1) & ~temp);
                maskbits |= temp;
                pci_write_config_dword(dev,
                        msi_mask_bits_reg(pos, is_64bit_address(control)),
                        maskbits);
        }

So it appears to truly return to the reset state we should unmask
them all, instead of just that one.  Not that it matters in practice,
but handling that corner case would be polite.

Eric
--

From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 10:31 am

On Wed, Apr 23, 2008 at 6:08 AM, Eric W. Biederman

will extend that a little bit.

YH
--

From: Michael Ellerman
Date: Wednesday, April 23, 2008 - 5:17 pm

But does that help us? What if the device driver in the 2nd kernel
assumes it's using INTX, when in fact MSI is enabled on the device. In
that case none of the MSI code will even be called AFAIK.

cheers

--=20
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 6:22 pm

On Wed, Apr 23, 2008 at 5:17 PM, Michael Ellerman

ok. for kdump case, if driver can not use msi (?), with pci=nomsi
we need to call pci_intx_for_msi(dev, 1) at beginning of second kernel.

will produce another patch about that case.

YH
--

From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 7:11 pm

attached kernel should fix second kernel by kdump doesn't have msi
support problem.

anyway kexec path doesn't need attached one, because shutdown in first
kernel already did pci_msi_shutdown and it called pci_initx_for_msi

also long term should teach every driver call pci_intx_for_msi(dev, 1)?

YH
From: Eric W. Biederman
Date: Wednesday, April 23, 2008 - 5:57 am

Ok this looks like a reasonable approach.

Could you please change how this is factored.
And implement a pci_shutdown_msi and a pci_shutdown_msix that
just performs the hardware state change.

Then have pci_disable_msi and pci_disable_msix call them?

That should be much easier to maintain then a adding a function
that takes a magic flag. 

That is the design of the shutdown interface and it seems to
work well.

Eric
--

From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 10:32 am

On Wed, Apr 23, 2008 at 5:57 AM, Eric W. Biederman

will check that.

YH
--

From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 2:56 pm

Yinghai found after using 2.6.25-rc3 later to kexec RHEL 5.1,
NIC can not be used.

bisected to

| commit 89d694b9dbe769ca1004e01db0ca43964806a611
| Author: Thomas Gleixner <tglx@linutronix.de>
| Date:   Mon Feb 18 18:25:17 2008 +0100
|
|   genirq: do not leave interupts enabled on free_irq
|
|   The default_disable() function was changed in commit:
|
|    76d2160147f43f982dfe881404cfde9fd0a9da21
|    genirq: do not mask interrupts by default
|
|   It removed the mask function in favour of the default delayed
|   interrupt disabling. Unfortunately this also broke the shutdown in
|   free_irq() when the last handler is removed from the interrupt for
|   those architectures which rely on the default implementations. Now we
|   can end up with a enabled interrupt line after the last handler was
|   removed, which can result in spurious interrupts.
|
|   Fix this by adding a default_shutdown function, which is only
|   installed, when the irqchip implementation does provide neither a
|   shutdown nor a disable function.
|
|   [@stable: affected versions: .21 - .24 ]

for MSI, default_shutdown will call mask_bit for msi device.  so all mask bits
will left disabled after free_irq.  then if kexec next kernel that only can
use msi_enable bit.  all device's MSI can not be used.

want to try to restore MSI mask bits that is saved before using msi in first
kernel.

Eric said:
This is over complicated and for hardware that erroneously triggers
a msi irq after free_irq may have potential problems.

So lets do the much simpler, much safer, and more general method of
restoring the mask bit to it's pci reset defined value (enabled) when
we disable the kernels use of msi.

it will work, because pci_diable_msi is called after free_irq is called.

v3: extend msi_set_mask_bit to msi_set_mask_bits to take mask, so we can fully
restore that to 0x00 instead of 0xfe.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

Index: ...
From: Yinghai Lu
Date: Wednesday, April 23, 2008 - 2:58 pm

[PATCH 2/2] pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2

this change

| commit 23a274c8a5adafc74a66f16988776fc7dd6f6e51
| Author: Prakash, Sathya <sathya.prakash@lsi.com>
| Date:   Fri Mar 7 15:53:21 2008 +0530
|
|     [SCSI] mpt fusion: Enable MSI by default for SAS controllers
|
|     This patch modifies the driver to enable MSI by default for all SAS chips.
|
|     Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
|     Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
cause kexec RHEL 5.1 kernel fail.

root casue: the rhel 5.1 kernel still use INTx emulation.
and mptscsih_shutdown doesn't call pci_disable_msi to reenable INTx on kexec path

so try to call pci_msi_shutdown in shutdown patch
do the same thing to msix

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -360,6 +360,8 @@ static void pci_device_shutdown(struct d
 
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
+	pci_msi_shutdown(pci_dev);
+	pci_msix_shutdown(pci_dev);
 }
 
 /**
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -703,6 +703,8 @@ static inline int pci_enable_msi(struct 
 	return -1;
 }
 
+static inline void pci_msi_shutdown(struct pci_dev *dev)
+{ }
 static inline void pci_disable_msi(struct pci_dev *dev)
 { }
 
@@ -712,6 +714,8 @@ static inline int pci_enable_msix(struct
 	return -1;
 }
 
+static inline void pci_msix_shutdown(struct pci_dev *dev)
+{ }
 static inline void pci_disable_msix(struct pci_dev *dev)
 { }
 
@@ -722,9 +726,11 @@ static inline void pci_restore_msi_state
 { }
 #else
 extern int pci_enable_msi(struct pci_dev *dev);
+extern void pci_msi_shutdown(struct pci_dev *dev);
 ...
From: Yinghai Lu
Date: Thursday, April 24, 2008 - 5:40 pm

andrew,

this one should replace

[PATCH] x86_64: restore mask_bits in msi shutdown

in -mm

YH
--

From: Jesse Barnes
Date: Friday, April 25, 2008 - 2:48 pm

Hi Yinghai, I've been thinking about these patches a bit...  They seem like an 
important bug fix (making sure kexec'd kernels work), but I'm a bit worried 
that the kexec'd kernel can't handle potentially broken MSI/INTx setups.  
Shouldn't the kexec'd kernel be a bit more robust?  I guess in this case 
you're kexec'ing an old kernel, so there's not much we can do, but it still 
makes me a little uneasy.

I guess for this particular set it doesn't matter much, since we should be 
restoring things in pci_msi*_shutdown and pci_shutdown_device either way.  
Can you clean up the changelog a bit and maybe make it more concise?  E.g. we 
probably don't need the whole commit message for the bisect, and we want to 
be clearer about what the failure mode is w/o the changes...

Thanks,
Jesse
--

From: Yinghai Lu
Date: Friday, April 25, 2008 - 3:08 pm

Yes, it is important, and should be in 2.6.25 stable too.


without it, the second kernel can not use the devices, if second
kernel could use MSI but doesn't touch mask_bit.

---
[PATCH 1/2] pci/irq: restore mask_bits in msi shutdown -v3

Yinghai found after using 2.6.25-rc3 later to kexec RHEL 5.1,
NIC can not be used.

bisected to

| commit 89d694b9dbe769ca1004e01db0ca43964806a611
| Author: Thomas Gleixner <tglx@linutronix.de>
| Date:   Mon Feb 18 18:25:17 2008 +0100
|
|   genirq: do not leave interupts enabled on free_irq
|
|   The default_disable() function was changed in commit:
|
|    76d2160147f43f982dfe881404cfde9fd0a9da21
|    genirq: do not mask interrupts by default
|

for MSI, default_shutdown will call mask_bit for msi device.  so all mask bits
will left disabled after free_irq.  then if kexec next kernel that only can
use msi_enable bit.  all device's MSI can not be used.

So lets to restore the mask bit to it's pci reset defined value (enabled) when
we disable the kernels use of msi.

extend msi_set_mask_bit to msi_set_mask_bits to take mask, so we can fully
restore that to 0x00 instead of 0xfe.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

---

YH
--

From: Jesse Barnes
Date: Tuesday, April 29, 2008 - 9:13 am

Applied both (fixed up the changelog a little though), thanks.

Jesse
--

Previous thread: GOOD DAY by SENATE HOUSE on Tuesday, April 22, 2008 - 8:34 pm. (1 message)

Next thread: [PATCH] mutex-debug: Check mutex magic before owner by Jeremy Kerr on Tuesday, April 22, 2008 - 10:38 pm. (1 message)