Re: [Stable-review] [19/39] e100: Use pci pool to work around GFP_ATOMIC order 5 memory allocation failure

Previous thread: [20/39] e100: Fix broken cbs accounting due to missing memset. by Greg KH on Tuesday, January 5, 2010 - 1:02 pm. (1 message)

Next thread: [12/39] SCSI: fc class: fix fc_transport_init error handling by Greg KH on Tuesday, January 5, 2010 - 1:02 pm. (1 message)
From: Greg KH
Date: Tuesday, January 5, 2010 - 1:02 pm

2.6.31-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Roger Oksanen <roger.oksanen@cs.helsinki.fi>

commit 98468efddb101f8a29af974101c17ba513b07be1 upstream.

pci_alloc_consistent uses GFP_ATOMIC allocation that may fail on some systems
with limited memory (Bug #14265). pci_pool_alloc allows waiting with
GFP_KERNEL.

Tested-by: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Signed-off-by: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/net/e100.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -156,6 +156,7 @@
 #include <linux/init.h>
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/mii.h>
@@ -601,6 +602,7 @@ struct nic {
 	struct mem *mem;
 	dma_addr_t dma_addr;
 
+	struct pci_pool *cbs_pool;
 	dma_addr_t cbs_dma_addr;
 	u8 adaptive_ifs;
 	u8 tx_threshold;
@@ -1779,9 +1781,7 @@ static void e100_clean_cbs(struct nic *n
 			nic->cb_to_clean = nic->cb_to_clean->next;
 			nic->cbs_avail++;
 		}
-		pci_free_consistent(nic->pdev,
-			sizeof(struct cb) * nic->params.cbs.count,
-			nic->cbs, nic->cbs_dma_addr);
+		pci_pool_free(nic->cbs_pool, nic->cbs, nic->cbs_dma_addr);
 		nic->cbs = NULL;
 		nic->cbs_avail = 0;
 	}
@@ -1799,8 +1799,8 @@ static int e100_alloc_cbs(struct nic *ni
 	nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = NULL;
 	nic->cbs_avail = 0;
 
-	nic->cbs = pci_alloc_consistent(nic->pdev,
-		sizeof(struct cb) * count, &nic->cbs_dma_addr);
+	nic->cbs = pci_pool_alloc(nic->cbs_pool, GFP_KERNEL,
+				  &nic->cbs_dma_addr);
 	if (!nic->cbs)
 		return -ENOMEM;
 
@@ -2827,7 +2827,11 @@ static int __devinit e100_probe(struct p
 		DPRINTK(PROBE, ERR, "Cannot register net device, ...
From: Stephen Hemminger
Date: Monday, March 15, 2010 - 2:29 pm

On Tue, 05 Jan 2010 12:02:15 -0800


These two e100 patches in 2.6.31.10 (and 2.6.32) caused kernel panic on one customer
system. I recommend they be reverted in next --stable update.



The user system configuration and backtrace were.
---
# lspci -v
00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
        Subsystem: ASUSTeK Computer Inc. A7V266-E Mainboard
        Flags: bus master, medium devsel, latency 0
        Memory at fc000000 (32-bit, prefetchable) [size=32M]
        Capabilities: [a0] AGP version 2.0
        Capabilities: [c0] Power Management version 2
        Kernel driver in use: agpgart-via
        Kernel modules: via-agp

00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP] (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, medium devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Capabilities: [80] Power Management version 2
        Kernel modules: shpchp

00:05.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
        Subsystem: ASUSTeK Computer Inc. CMI8738 6ch-MX
        Flags: bus master, stepping, medium devsel, latency 32, IRQ 11
        I/O ports at d800 [size=256]
        Capabilities: [c0] Power Management version 2

00:06.0 Mass storage controller: Promise Technology, Inc. PDC20265 (FastTrak100 Lite/Ultra100) (rev 02)
        Subsystem: Promise Technology, Inc. Ultra100
        Flags: bus master, medium devsel, latency 32, IRQ 12
        I/O ports at d400 [size=8]
        I/O ports at d000 [size=4]
        I/O ports at b800 [size=8]
        I/O ports at b400 [size=4]
        I/O ports at b000 [size=64]
        Memory at fb000000 (32-bit, non-prefetchable) [size=128K]
        [virtual] Expansion ROM at 80020000 [disabled] [size=64K]
        Capabilities: [58] Power Management version 1
        Kernel driver in use: pata_pdc202xx_old
        Kernel modules: pata_pdc202xx_old

00:0c.0 VGA compatible controller: ...

From: Stephen Hemminger <shemminger@vyatta.com>

There was a subsequent fix that explicitly zeros out the memory.
The problem was that whilst pci_alloc_consistent() zeros out
the memory it returns, the pci pool stuff does not.

So please get that fix sent to -stable instead of the revert.
For reference:

commit 70abc8cb90e679d8519721e2761d8366a18212a6
Author: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
Date:   Fri Dec 18 20:18:21 2009 -0800

    e100: Fix broken cbs accounting due to missing memset.
    
    Alan Stern noticed that e100 caused slab corruption.
    commit 98468efddb101f8a29af974101c17ba513b07be1 changed
    the allocation of cbs to use dma pools that don't return zeroed memory,
    especially the cb->status field used to track which cb to clean, causing
    (the visible) double freeing of skbs and a wrong free cbs count.
    
    Now the cbs are explicitly zeroed at allocation time.
    
    Reported-by: Alan Stern <stern@rowland.harvard.edu>
    Tested-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
    Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

--

From: Stephen Hemminger
Date: Monday, March 15, 2010 - 2:36 pm

On Mon, 15 Mar 2010 14:32:25 -0700 (PDT)

The kernel has both fixes in it. The customer reported that if both
were reverted, the kernel panic went away.


commit 1bfc1db036675e61af0ea34d3ac18206de566b64
Author: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
Date:   Fri Dec 18 20:18:21 2009 -0800

    e100: Fix broken cbs accounting due to missing memset.
    
    commit 70abc8cb90e679d8519721e2761d8366a18212a6 upstream.
    
    Alan Stern noticed that e100 caused slab corruption.
    commit 98468efddb101f8a29af974101c17ba513b07be1 changed
    the allocation of cbs to use dma pools that don't return zeroed memory,
    especially the cb->status field used to track which cb to clean, causing
    (the visible) double freeing of skbs and a wrong free cbs count.
    
    Now the cbs are explicitly zeroed at allocation time.
    
    Reported-by: Alan Stern <stern@rowland.harvard.edu>
    Tested-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
    Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

commit 550b1d3896894543cc13dafe6910119024177482
Author: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
Date:   Sun Nov 29 17:17:29 2009 -0800

    e100: Use pci pool to work around GFP_ATOMIC order 5 memory allocation failu
    
    commit 98468efddb101f8a29af974101c17ba513b07be1 upstream.
    
    pci_alloc_consistent uses GFP_ATOMIC allocation that may fail on some system
    with limited memory (Bug #14265). pci_pool_alloc allows waiting with
    GFP_KERNEL.
    
    Tested-by: Karol Lewandowski <karol.k.lewandowski@gmail.com>
    Signed-off-by: Roger Oksanen <roger.oksanen@cs.helsinki.fi>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


--


From: Stephen Hemminger <shemminger@vyatta.com>

Please recheck that as your backtrace matches exactly the
crash signature fixed by the memset().
--


From: David Miller <davem@davemloft.net>

As an update, after some auditing I found that ring parameter changes
aren't handled correctly by the PCI pool changes and that might
explain the crash.

I'll push the following fix around as soon as possible:

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index a26ccab..b997e57 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2858,7 +2858,7 @@ static int __devinit e100_probe(struct pci_dev *pdev,
 	}
 	nic->cbs_pool = pci_pool_create(netdev->name,
 			   nic->pdev,
-			   nic->params.cbs.count * sizeof(struct cb),
+			   nic->params.cbs.max * sizeof(struct cb),
 			   sizeof(u32),
 			   0);
 	DPRINTK(PROBE, INFO, "addr 0x%llx, irq %d, MAC addr %pM\n",
--

From: Stephen Hemminger
Date: Monday, March 15, 2010 - 3:25 pm

On Mon, 15 Mar 2010 15:20:37 -0700 (PDT)

I will cherry pick it back into our kernel for validation.
--

Previous thread: [20/39] e100: Fix broken cbs accounting due to missing memset. by Greg KH on Tuesday, January 5, 2010 - 1:02 pm. (1 message)

Next thread: [12/39] SCSI: fc class: fix fc_transport_init error handling by Greg KH on Tuesday, January 5, 2010 - 1:02 pm. (1 message)