Re: [RFC PATCH 03/11] ppc: Create ops to choose between direct window and iommu based on device mask

Previous thread: Firmware versioning best practices: ath3k-2.fw rename or replace ath3k-1.fw ? by Luis R. Rodriguez on Friday, October 8, 2010 - 10:02 am. (14 messages)

Next thread: [BUGFIX] memcg CPU hotplug lockdep warning fix by Balbir Singh on Friday, October 8, 2010 - 10:49 am. (4 messages)
From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Hi,

The following series, which builds upon the series of cleanups I posted
on 9/15 as "ppc iommu cleanups", enables the pseries firmware feature
dynamic dma windows. This feature will allow future devices to have a
64-bit DMA mapping covering all memory, coexisting with a smaller IOMMU
window in 32-bit PCI space.

Comments requested and welcome!

Nishanth Aravamudan (11):
  macio: ensure all dma routines get copied over
  ppc: allow direct and iommu to coexist
  ppc: Create ops to choose between direct window and iommu based on
    device mask
  ppc: add memory_hotplug_max
  ppc: do not search for dma-window property on dlpar remove
  ppc: checking for pdn->parent is redundant
  ppc/iommu: do not need to check for dma_window == NULL
  ppc/iommu: remove unneeded pci_dma_bus_setup_pSeriesLP
  ppc/iommu: pass phb only to iommu_table_setparms_lpar
  ppc/iommu: add routines to pseries iommu to map tces 1-1
  ppc: add dynamic dma window support

 arch/powerpc/include/asm/device.h      |   20 +-
 arch/powerpc/include/asm/dma-mapping.h |    6 +-
 arch/powerpc/include/asm/iommu.h       |    6 +-
 arch/powerpc/include/asm/mmzone.h      |    5 +
 arch/powerpc/kernel/Makefile           |    2 +-
 arch/powerpc/kernel/dma-choose64.c     |  167 +++++++++++
 arch/powerpc/mm/numa.c                 |   26 ++
 arch/powerpc/platforms/pseries/iommu.c |  475 +++++++++++++++++++++++++++----
 drivers/macintosh/macio_asic.c         |    7 +-
 9 files changed, 635 insertions(+), 79 deletions(-)
 create mode 100644 arch/powerpc/kernel/dma-choose64.c

--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Also add a comment to dev_archdata, indicating that changes there need
to be verified against the driver code.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/include/asm/device.h |    6 ++++++
 drivers/macintosh/macio_asic.c    |    7 +++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index a3954e4..16d25c0 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -9,6 +9,12 @@
 struct dma_map_ops;
 struct device_node;
 
+/*
+ * Arch extensions to struct device.
+ *
+ * When adding fields, consider macio_add_one_device in
+ * drivers/macintosh/macio_asic.c
+ */
 struct dev_archdata {
 	/* DMA operations on that device */
 	struct dma_map_ops	*dma_ops;
diff --git a/drivers/macintosh/macio_asic.c b/drivers/macintosh/macio_asic.c
index b6e7ddc..18bf7a9 100644
--- a/drivers/macintosh/macio_asic.c
+++ b/drivers/macintosh/macio_asic.c
@@ -387,11 +387,10 @@ static struct macio_dev * macio_add_one_device(struct macio_chip *chip,
 	/* Set the DMA ops to the ones from the PCI device, this could be
 	 * fishy if we didn't know that on PowerMac it's always direct ops
 	 * or iommu ops that will work fine
+         *
+         * To get all the fields, copy all archdata
 	 */
-	dev->ofdev.dev.archdata.dma_ops =
-		chip->lbus.pdev->dev.archdata.dma_ops;
-	dev->ofdev.dev.archdata.dma_data =
-		chip->lbus.pdev->dev.archdata.dma_data;
+        dev->ofdev.dev.archdata = chip->lbus.pdev->dev.archdata;
 #endif /* CONFIG_PCI */
 
 #ifdef DEBUG
-- 
1.7.1

--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Replace the union with just the multiple fields, ifdef on CONFIG_PPC64.

Future pseries boxes will allow a 64 bit dma mapping covering all
memory, coexisting with a smaller iommu window in 32 bit pci space.

The cell fixed mapping would also like both to coexist.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
I used the ifdef guard of CONFIG_PPC64 according to the current makefile
for iommu.c.  One set is burried in the middle of iommu.h.
---
 arch/powerpc/include/asm/device.h      |   14 ++++++--------
 arch/powerpc/include/asm/dma-mapping.h |    4 ++--
 arch/powerpc/include/asm/iommu.h       |    6 ++++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index 16d25c0..ed883ea 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -19,14 +19,12 @@ struct dev_archdata {
 	/* DMA operations on that device */
 	struct dma_map_ops	*dma_ops;
 
-	/*
-	 * When an iommu is in use, dma_data is used as a ptr to the base of the
-	 * iommu_table.  Otherwise, it is a simple numerical offset.
-	 */
-	union {
-		dma_addr_t	dma_offset;
-		void		*iommu_table_base;
-	} dma_data;
+	/* dma_offset is used by swiotlb and direct dma ops, but no iommu */
+	dma_addr_t	dma_offset;
+
+#ifdef CONFIG_PPC64
+	void		*iommu_table_base;
+#endif
 
 #ifdef CONFIG_SWIOTLB
 	dma_addr_t		max_direct_dma_addr;
diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 8c9c6ad..644103a 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -100,7 +100,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
 static inline dma_addr_t get_dma_offset(struct device *dev)
 {
 	if (dev)
-		return dev->archdata.dma_data.dma_offset;
+		return dev->archdata.dma_offset;
 
 	return PCI_DRAM_OFFSET;
 }
@@ -108,7 +108,7 @@ ...
From: Benjamin Herrenschmidt
Date: Friday, October 8, 2010 - 4:38 pm

I dislike the ifdef's ...

Also, why remove the union ? IE. Do we really them to co-exist for a
given device ? I'm doing something similar for another (not released
yet) processor where I'm flicking between direct and iommu at
set_dma_mask time, it's easy enough to change the union content.

Cheers,


--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

The iommu_table pointer in the pci auxiliary struct of device_node has
not been used by the iommu ops since the dma refactor of
12d04eef927bf61328af2c7cbe756c96f98ac3bf, however this code still uses
it to find tables for dlpar. By only setting the PCI_DN iommu_table
pointer on nodes with dma window properties, we will be able to quickly
find the node for later checks, and can remove the table without looking
for the the dma window property on dlpar remove.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 9184db3..8ab32da 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -455,9 +455,6 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
 		ppci->iommu_table = iommu_init_table(tbl, ppci->phb->node);
 		pr_debug("  created table: %p\n", ppci->iommu_table);
 	}
-
-	if (pdn != dn)
-		PCI_DN(dn)->iommu_table = ppci->iommu_table;
 }
 
 
@@ -571,8 +568,7 @@ static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long acti
 
 	switch (action) {
 	case PSERIES_RECONFIG_REMOVE:
-		if (pci && pci->iommu_table &&
-		    of_get_property(np, "ibm,dma-window", NULL))
+		if (pci && pci->iommu_table)
 			iommu_free_table(pci->iommu_table, np->full_name);
 		break;
 	default:
-- 
1.7.1

--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

The work done in pci_dma_bus_setup_pSeriesLP will be done in
pci_dma_dev_setup_pSeriesLP, and therefore we can remove the bus setup
function for lpar.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |   43 --------------------------------
 1 files changed, 0 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 9d564b9..d8bb9be 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -417,47 +417,6 @@ static void pci_dma_bus_setup_pSeries(struct pci_bus *bus)
 	pr_debug("ISA/IDE, window size is 0x%llx\n", pci->phb->dma_window_size);
 }
 
-
-static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
-{
-	struct iommu_table *tbl;
-	struct device_node *dn, *pdn;
-	struct pci_dn *ppci;
-	const void *dma_window = NULL;
-
-	dn = pci_bus_to_OF_node(bus);
-
-	pr_debug("pci_dma_bus_setup_pSeriesLP: setting up bus %s\n",
-		 dn->full_name);
-
-	/* Find nearest ibm,dma-window, walking up the device tree */
-	for (pdn = dn; pdn != NULL; pdn = pdn->parent) {
-		dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
-		if (dma_window != NULL)
-			break;
-	}
-
-	if (dma_window == NULL) {
-		pr_debug("  no ibm,dma-window property !\n");
-		return;
-	}
-
-	ppci = PCI_DN(pdn);
-
-	pr_debug("  parent is %s, iommu_table: 0x%p\n",
-		 pdn->full_name, ppci->iommu_table);
-
-	if (!ppci->iommu_table) {
-		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
-				   ppci->phb->node);
-		iommu_table_setparms_lpar(ppci->phb, pdn, tbl, dma_window,
-			bus->number);
-		ppci->iommu_table = iommu_init_table(tbl, ppci->phb->node);
-		pr_debug("  created table: %p\n", ppci->iommu_table);
-	}
-}
-
-
 static void pci_dma_dev_setup_pSeries(struct pci_dev *dev)
 {
 	struct device_node *dn;
@@ -547,7 +506,6 @@ static void pci_dma_dev_setup_pSeriesLP(struct ...
From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Also allow the coherent ops to be iommu if only the coherent mask is too
small, mostly for driver that do not set set the coherent mask but also
don't use the coherent api.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/include/asm/dma-mapping.h |    2 +
 arch/powerpc/kernel/Makefile           |    2 +-
 arch/powerpc/kernel/dma-choose64.c     |  167 ++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/kernel/dma-choose64.c

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 644103a..9ffb16a 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -68,6 +68,8 @@ static inline unsigned long device_to_mask(struct device *dev)
  */
 #ifdef CONFIG_PPC64
 extern struct dma_map_ops dma_iommu_ops;
+extern struct dma_map_ops dma_choose64_ops;
+extern struct dma_map_ops dma_iommu_coherent_ops;
 #endif
 extern struct dma_map_ops dma_direct_ops;
 
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1dda701..21b8ea1 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -82,7 +82,7 @@ obj-y				+= time.o prom.o traps.o setup-common.o \
 				   udbg.o misc.o io.o dma.o \
 				   misc_$(CONFIG_WORD_SIZE).o
 obj-$(CONFIG_PPC32)		+= entry_32.o setup_32.o
-obj-$(CONFIG_PPC64)		+= dma-iommu.o iommu.o
+obj-$(CONFIG_PPC64)		+= dma-iommu.o iommu.o dma-choose64.o
 obj-$(CONFIG_KGDB)		+= kgdb.o
 obj-$(CONFIG_PPC_OF_BOOT_TRAMPOLINE)	+= prom_init.o
 obj-$(CONFIG_MODULES)		+= ppc_ksyms.o
diff --git a/arch/powerpc/kernel/dma-choose64.c b/arch/powerpc/kernel/dma-choose64.c
new file mode 100644
index 0000000..17c716f
--- /dev/null
+++ b/arch/powerpc/kernel/dma-choose64.c
@@ -0,0 +1,167 @@
+/*
+ * Copyright (C) 2006 Benjamin Herrenschmidt, IBM Corporation
+ *
+ * Provide default implementations of the DMA mapping ...
From: Benjamin Herrenschmidt
Date: Friday, October 8, 2010 - 4:43 pm

You are doing the transition at map_sg time which is a hot path, I don't
like that. Also you add all those "choose" variants of the dma ops...
not very nice at all.

You may want to look at the patches I posted to the list a while back


--

From: Benjamin Herrenschmidt
Date: Friday, October 8, 2010 - 4:44 pm

You are doing the transition at map_sg time which is a hot path, I don't
like that. Also you add all those "choose" variants of the dma ops...
not very nice at all.

You may want to look at the patches I posted to the list a while back
for doing direct DMA on Bimini:

[PATCH 1/2] powerpc/dma: Add optional platform override of dma_set_mask()
[PATCH 2/2] powerpc/dart_iommu: Support for 64-bit iommu bypass window on PCIe

Cheers,



--

From: FUJITA Tomonori
Date: Sunday, October 10, 2010 - 8:09 am

On Sat, 09 Oct 2010 10:44:53 +1100


Would it be cleaner if each ppc dma_map_ops has the own set_dma_mask



--

From: Benjamin Herrenschmidt
Date: Sunday, October 10, 2010 - 4:41 pm

I'm not sure I parse what you wrote above :-)

I did try with various methods back then, and what ended up sucking the
less was basically to hookup dma_set_mask() at the arch level.

In fact, it makes sense to the extent that the arch is the one that
knows that there are multiple regions configured potentially with
different capabilities.

You can still do the switch within the dma_ops->set_dma_mask if you want
I suppose, especially if you end up hitting different attribute regions
within a single bus or such but from my experience, it gets really hacky

Cheers,
Ben.


--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |   98 ++++++++++++++++++++++++++++++++
 1 files changed, 98 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 8ec81df..451d2d1 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -270,6 +270,104 @@ static unsigned long tce_get_pSeriesLP(struct iommu_table *tbl, long tcenum)
 	return tce_ret;
 }
 
+/* this is compatable with cells for the device tree property */
+struct dynamic_dma_window_prop {
+	__be32	liobn;		/* tce table number */
+	__be32	dma_base[2];	/* address hi,lo */
+	__be32	tce_shift;	/* ilog2(tce_page_size) */
+	__be32	window_shift;	/* ilog2(tce_window_size) */
+};
+
+static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn,
+					unsigned long num_pfn, void *arg)
+{
+	struct dynamic_dma_window_prop *maprange = arg;
+	int rc;
+	u64 tce_size, num_tce, dma_offset;
+	u32 tce_shift;
+
+	tce_shift = be32_to_cpu(maprange->tce_shift);
+	tce_size = 1ULL << tce_shift;
+	num_tce = num_pfn << PAGE_SHIFT;
+	dma_offset = start_pfn << PAGE_SHIFT;
+
+	/* round back to the beginning of the tce page size */
+	num_tce += dma_offset & (tce_size - 1);
+	dma_offset &= ~(tce_size - 1);
+
+	/* covert to number of tces */
+	num_tce |= tce_size - 1;
+	num_tce >>= tce_shift;
+
+	rc = plpar_tce_stuff(maprange->liobn, dma_offset, 0, num_tce);
+
+	return rc;
+}
+
+static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn,
+					unsigned long num_pfn, void *arg)
+{
+	struct dynamic_dma_window_prop *maprange = arg;
+	u64 *tcep, tce_size, num_tce, dma_offset, next, proto_tce;
+	u32 tce_shift;
+	long rc = 0;
+	long l, limit;
+
+	local_irq_disable();	/* to protect tcep and the page behind it */
+	tcep = __get_cpu_var(tce_page);
+
+	if (!tcep) {
+		tcep = (u64 ...
From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

iommu_table_setparms_lpar needs either the phb or the subbusnumber
(not both), pass the phb to make it similar to iommu_table_setparms.

Note: In cases where a caller was passing bus->number previously to
iommu_table_setparms_lpar() rather than phb->bus->number, this can lead
to a different value in tbl->it_busno. The only example of this was the
removed pci_dma_dev_setup_pSeriesLP(), removed in "ppc/iommu: remove
unneeded pci_dma_dev_setup_pSeriesLP".

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index d8bb9be..8ec81df 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -323,14 +323,13 @@ static void iommu_table_setparms(struct pci_controller *phb,
 static void iommu_table_setparms_lpar(struct pci_controller *phb,
 				      struct device_node *dn,
 				      struct iommu_table *tbl,
-				      const void *dma_window,
-				      int bussubno)
+				      const void *dma_window)
 {
 	unsigned long offset, size;
 
-	tbl->it_busno  = bussubno;
 	of_parse_dma_window(dn, dma_window, &tbl->it_index, &offset, &size);
 
+	tbl->it_busno = phb->bus->number;
 	tbl->it_base   = 0;
 	tbl->it_blocksize  = 16;
 	tbl->it_type = TCE_PCI;
@@ -493,8 +492,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
 	if (!pci->iommu_table) {
 		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
 				   pci->phb->node);
-		iommu_table_setparms_lpar(pci->phb, pdn, tbl, dma_window,
-			pci->phb->bus->number);
+		iommu_table_setparms_lpar(pci->phb, pdn, tbl, dma_window);
 		pci->iommu_table = iommu_init_table(tbl, pci->phb->node);
 		pr_debug("  created table: %p\n", pci->iommu_table);
 	} else {
-- 
1.7.1

--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

If firmware allows us to map all of a partition's memory for DMA on a
particular bridge, create a 1:1 mapping of that memory. Add hooks for
dealing with hotplug events. Dyanmic DMA windows can use larger than the
default page size, and we use the largest one possible.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |  319 +++++++++++++++++++++++++++++++-
 1 files changed, 315 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 451d2d1..23ca0d1 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -33,6 +33,7 @@
 #include <linux/pci.h>
 #include <linux/dma-mapping.h>
 #include <linux/crash_dump.h>
+#include <linux/memory.h>
 #include <asm/io.h>
 #include <asm/prom.h>
 #include <asm/rtas.h>
@@ -45,6 +46,7 @@
 #include <asm/tce.h>
 #include <asm/ppc-pci.h>
 #include <asm/udbg.h>
+#include <asm/mmzone.h>
 
 #include "plpar_wrappers.h"
 
@@ -278,10 +280,19 @@ struct dynamic_dma_window_prop {
 	__be32	window_shift;	/* ilog2(tce_window_size) */
 };
 
+struct direct_window {
+	struct device_node *device;
+	const struct dynamic_dma_window_prop *prop;
+	struct list_head list;
+};
+static LIST_HEAD(direct_window_list);
+static DEFINE_SPINLOCK(direct_window_list_lock);
+#define DIRECT64_PROPNAME "linux,direct64-ddr-window-info"
+
 static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn,
-					unsigned long num_pfn, void *arg)
+					unsigned long num_pfn, const void *arg)
 {
-	struct dynamic_dma_window_prop *maprange = arg;
+	const struct dynamic_dma_window_prop *maprange = arg;
 	int rc;
 	u64 tce_size, num_tce, dma_offset;
 	u32 tce_shift;
@@ -305,9 +316,9 @@ static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn,
 }
 
 static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn,
-					unsigned long num_pfn, void ...
From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

The device tree root is never a pci bus, and will not have a
PCI_DN(pdn), so the check for PCI_DN added in
650f7b3b2f0ead0673e90452cf3dedde97c537ba makes the check for pdn->parent
redundant and it can be removed.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |    5 +----
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 8ab32da..0ae5a60 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -530,10 +530,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
 	}
 	pr_debug("  parent is %s\n", pdn->full_name);
 
-	/* Check for parent == NULL so we don't try to setup the empty EADS
-	 * slots on POWER4 machines.
-	 */
-	if (dma_window == NULL || pdn->parent == NULL) {
+	if (dma_window == NULL) {
 		pr_debug("  no dma window for device, linking to parent\n");
 		set_iommu_table_base(&dev->dev, PCI_DN(pdn)->iommu_table);
 		return;
-- 
1.7.1

--

From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

Add a function to get the maximum address that can be hotplug added.
This is needed to calculate the size of the tce table needed to cover
all memory in 1:1 mode.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
Comments on where to export?
---
 arch/powerpc/include/asm/mmzone.h |    5 +++++
 arch/powerpc/mm/numa.c            |   26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/mmzone.h b/arch/powerpc/include/asm/mmzone.h
index aac87cb..fd3fd58 100644
--- a/arch/powerpc/include/asm/mmzone.h
+++ b/arch/powerpc/include/asm/mmzone.h
@@ -33,6 +33,9 @@ extern int numa_cpu_lookup_table[];
 extern cpumask_var_t node_to_cpumask_map[];
 #ifdef CONFIG_MEMORY_HOTPLUG
 extern unsigned long max_pfn;
+u64 memory_hotplug_max(void);
+#else
+#define memory_hotplug_max() memblock_end_of_DRAM()
 #endif
 
 /*
@@ -42,6 +45,8 @@ extern unsigned long max_pfn;
 #define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
 #define node_end_pfn(nid)	(NODE_DATA(nid)->node_end_pfn)
 
+#else
+#define memory_hotplug_max() memblock_end_of_DRAM()
 #endif /* CONFIG_NEED_MULTIPLE_NODES */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 002878c..f98b0d2 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1246,4 +1246,30 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
 	return nid;
 }
 
+static u64 hot_add_drconf_memory_max(void)
+{
+        struct device_node *memory = NULL;
+        unsigned int drconf_cell_cnt = 0;
+        u64 lmb_size = 0;
+        const u32 *dm = 0;
+
+        memory = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+        if (memory) {
+                drconf_cell_cnt = of_get_drconf_memory(memory, &dm);
+                lmb_size = of_get_lmb_size(memory);
+                of_node_put(memory);
+        }
+        return lmb_size * drconf_cell_cnt;
+}
+
+/*
+ ...
From: Nishanth Aravamudan
Date: Friday, October 8, 2010 - 10:33 am

The block in pci_dma_dev_setup_pSeriesLP for dma_window == NULL can be
removed because we will only teminate the loop if we had already allocated
a iommu table for that node or we found a window.  While there may be
no window for the device, the intresting part is if we are reusing a
table or creating it for the first device under it.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 0ae5a60..9d564b9 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -530,12 +530,6 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
 	}
 	pr_debug("  parent is %s\n", pdn->full_name);
 
-	if (dma_window == NULL) {
-		pr_debug("  no dma window for device, linking to parent\n");
-		set_iommu_table_base(&dev->dev, PCI_DN(pdn)->iommu_table);
-		return;
-	}
-
 	pci = PCI_DN(pdn);
 	if (!pci->iommu_table) {
 		tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
-- 
1.7.1

--

Previous thread: Firmware versioning best practices: ath3k-2.fw rename or replace ath3k-1.fw ? by Luis R. Rodriguez on Friday, October 8, 2010 - 10:02 am. (14 messages)

Next thread: [BUGFIX] memcg CPU hotplug lockdep warning fix by Balbir Singh on Friday, October 8, 2010 - 10:49 am. (4 messages)