Re: [PATCH] x86, pci-hotplug, calgary / rio: fix EBDA ioremap()

From: Suresh Siddha
Date: Thursday, September 25, 2008 - 6:43 pm

[patch] ioremap sanity check to catch mapping requests exceeding the BAR sizes

Go through the iomem resource tree to check if any of the ioremap() requests
span more than any slot in the iomem resource tree and do a WARN_ON() if we hit
this check.

This will raise a red flag if some driver is mapping more than what
is needed, and will hopefully identify possible corruptions much earlier.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 7955a5a..c0d2c3e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -169,6 +169,12 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		return (__force void __iomem *)phys_to_virt(phys_addr);
 
 	/*
+	 * Check if the request spans more than any BAR in the iomem resource
+	 * tree.
+	 */
+	WARN_ON(iomem_map_sanity_check(phys_addr, size));
+
+	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
 	for (pfn = phys_addr >> PAGE_SHIFT;
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index ee9bcc6..e38b6aa 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -169,6 +169,7 @@ extern struct resource * __devm_request_region(struct device *dev,
 
 extern void __devm_release_region(struct device *dev, struct resource *parent,
 				  resource_size_t start, resource_size_t n);
+extern int iomem_map_sanity_check(resource_size_t addr, unsigned long size);
 
 #endif /* __ASSEMBLY__ */
 #endif	/* _LINUX_IOPORT_H */
diff --git a/kernel/resource.c b/kernel/resource.c
index fc59dcc..d582db3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -827,3 +827,36 @@ static int __init reserve_setup(char *str)
 }
 
 __setup("reserve=", reserve_setup);
+
+/*
+ * Check if the requested addr and size spans more than any slot in the
+ * iomem resource tree.
+ */
+int iomem_map_sanity_check(resource_size_t addr, unsigned long size)
+{
+	struct resource *p = &iomem_resource;
+	int err = ...
From: Ingo Molnar
Date: Friday, September 26, 2008 - 12:39 am

applied to tip/core/resources, thanks Suresh.


i think all the checks you added are precise to the byte and you allow 
all the sensible ioremaps: which nest fully inside a single resource - 
and you reject all the other partial overlap or multiple overlap 
scenarios.
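
The rule described above (accept a request that nests fully inside one resource, ignore a resource the request does not touch, flag everything else) can be sketched in user-space C. This is only an illustration: struct res_range and map_request_conflicts() are hypothetical names, not the code in the patch, whose body is truncated in this archive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the iomem resource tree. */
struct res_range {
	uint64_t start;	/* first byte of the resource */
	uint64_t end;	/* last byte of the resource (inclusive) */
};

/*
 * Return true when the request [addr, addr + size - 1] touches the
 * resource but is not fully contained in it, i.e. a partial overlap
 * that a sanity check of this kind would warn about.
 */
static bool map_request_conflicts(const struct res_range *r,
				  uint64_t addr, uint64_t size)
{
	uint64_t last;

	assert(size > 0);		/* an empty request has no last byte */
	last = addr + size - 1;

	if (last < r->start || addr > r->end)
		return false;		/* disjoint: nothing to check here */
	if (addr >= r->start && last <= r->end)
		return false;		/* nests fully inside: sensible ioremap */
	return true;			/* partial or spanning overlap: flag it */
}
```

A caller would walk every resource and warn as soon as one of them reports a conflict.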

One potential thing to check for would be whether addr+size crosses a 
4GB boundary? That would almost always be a bug, and it could also cause 
problems with the checks above if resource_size_t is 32 bits. The ioremap 
code should already prevent it though.
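
A rough sketch of what such a 4GB-boundary test could look like, and of why a 32-bit address type makes the end-address computation wrap. Both helpers are hypothetical and written for user space; nothing here is taken from the ioremap code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when [addr, addr + size - 1] crosses a 4GB boundary, or
 * when the end address would wrap around. size must be non-zero.
 */
static bool crosses_4g_boundary(uint64_t addr, uint64_t size)
{
	uint64_t last = addr + size - 1;

	if (last < addr)
		return true;			/* 64-bit wraparound */
	return (addr >> 32) != (last >> 32);	/* different 4GB windows */
}

/* With a 32-bit address type, crossing 4GB shows up as wraparound. */
static bool end_wraps_32(uint32_t addr, uint32_t size)
{
	uint32_t last = addr + size - 1;	/* computed modulo 2^32 */

	return last < addr;
}
```

In the 32-bit case the wrapped end address is smaller than the start, so any comparison of the form "end <= resource end" can pass by accident, which is the problem hinted at above.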

	Ingo
--

From: Yinghai Lu
Date: Friday, September 26, 2008 - 1:10 am

in that case, the BAR should already be disabled.

YH
--

From: Ingo Molnar
Date: Friday, September 26, 2008 - 1:12 am

needed two build fixes (and the printk type checking annoyances keep 
littering the tree):

From 13eb83754b40bf01dc84e52a08d4196d1b719a0e Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 26 Sep 2008 10:10:12 +0200
Subject: [PATCH] IO resources, x86: ioremap sanity check to catch mapping requests exceeding, fix

fix this build error:

 kernel/resource.c: In function 'iomem_map_sanity_check':
 kernel/resource.c:842: error: implicit declaration of function 'r_next'
 kernel/resource.c:842: warning: assignment makes pointer from integer without a cast

r_next() was only available if CONFIG_PROC_FS was enabled.

and fix this build warning:

 kernel/resource.c:855: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'resource_size_t'
 kernel/resource.c:855: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'long unsigned int'
 kernel/resource.c:855: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
 kernel/resource.c:855: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'resource_size_t'

resource_size_t can be 32 bits.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/resource.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 1d003a5..7797dae 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -38,10 +38,6 @@ EXPORT_SYMBOL(iomem_resource);
 
 static DEFINE_RWLOCK(resource_lock);
 
-#ifdef CONFIG_PROC_FS
-
-enum { MAX_IORES_LEVEL = 5 };
-
 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct resource *p = v;
@@ -53,6 +49,10 @@ static void *r_next(struct seq_file *m, void *v, loff_t *pos)
 	return p->sibling;
 }
 
+#ifdef CONFIG_PROC_FS
+
+enum { MAX_IORES_LEVEL = 5 };
+
 static void *r_start(struct seq_file *m, loff_t *pos)
 	__acquires(resource_lock)
 {
@@ -852,7 +852,11 ...
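
The format warnings quoted in the commit message above arise because '%llx' needs an unsigned long long argument while the resource type may be narrower. One common way to make such a format string portable is to cast every argument explicitly. The sketch below shows that approach in user space with snprintf(); res_size_t and format_conflict() are hypothetical, and the actual printk hunk is truncated in this archive, so this is not necessarily what the fix does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in: the kernel's resource type may be 32 or 64 bits. */
typedef uint32_t res_size_t;

/*
 * Format a conflict line in the style of the warning in this thread.
 * Every argument is cast to unsigned long long so that %llx is correct
 * whatever the width of res_size_t or unsigned long.
 */
static int format_conflict(char *buf, size_t len, res_size_t addr,
			   unsigned long size, res_size_t start,
			   res_size_t end, const char *name)
{
	return snprintf(buf, len,
			"resource map sanity check conflict: "
			"0x%llx 0x%llx 0x%llx 0x%llx %s",
			(unsigned long long)addr,
			(unsigned long long)(addr + size - 1),
			(unsigned long long)start,
			(unsigned long long)end, name);
}
```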
From: Ingo Molnar
Date: Friday, September 26, 2008 - 1:35 am

i started testing it in -tip, and it triggered on a quirky PCI hotplug 
driver:

calling  ibmphp_init+0x0/0x360 @ 1
ibmphpd: IBM Hot Plug PCI Controller Driver version: 0.6
resource map sanity check conflict: 0x9f800 0xaf5e7 0x9f800 0x9ffff reserved
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:175 __ioremap_caller+0x5c/0x226()
Pid: 1, comm: swapper Not tainted 2.6.27-rc7-tip-00914-g347b10f-dirty #36037
 [<c013a72d>] warn_on_slowpath+0x41/0x68
 [<c0156f00>] ? __lock_acquire+0x9ba/0xa7f
 [<c012158c>] ? do_flush_tlb_all+0x0/0x59
 [<c015ac31>] ? smp_call_function_mask+0x74/0x17d
 [<c012158c>] ? do_flush_tlb_all+0x0/0x59
 [<c013b228>] ? printk+0x1a/0x1c
 [<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
 [<c0a773e8>] ? _read_unlock+0x22/0x25
 [<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
 [<c0154e17>] ? trace_hardirqs_off+0xb/0xd
 [<c0127731>] __ioremap_caller+0x5c/0x226
 [<c0156158>] ? trace_hardirqs_on+0xb/0xd
 [<c012767d>] ? iounmap+0x9d/0xa5
 [<c01279dd>] ioremap_nocache+0x15/0x17
 [<c0403c42>] ? ioremap+0xd/0xf
 [<c0403c42>] ioremap+0xd/0xf
 [<c0f1928f>] ibmphp_access_ebda+0x60/0xa0e
 [<c0f17f64>] ibmphp_init+0xb5/0x360
 [<c0101057>] do_one_initcall+0x57/0x138
 [<c0f17eaf>] ? ibmphp_init+0x0/0x360
 [<c0156158>] ? trace_hardirqs_on+0xb/0xd
 [<c0148d75>] ? __queue_work+0x2b/0x30
 [<c0f17eaf>] ? ibmphp_init+0x0/0x360
 [<c0f015a0>] kernel_init+0x17b/0x1e2
 [<c0f01425>] ? kernel_init+0x0/0x1e2
 [<c01178b3>] kernel_thread_helper+0x7/0x10
 =======================
---[ end trace a7919e7f17c0a725 ]---
initcall ibmphp_init+0x0/0x360 returned -19 after 144 msecs
calling  zt5550_init+0x0/0x6a @ 1

mapping the EBDA is rather ... un-nice from that driver, so i guess your 
check does the right thing in flagging possible crap.

	Ingo
--

From: Ingo Molnar
Date: Friday, September 26, 2008 - 2:46 am

it does:

    addr:  0x9f800
     end:  0xaf5e7
p->start:  0x9f800
  p->end:  0x9ffff

resources are laid out like this:

  0009f800-0009ffff : reserved
  000a0000-000bffff : Video RAM area

so the driver over-maps into the Video RAM...
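
The numbers above can be reproduced with a few lines of C: a 65000-byte mapping starting at 0x9f800 ends at 0xaf5e7, which is past the end of the reserved range at 0x9ffff. The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Last byte covered by a mapping of `size` bytes starting at `addr`. */
static uint64_t map_last_byte(uint64_t addr, uint64_t size)
{
	assert(size > 0);
	return addr + size - 1;
}

/* Number of mapped bytes lying beyond `res_end`, the resource's last byte. */
static uint64_t bytes_past_resource(uint64_t addr, uint64_t size,
				    uint64_t res_end)
{
	uint64_t last = map_last_byte(addr, size);

	return last > res_end ? last - res_end : 0;
}
```

With addr = 0x9f800, size = 65000 and res_end = 0x9ffff, bytes_past_resource() yields 62952: only the first 2048 bytes of the mapping stay inside the reserved range, the rest lands in the Video RAM area.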

and drivers/pci/hotplug/ibmphp_ebda.c seems to be under the 
misunderstanding that the EBDA is up to 65000 bytes large:

        io_mem = ioremap (ebda_seg<<4, 65000);

in reality the EBDA is at most 4K on a normal PC. So i think the right 
fix is the patch below - crop the range to 4K.

_Maybe_ we could remap io_mem to a 64K window once we detected a RIO 
signature - but looking at the bogus 65000 number above i think it was 
just added in randomly as a "should be enough, doesn't cause problems" 
thing.
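
For reference, the address arithmetic behind that ioremap() call can be sketched as follows. ebda_seg << 4 is the usual real-mode segment-to-physical conversion; EBDA_MAP_MAX, ebda_phys_addr() and ebda_map_len() are hypothetical names, and the 4K limit is only the assumption stated above, not a value taken from the (truncated) patch.

```c
#include <assert.h>
#include <stdint.h>

#define EBDA_MAP_MAX 4096u	/* assumption: the 4K upper bound stated above */

/* Real-mode segment to physical address: segment * 16. */
static uint32_t ebda_phys_addr(uint16_t ebda_seg)
{
	return (uint32_t)ebda_seg << 4;
}

/* Crop a requested mapping length to the assumed EBDA maximum. */
static uint32_t ebda_map_len(uint32_t requested)
{
	return requested > EBDA_MAP_MAX ? EBDA_MAP_MAX : requested;
}
```

A segment of 0x9f80 gives the 0x9f800 seen in the log. Note that 0x9f800 + 4096 - 1 is 0xa07ff, still past 0x9ffff, so on this particular machine a fixed 4K length would not by itself nest inside the reserved range.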

	Ingo

---------------------->
From f14478b953f8c8b84c868ae68d04722165622cf5 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 26 Sep 2008 11:40:53 +0200
Subject: [PATCH] x86, pci-hotplug, calgary / rio: fix EBDA ioremap()

IO resource and ioremap debugging uncovered this ioremap() done
by drivers/pci/hotplug/ibmphp_ebda.c:

initcall pci_hotplug_init+0x0/0x41 returned 0 after 3 msecs
calling  ibmphp_init+0x0/0x360 @ 1
ibmphpd: IBM Hot Plug PCI Controller Driver version: 0.6
resource map sanity check conflict: 0x9f800 0xaf5e7 0x9f800 0x9ffff reserved
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:175 __ioremap_caller+0x5c/0x226()
Pid: 1, comm: swapper Not tainted 2.6.27-rc7-tip-00914-g347b10f-dirty #36038
 [<c013a72d>] warn_on_slowpath+0x41/0x68
 [<c0156f00>] ? __lock_acquire+0x9ba/0xa7f
 [<c012158c>] ? do_flush_tlb_all+0x0/0x59
 [<c015ac31>] ? smp_call_function_mask+0x74/0x17d
 [<c012158c>] ? do_flush_tlb_all+0x0/0x59
 [<c013b228>] ? printk+0x1a/0x1c
 [<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
 [<c0a773e8>] ? _read_unlock+0x22/0x25
 [<c013f302>] ? iomem_map_sanity_check+0x82/0x8c
 [<c0154e17>] ? trace_hardirqs_off+0xb/0xd
 [<c0127731>] __ioremap_caller+0x5c/0x226
 ...
From: Arjan van de Ven
Date: Friday, September 26, 2008 - 4:14 am

... and makes that uncachable (while it probably was WC before)

--

From: Ingo Molnar
Date: Saturday, September 27, 2008 - 9:35 am

[Empty message]
From: Jeremy Fitzhardinge
Date: Saturday, September 27, 2008 - 12:16 am

Any attempt to use ioremap on memory is a bug, so you should warn about

    J
--

From: Alan Cox
Date: Saturday, September 27, 2008 - 4:21 am

We use it to ioremap things like the BIOS...

--

From: Jeremy Fitzhardinge
Date: Saturday, September 27, 2008 - 7:43 am

Yeah, that's fine.  I mean using ioremap on system memory is a bug.

    J
--

From: Arjan van de Ven
Date: Saturday, September 27, 2008 - 8:09 am

From: Jeremy Fitzhardinge
Date: Saturday, September 27, 2008 - 9:17 am

OK, good.

    J

--

From: Alan Cox
Date: Saturday, September 27, 2008 - 9:25 am

On Sat, 27 Sep 2008 07:43:55 -0700

On system memory mapped by the kernel which I assume is what you mean -
the BIOS is often shadowed system memory and we have platforms where
memory pages not mapped into or managed by the OS are ioremap() targets.

That however is getting pedantic - I just didn't want anyone to overdo
the sanity checks

Alan
--

From: Arjan van de Ven
Date: Saturday, September 27, 2008 - 12:24 pm

the existing sanity check looks at whether the kernel considers the memory usable
for its own purposes, i.e. it uses the e820 table, the same one we use for feeding all
memory into the page allocator.

reserved and other similar types are not what ioremap complains about
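
A simplified model of that distinction, assuming a firmware memory map reduced to "usable RAM" and "reserved" entries. struct mem_entry and addr_is_usable_ram() are hypothetical and only mimic the idea; the kernel's own e820 handling is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified firmware memory map entry (e820-like). */
enum mem_type { MEM_RAM = 1, MEM_RESERVED = 2 };

struct mem_entry {
	uint64_t start;		/* first byte */
	uint64_t size;		/* length in bytes */
	enum mem_type type;
};

/*
 * Return true only when `addr` falls inside an entry the kernel treats
 * as usable RAM. Reserved ranges and holes return false, so a check
 * built on this would not complain about them.
 */
static bool addr_is_usable_ram(const struct mem_entry *map, size_t n,
			       uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (map[i].type != MEM_RAM)
			continue;
		if (addr >= map[i].start && addr - map[i].start < map[i].size)
			return true;
	}
	return false;
}
```

Under this model an ioremap() of the reserved EBDA range or of the BIOS area would pass the RAM check, while a request hitting ordinary RAM handed to the page allocator would not.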
--
