Re: [PATCH] pci: introduce an ioremap_pcibar(pdev, barnr) function

Previous thread: [PATCH] Tainted cleanup by Alexey Dobriyan on Friday, September 26, 2008 - 4:20 pm. (1 message)

Next thread: Use CPUID to communicate with the hypervisor. by Alok Kataria on Friday, September 26, 2008 - 4:46 pm. (49 messages)
From: Arjan van de Ven
Date: Friday, September 26, 2008 - 4:36 pm

From abb4ec6b5b7f22904668fb5de5fe0d7a3a7fd2d5 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Fri, 26 Sep 2008 16:34:52 -0700
Subject: [PATCH] pci: introduce an ioremap_pcibar(pdev, barnr) function

A common thing in many PCI drivers is to ioremap() an entire bar.
This is a slightly fragile thing right now, needing both an address and a size,
and many driver writers do.. various things there.

This patch introduces an ioremap_pcibar() function taking just a PCI device struct
and the bar number as arguments, and figures this all out itself, in one place.
In addition, we can add various sanity checks to this function (the patch already
checks to make sure that the bar in question really is a MEM bar; few to no drivers
do that sort of thing).

Hopefully with this type of API we get less chance of mistakes in drivers with
ioremap() operations.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 include/linux/pci.h |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index e48614b..65c1dbc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1122,5 +1122,18 @@ static inline void pci_mmcfg_early_init(void) { }
 static inline void pci_mmcfg_late_init(void) { }
 #endif
 
+static inline void * ioremap_pcibar(struct pci_dev *pdev, int bar)
+{
+	/*
+	 * Make sure the BAR is actually a memory resource, not an IO resource
+	 */
+	if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
+		WARN_ON(1);
+		return NULL;
+	}
+	return ioremap_nocache(pci_resource_start(pdev, bar),
+				     pci_resource_len(pdev, bar));
+}
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */
-- 
1.5.5.1



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Arjan van de Ven
Date: Friday, September 26, 2008 - 4:37 pm

From b94333809306543604f8bf23da6dddca3efac451 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Fri, 26 Sep 2008 16:36:00 -0700
Subject: [PATCH] pci: introduce users of ioremap_pcibar()

one big patch for now; I'll split this up after the first review round to avoid
to have to redo that many times
---
 drivers/block/sx8.c                        |    3 +--
 drivers/edac/i82875p_edac.c                |    4 +---
 drivers/i2c/busses/i2c-hydra.c             |    2 +-
 drivers/ide/pci/sgiioc4.c                  |    2 +-
 drivers/media/common/saa7146_core.c        |    3 +--
 drivers/media/video/cx23885/cx23885-core.c |    3 +--
 drivers/media/video/cx88/cx88-cards.c      |    3 +--
 drivers/mfd/sm501.c                        |    3 +--
 drivers/misc/ibmasm/module.c               |    3 +--
 drivers/misc/tifm_7xx1.c                   |    3 +--
 drivers/mmc/host/sdhci-pci.c               |    2 +-
 drivers/mtd/maps/pci.c                     |    3 +--
 drivers/net/bnx2x_main.c                   |    3 +--
 drivers/net/e1000/e1000_main.c             |    7 ++-----
 drivers/net/epic100.c                      |    2 +-
 drivers/net/ixgb/ixgb_main.c               |    3 +--
 drivers/net/qla3xxx.c                      |    4 +---
 drivers/net/s2io.c                         |    6 ++----
 drivers/net/wan/dscc4.c                    |    3 +--
 drivers/net/wan/pc300too.c                 |    2 +-
 drivers/net/wan/pci200syn.c                |    2 +-
 drivers/net/wireless/hostap/hostap_pci.c   |    2 +-
 drivers/net/wireless/ipw2200.c             |    2 +-
 drivers/net/wireless/rt2x00/rt2x00pci.c    |    3 +--
 drivers/pci/hotplug/cpcihp_zt5550.c        |    3 +--
 drivers/pci/hotplug/cpqphp_core.c          |    3 +--
 drivers/scsi/advansys.c                    |    3 +--
 drivers/scsi/arcmsr/arcmsr_hba.c           |    8 +++-----
 drivers/scsi/ipr.c                         |    2 +-
 drivers/scsi/nsp32.c                       |    3 +--
 ...
From: Grant Grundler
Date: Monday, September 29, 2008 - 12:26 am

*nod*

FTR, I like this interface better since most drivers use ioremap() instead

Should the comment be deleted?
It made sense (sort of) for the orginal code and maybe including
a comment like it in the new ioremap_pcibar() would be good.

Alternatively, the ioremap_pcibar() code needs to check for

Is there any easy way to tell if the device driver should be using
uncached mappings vs cacheable mappings?
(Just from looking at the source code)

This patch changes that behavior of the device driver so it uses
uncacheable instead of cacheable mappings.  This is the only thing
I'm uncertain about for this patch.

For storage/networking/audio cards I'm comfortable with the generalization
that they all should use uncacheable mappings (I'm sure there are some
exceptions.) I'm not with video devices.

I'm pretty sure this patch won't break anything. Good chance it will
fix some existing drivers on platforms that really do map the BAR cacheable.
But it might hurt performance on a few devices that really intended
to use cacheable mappings for stuff like write coalescing.


And I have a second issue less important issue.
What is the result of ioremap_pcibar(pci, 1) when BAR0 is a 64-bit bar?
Given the name, I expect to call "ioremap_pcibar(pci,2)" to get the
desired result.  Maybe just document how to handle this correctly
in Documentation/pci.txt would be sufficient.

hth,
--

From: Alan Cox
Date: Monday, September 29, 2008 - 2:20 am

And pci_iomap is cleaner still. Plus pcim_iomap does resource tracking so


There are exceptions - I2O for example and there are other cards that use
write merging when possible beyond video. Also btw vidoe depends on the
chip - if you cache/write merge the framebuffer on a Voodoo 1/2 card you
must be in 24/32bit video modes for example.

Generally though PCI = controlled by hardware = uncached

Alan
--

From: Arjan van de Ven
Date: Monday, September 29, 2008 - 6:42 am

On Mon, 29 Sep 2008 01:26:43 -0600



we should detect this and DTRT inside the implementation, not in the
drivers.

-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Grant Grundler
Date: Monday, September 29, 2008 - 10:10 am

On Mon, Sep 29, 2008 at 06:42:20AM -0700, Arjan van de Ven wrote:

Ok...it was using cacheable mapping on ia64 until this commit in 2007:
    http://www.gelato.unsw.edu.au/archives/linux-ia64/0703/20211.html

I stopped paying close attention  on ia64 in 2006 for the most part.
It's always been uncacheable on parisc.

After finding willy's patch (March 2006) on lwn.net, I remember the
discussion around changing the behavior of ioremap() to be uncached:
    http://lwn.net/Articles/178084/

And I have to agree with willy/alan, pci_iomap() is already doing this.

However, pci_iomap() isn't quite right either:
        if (flags & IORESOURCE_MEM) {
                if (flags & IORESOURCE_CACHEABLE)
                        return ioremap(start, len);
                return ioremap_nocache(start, len);
        }

I expect it needs to use ioremap_cache() instead of ioremap().
One line patch below fixes that. Build-tested on x86 only.


pci_iomap() is already doing this. See lib/iomap.c:pci_iomap().

thanks,
grant


diff --git a/lib/iomap.c b/lib/iomap.c
index d322293..5565cf9 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -267,7 +267,7 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
 		return ioport_map(start, len);
 	if (flags & IORESOURCE_MEM) {
 		if (flags & IORESOURCE_CACHEABLE)
-			return ioremap(start, len);
+			return ioremap_cache(start, len);
 		return ioremap_nocache(start, len);
 	}
 	/* What? */
--

From: Arjan van de Ven
Date: Monday, September 29, 2008 - 10:23 am

On Mon, 29 Sep 2008 11:10:49 -0600

pci_iomap() does "stuff" but it assumes you're using the iomap APIs
across the driver. MANY don't.

And pci_iomap() takes more parameters than most driver writers want or
need. Most of the time it's "I want the whole bar"; even if my patch
wraps around that, making the API simpler is still worth it imo
--

From: Grant Grundler
Date: Tuesday, September 30, 2008 - 10:24 pm

pci_iomap() returns a "void __iomem *".
readl/writel take "void __iomem *" as an argument.
See build_mmio_read() in include/asm-x86/io.h

I think the assumption is the other way around: use of ioread/iowrite
assumes use of io_remap(). pci_iomap is the PCI wrapper around io_remap().
You just want a simpler wrapper (and I agree, it really could without
the extra arg).

But in any case, we can document pci_iomap() to be whatever you think
we should be exporting.  pci_iomap() is not currently documented in

You are right about that.
Would calling the API "pci_iomap_bar()" to keep the naming consistent help
make it more acceptable?

(And adding documentation for both would be good too...I can do
that if the new API gets accepted.)

hth,
grant
--

From: Arjan van de Ven
Date: Tuesday, September 30, 2008 - 3:30 pm

On Tue, 30 Sep 2008 23:24:50 -0600

I'm fine with pci_iomap_bar()... it meets my goals
Would be nice if I'd be allowed to make it only work on MEM bars not IO
bars (so that drivers don't accidentally end up calling this on an IO
bar and then using readl() etc)



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Rolf Eike Beer
Date: Wednesday, October 1, 2008 - 3:33 am

IIRC pci_iomap() was documented to work on both at it does "the right thing=
"=20
automatically.

Eike
From: Kyle McMartin
Date: Wednesday, October 1, 2008 - 5:42 am

If they use the iomap interface they shouldn't be using readl at all,
they should be using ioread*... It would be a bug otherwise.

cheers, Kyle
--

From: Matthew Wilcox
Date: Wednesday, October 1, 2008 - 5:57 am

That's a viewpoint I've heard several people espouse over the last few
days, but it's not (entirely) true.  Addresses returned from calling
iomap() on a memory location must be compatible with addresses returned
from calling ioremap(), so you can use readl() on an iomap address, as
long as you know that it was a memory address that was iomapped.

        if (flags & IORESOURCE_MEM) {
                if (flags & IORESOURCE_CACHEABLE)
                        return ioremap(start, len);
                return ioremap_nocache(start, len);
        }

OK, not all architectures use the generic code, but I've been through
and they all do more or less the above (mn10300 and frv just return the
address, but their readl() and inl() are identical)

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Kyle McMartin
Date: Wednesday, October 1, 2008 - 6:07 am

I don't recall anyone ever promising that the iomap interfaces would be
usable with legacy accessors. I'd certainly prefer it if we didn't, as
well, as it makes for more explicitly written drivers...

Just because you can use them, doesn't mean you should.

regards, Kyle
--

From: Arjan van de Ven
Date: Wednesday, October 1, 2008 - 6:53 am

On Wed, 1 Oct 2008 09:07:17 -0400

ok so now we're full circle.
I started with a real ioremap, was told "no must use iomap" despite the
same argument you make, and now we're back to square one.

What I want is an interface that can replace ioremap() for the common
"I want the bar uncached" case. Nothing more nothing less.... sigh.


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Matthew Wilcox
Date: Friday, September 26, 2008 - 7:56 pm

So we already have a pci_iomap() which takes a 'max' argument.  If you
make 'max' -1, don't you get this same behaviour?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Arjan van de Ven
Date: Saturday, September 27, 2008 - 8:35 am

On Fri, 26 Sep 2008 20:56:14 -0600

there are many ways to map a bar... I'm just arguing for providing a
really simple, hard-to-get-wrong, one which only takes a device and a
bar number.
The goal of my patch isn't to introduce new functionality per se
(although having the ability to add checks is nice and welcome), it's
mostly to provide a simple, hard-to-get-wrong interface.
Less chance to get it wrong -> less hard to diagnose bugs.



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

Previous thread: [PATCH] Tainted cleanup by Alexey Dobriyan on Friday, September 26, 2008 - 4:20 pm. (1 message)

Next thread: Use CPUID to communicate with the hypervisor. by Alok Kataria on Friday, September 26, 2008 - 4:46 pm. (49 messages)