Hi,
Noticed this problem a bit late :-(
6fee48cd330c68332f9712bc968d934a1a84a32a broke
pci_set_consistent_dma_mask() on IXP4xx and most probably PXA. Affected
devices are e.g. IDE controller (CS5536-based: disk inaccessible) and
e1000 ethernet ("Detected Tx Unit Hang").
The attached patch makes it work again, though I'm not sure it's the
best solution.
Ideas?
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index ca32ed7..bd2a7d3 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -129,6 +129,14 @@ static inline u64 dma_get_mask(struct device *dev)
static inline int dma_set_coherent_mask(struct device *dev, u64 mask)
{
+#ifdef CONFIG_DMABOUNCE
+ if (dev->archdata.dmabounce) {
+ if (mask >= ISA_DMA_THRESHOLD)
+ return 0;
+ else
+ return -EIO;
+ }
+#endif
if (!dma_supported(dev, mask))
return -EIO;
dev->coherent_dma_mask = mask;
--
On Tue, 10 Aug 2010 22:36:21 +0200 I think that we should avoid adding "#ifdef CONFIG_DMABOUNCE" to a generic place. --
It doesn't break dmabounce. What it breaks is the fact that a PCI device which can do 32-bit DMA is connected to a PCI bus which can only access the first 64MB of memory through the host bridge, but the system has more than 64MB available. Allowing a 32-bit DMA mask means that dmabounce can't detect that memory above 64MB needs to be bounced to memory below the 64MB boundary. --
On Wed, 11 Aug 2010 08:25:32 +0100 But dmabounce doesn't look at dev->coherent_dma_mask. The change breaks __dma_alloc_buffer()? If we set dev->coherent_dma_mask to DMA_BIT_MASK(32) for ixp4xx's pci devices, __dma_alloc_buffer() doesn't use GFP_DMA. --
With an incorrect coherent_dma_mask, dma_alloc_coherent() will return memory outside of the 64MB window. This means that when dmabounce comes to allocate the replacement buffer, it gets a buffer which won't be accessible to the DMA controller - which is a condition it doesn't check for. Ergo, it breaks dmabounce. --
On Fri, 13 Aug 2010 22:54:13 +0100 Really? looks like dmabounce does nothing for coherent memory that dma_alloc_coherent() allocates. The following very hacky patch works? Or we could introduce something like ARCH_HAS_DMA_SET_COHERENT_MASK to let architectures to have dma_set_coherent_mask. A long solution would be having two dma_mask for a device and a bus. We also need something to represent a DMA-capable range instead of the dma mask. diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index c704eed..2a3fc2e 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -77,6 +77,11 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf if (mask < 0xffffffffULL) gfp |= GFP_DMA; +#ifdef CONFIG_DMABOUNCE + if (dev->archdata.dmabounce) + gfp |= GFP_DMA; +#endif + page = alloc_pages(gfp, order); if (!page) return NULL; --
So what happens if you use a driver which uses dma_alloc_coherent() directly? Should the driver really be passed memory which is inaccessible to the device because its outside the host bridge PCI window? No, this is clearly wrong, so this patch doesn't fix anything. It's a bodge, nothing more. The real solution is to have _both_ masks both reduced down according to the host bridge, as we used to do. So I suggest 6fee48c is reverted so that these platforms don't regress for -rc1. As I said when you sent the originally patch, it _looked_ okay, but I don't have any way to test it. It seems from testing (which unfortunately only seems to only happen after patches hit mainline) that it is not okay after all. --
On Sat, 14 Aug 2010 19:46:05 +0100 I'm not sure what you mean. A driver which uses dma_alloc_coherent() directly should work. dma_alloc_coherent() allocates memory with GFP_DMA with that patch for dmabounce devices. So the driver gets the access-able memory. The memory that dma_alloc_coherent() returns should be always consistent. We can't bounce it. All we can do is returning a memory that a device (and its bus) can access to. Sorry, the original ARM code is wrong. dev->dma_mask and dev->coherent_dma_mask represent the driver of dma restriction. ARM tries to use them for both a driver and a bus so ARM needs a workaround that doesn't set the driver of dma restriction to dev->dma_mask and dev->coherent_dma_mask. The proper solution is having a separate dma mask for a bus. I think that POWERPC already does the similar. It has max_direct_dma_addr in struct dev_archdata, which represents the dma restriction of a bus. --
Why bother when we both agree that the patch is a dirty hack? Come up with something cleaner first. --
On Sun, 15 Aug 2010 09:23:28 +0100 Because this fix needs to go to stable trees too. A simple patch is better even if it's hacky. For example, we can unify dma_needs_bounce functions in arm with a clean solution, I think. But dma_needs_bounce() was changed after 2.6.35 so it would be difficult to backport a clean solution. btw, will we have more like this case? If so, I think that it's worth having a generic solution for this case instead of having the arch (arm and powerpc) specific solution. --
Hi, This patch fixes the problem on my IXP425. -- Krzysztof Halasa --
On Tue, 17 Aug 2010 01:29:49 +0200 Thanks a lot! I'll re-send the patch in the proper format. Can you send it to mainline for 2.6.36? I'll work on the proper solution for this issue for 2.6.37. --
Guess that's a question for Russell. -- Krzysztof Halasa --
I'd rather have the arch (aka the bus) be able to filter the mask, better than having to deal with multiple masks in the generic code. Besides, in embedded-land, you never know how many busses are stacked before you reach the device, ie, you'd end up having to AND quite a few masks before getting there in some cases. Sounds better to establish that once, at set_coherent_dma_mask() time. Cheers, Ben. --
On Thu, 19 Aug 2010 20:31:22 +1000 You mean that you like to permit architectures to modify dev->coherent_dma_mask behind a device? If so, I'm against it because it means dev->coherent_dma_mask has two meanings. That's confusing. I don't plan to have the generic code to deal with multiple masks. I thought about simply moving max_direct_dma_addr in POWERPC's dev_archdata to a generic place (possibly, struct device_dma_parameters). I think that having the generic place for bus' dma mask would be better rather than architecture specific As long as dev->coherent_dma_mask represents the same thing on every architecture, permitting architectures to have the own dma_set_coherent_mask() is fine by me. I like to avoid it if possible though. --
Well, I think it may be the only really correct solution, and in fact it's arch-independent. The coherent_dma_mask would mean one thing: address space shared between the CPU(s) and the device. This usually equals device's address space - only because CPU and bridges next to it have wide (logical) address busses. It's not always the case, though, and may be not the case on any arch. We should make sure we got it right (including drivers), since any reduction of the dma*mask would be irreversible (new masks would be Definitely, if possible. BTW the dmabounce (and equivalent code on other archs, including probably swiotlb on x86-64) could probably be merged as well. I don't know the internals very well, though. At least it may be Not sure. Which bus? There could be many :-) In practice - 64-bit PCIe -> 32-bit PCI -> 24-bit ISA - etc. That would be ideal. Buses work on all archs the same after all. -- Krzysztof Halasa --
On Thu, 19 Aug 2010 18:53:38 +0200 linux/device.h defines that it's the device dma mask. Except for ARM, coherent_dma_mask represents the device dma mask. We sometimes want to know the device dma mask. Moving to your btw, swiotlb is used by x86_64, ia64, and powerpc. I'm sure that I can convert ARM to use it instead of dmabounce. But well, even a bug fix patch for dmabounce haven't been merged so I'm not sure that ARM people would accept such change. Seems that we are not on the same page. As I said before, have you seen how POWERPC uses max_direct_dma_addr in dev_archdata struct? I was talking about moving it to the generic place and the API to set it. --
That's just verbiage in a header file. Nobody cares. The point is, the device DMA mask per-se is a completely useless piece of information, and thus there is absolutely no point keeping it around there. The only thing that matters is the intersection of all the masks on the way to memory, which defines where memory can be allocated. Now the mask thing itself might end up not being enough for ARM in the long run, I wouldn't be surprised if we end up with busses that can DMA to areas of memory that are not 0 based, but that's a discussion for When ? Cheers, --
No it's not. It has one and only one meaning which is the mask defining where the coherent memory can come from for that device. Nobody cares if the device can do more and has been "clipped" at set_coherent_dma_mask() time by the bus. This is not useful information. So I beleive the arch should hook the later and modify the mask as it Well, max_direct_dma_addr used on powerpc is a bit nasty because it doesn't necessarily represent a power of 2, so a mask is no good here Ben. --
On Fri, 20 Aug 2010 07:51:52 +1000 Ok. I've seen that someone submitted a patch to print the dma_mask under sysfs, I supposed, for debugging to check if a driver misuses the DMA API or the bus can't do 64bit DMA when the device can support 64bit DMA but only gets a buffer under 32bit. But yeah, we can live withtout it. Lots of drivers call dma_set_coherent_mask with 64bit mask and then call it with 32bit mask if 64bit mask fails. So we don't have driver's OK. like dma_set_mask(), we could make every architecutre have the own dma_set_coherent_mask(). But looks like only ARM needs own dma_set_coherent_mask() (at least now), so adding ARCH_HAS_DMA_SET_COHERENT_MASK define might be better. I don't like the asymmetry of dma_set_mask and dma_set_coherent_mask much though. --
On Thu, 26 Aug 2010 20:55:09 +0900
btw, I'm still not sure, letting architectures to clip the dma mask
(and coherent mask) behind a driver is correct by defintion of the DMA
API (it's not a real problem).
DMA-API.txt defines dma_set_mask is "checks to see if the mask is
possible and update the device parameters if it is". It means that
architectures can't clip the mask behind a driver, I think.
Lots of drivers do something like:
if (dma_set_mask(dev, DMA_BIT_MASK(64)))
if (dma_set_mask(dev, DMA_BIT_MASK(32)))
What arm does is accepting whatever dma mask and setting the clipped
mask behind a driver. If we use this sementics (archs are free to clip
the mask), drivers don't need the second dma_set_mask call. And the
driver wrongly assumes that it successfully set 64bit dma mask (and
possibly set the hardware to 64bit dma mode needlessly).
--
Ok, in that case lets disable all PCI drivers which do this on IXP4xx then, because they obviously can't cope with the 64MB window that this platform has. Clearly they need to be rewritten such that they can cope with this, irrespective of the fact that they've worked for ages with the current solution. --
On Thu, 26 Aug 2010 18:57:19 +0100 I didn't insist such (I wrote, "it's not a real proble"). As I wrote in another mail, we could make every architecutre have the own dma_set_coherent_mask(). It's fine by me. I simply wanted to know your opinions: Looks like that the DMA API doesn't expect architectures to clip the mask. It might be better to add the new API to set the dma mask in the long term? (i.e. calling dma_set_mask twice is the best interface?) It's better to forget it since it's only about one architecture (too much change provides little benefit). --
Which seems strange to me. If the driver asks for 64-bit mask and the system can only give it 32-bits, why return an error? Every 32-bit address is also 64-bit, with the most significant bits simply cleared. It makes sense the other way around, if the device wants e.g. 24-bit mask (ISA or something) but the OS doesn't have a memory pool smaller than e.g. 4 GB then returning with error is ok. Same with IXP4xx (and perhaps PXA) - the device wants 32 bits and it's fine, even if the mask is set to 28 bits since the CPU can't let PCI access more. If the device wants to know if it should issue DACs or something like that, sure - just check the current masks (effective). -- Krzysztof Halasa --
On Thu, 26 Aug 2010 18:02:46 +0200 As I wrote, the DMA API simply wasn't designed in that way (let architectures to clip the mask), I guess. If it was, dma_set_coherent_mask might return the actual mask. --
