53c700 is the only user of dma_is_consistent():

BUG_ON(!dma_is_consistent(hostdata->dev, pScript) && L1_CACHE_BYTES < dma_get_cache_alignment());

The above code tries to see if the system can allocate coherent memory
or not. It's for some old systems that can't allocate coherent memory
at all (e.g. some parisc systems).

I think that we can safely remove the above usage:

- such old systems haven't triggered the above check for a long time.

- the above condition is important for systems that can't allocate
  coherent memory if these systems do DMA. So it would probably be
  better to have such a check in the arch's DMA initialization code
  instead of in a driver.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/scsi/53c700.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 80dc3ac..89fc1c8 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -309,9 +309,6 @@ NCR_700_detect(struct scsi_host_template *tpnt,
 	hostdata->msgin = memory + MSGIN_OFFSET;
 	hostdata->msgout = memory + MSGOUT_OFFSET;
 	hostdata->status = memory + STATUS_OFFSET;
-	/* all of these offsets are L1_CACHE_BYTES separated. It is fatal
-	 * if this isn't sufficient separation to avoid dma flushing issues */
-	BUG_ON(!dma_is_consistent(hostdata->dev, pScript) && L1_CACHE_BYTES < dma_get_cache_alignment());
 	hostdata->slots = (struct NCR_700_command_slot *)(memory + SLOTS_OFFSET);
 	hostdata->dev = dev;

-- 
1.6.5
--
The definition of dma_is_consistent() differs between architectures,
so it hasn't been very useful for drivers (we have only one user of the
API in tree). Even if we fix dma_is_consistent() in some architectures,
it doesn't look useful at all. It was invented long ago for some old
systems that can't allocate coherent memory at all. It's better to
export only APIs that are definitely necessary for drivers.

Let's remove this API.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 Documentation/DMA-API.txt                 |    6 ------
 arch/alpha/include/asm/dma-mapping.h      |    1 -
 arch/arm/include/asm/dma-mapping.h        |    5 -----
 arch/avr32/include/asm/dma-mapping.h      |    5 -----
 arch/blackfin/include/asm/dma-mapping.h   |    1 -
 arch/cris/include/asm/dma-mapping.h       |    2 --
 arch/frv/include/asm/dma-mapping.h        |    2 --
 arch/ia64/include/asm/dma-mapping.h       |    2 --
 arch/m68k/include/asm/dma-mapping.h       |    5 -----
 arch/microblaze/include/asm/dma-mapping.h |    1 -
 arch/mips/include/asm/dma-mapping.h       |    2 --
 arch/mips/mm/dma-default.c                |    7 -------
 arch/mn10300/include/asm/dma-mapping.h    |    2 --
 arch/parisc/include/asm/dma-mapping.h     |    6 ------
 arch/powerpc/include/asm/dma-mapping.h    |    5 -----
 arch/sh/include/asm/dma-mapping.h         |    6 ------
 arch/sparc/include/asm/dma-mapping.h      |    1 -
 arch/um/include/asm/dma-mapping.h         |    1 -
 arch/x86/include/asm/dma-mapping.h        |    1 -
 arch/xtensa/include/asm/dma-mapping.h     |    2 --
 include/asm-generic/dma-mapping-broken.h  |    3 ---
 21 files changed, 0 insertions(+), 66 deletions(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 05e2ae2..fe23269 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -456,12 +456,6 @@ be identical to those passed in (and returned by
 dma_alloc_noncoherent()).

 int
-dma_is_consistent(struct device *dev, dma_addr_t ...
Actually, that's not the right explanation. The BUG_ON is there because
of an efficiency in the driver ... it's nothing to do with the
architecture. The driver uses a set of mailboxes, but for efficiency's
sake, it packs them into a single coherent area and separates the
different usages by an L1 cache stride. On architectures capable of
manufacturing coherent memory, this is a nice speed up in the DMA
infrastructure. However, for incoherent architectures, it's fatal if
the dma coherence stride is greater than the L1 cache line size,
because now we'll get data corruption.

Well, we can't check in the architecture because it's a driver specific
thing ... I suppose making it a rule that dma_get_cache_alignment()
*must* be <= L1_CACHE_BYTES fixes it ... we seem to have no architecture
violating that, so just add it to the documentation, and the check can
go.

James
--
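To make the failure condition concrete, here is a minimal userspace
sketch of the predicate the BUG_ON encodes. The helper names
packing_is_unsafe() and check_packing() are hypothetical and do not
exist in the driver; assert() merely stands in for BUG_ON here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of 53c700: true when mailboxes packed
 * `separation` bytes apart could share a DMA coherence unit on a
 * non-coherent platform, i.e. the case where the driver's BUG_ON fires.
 */
static bool packing_is_unsafe(bool coherent, size_t separation,
			      size_t dma_alignment)
{
	return !coherent && separation < dma_alignment;
}

/* Userspace stand-in for the BUG_ON: abort if the layout is unsafe. */
static void check_packing(bool coherent, size_t separation,
			  size_t dma_alignment)
{
	assert(!packing_is_unsafe(coherent, separation, dma_alignment));
}
```

With a 32-byte separation and a 128-byte DMA alignment, the predicate is
true only when the memory is not coherent, which matches the case
described as fatal above.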
On Sun, 27 Jun 2010 10:08:48 -0500
Sorry, I should have looked at the details of the driver.
You are talking about the following tricks, right?
#define MSG_ARRAY_SIZE 8
#define MSGOUT_OFFSET (L1_CACHE_ALIGN(sizeof(SCRIPT)))
__u8 *msgout;
#define MSGIN_OFFSET (MSGOUT_OFFSET + L1_CACHE_ALIGN(MSG_ARRAY_SIZE))
__u8 *msgin;
#define STATUS_OFFSET (MSGIN_OFFSET + L1_CACHE_ALIGN(MSG_ARRAY_SIZE))
__u8 *status;
#define SLOTS_OFFSET (STATUS_OFFSET + L1_CACHE_ALIGN(MSG_ARRAY_SIZE))
struct NCR_700_command_slot *slots;
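
As an illustration of how these offsets stack up, the following
userspace sketch recomputes them for a given stride. It is only a sketch
under stated assumptions: align_up(), struct mailbox_layout and
compute_layout() are hypothetical names, and the script size and stride
are passed in rather than taken from sizeof(SCRIPT) and L1_CACHE_BYTES.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_ARRAY_SIZE 8

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical container for the four offsets used by the driver. */
struct mailbox_layout {
	size_t msgout;
	size_t msgin;
	size_t status;
	size_t slots;
};

/*
 * Mirror the *_OFFSET macros: each region starts on a `stride`
 * boundary directly after the previous one.
 */
static struct mailbox_layout compute_layout(size_t script_bytes,
					    size_t stride)
{
	struct mailbox_layout l;

	l.msgout = align_up(script_bytes, stride);
	l.msgin = l.msgout + align_up(MSG_ARRAY_SIZE, stride);
	l.status = l.msgin + align_up(MSG_ARRAY_SIZE, stride);
	l.slots = l.status + align_up(MSG_ARRAY_SIZE, stride);
	return l;
}
```

For example, an assumed 100-byte script with a 32-byte stride places
each 8-byte mailbox in its own 32-byte slot, so consecutive mailboxes
are exactly one stride apart.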
Seems that on some architectures (arm and mips at least),
dma_get_cache_alignment() could be greater than L1_CACHE_BYTES. But they
simply return the maximum possible cache line size, like:
static inline int dma_get_cache_alignment(void)
{
	/* XXX Largest on any MIPS */
	return 128;
}
So practically, we should be safe. I guess that we can simply convert
them to return L1_CACHE_BYTES.
Are some PARISC and mips systems the only fully non-coherent
architectures that we support now? Can we remove the above check if
dma_get_cache_alignment() is <= L1_CACHE_BYTES on PARISC and mips?
--
Yes, that's it. The mailboxes themselves are pretty small, and the
minimum coherent allocation is usually a page, so we'd waste orders of
magnitude more coherent memory than we actually need without this
trick.

As long as that's architecturally true, yes. I mean I can't imagine any
architecture that had a dma alignment requirement that was greater than
its L1 cache width ... but I've been surprised before for making
"Obviously this can't happen ..." type statements where MIPS is

I think there might be some ARM SoC systems as well ... there were some
strange ones that had tight limits on the addresses the other SoC
components could DMA to which made it very difficult to make consistent

I don't think we need to check, just document that
dma_get_cache_alignment cannot be greater than the L1 cache stride.

James
--
We had similar cases on SH where even though we can generally provide
consistent memory, it may not be visible or usable by certain
peripherals. On some of the earlier CPUs when the on-chip bus was being
overhauled there was an on-chip DMAC and a PCI DMAC on different
interconnects with their own addressing limitations. PCI DMA needed
buffers to be allocated from PCI space and would simply generate address
errors for anything on any of the other interconnects. On those systems
we could provide consistent memory for other PCI devices if and only if
we happened to have a PCI video card or something else with spare device
memory on the bus inserted -- which in turn would not be visible on any
other interconnects. In those cases it worked out that the DMA alignment
for PCI memory and L1 line size were the same, but that was really more
by coincidence than design.

There are still similar cases remaining. Most SH CPUs have a snoop
controller for snooping PCI <-> external memory transactions, but most
CPUs do not enable it on account of not being able to have the CPU enter
idle states while the controller is active. It's only been with the SMP
parts that a generic snoop controller has been provided that has general

Looking at the MIPS stuff, it also seems like there are cases where
L1_CACHE_BYTES == 32 while the kmalloc minalign value is bumped to 128
for certain CPU configurations, and kept at 32 for others. Those sorts
of values look a lot more like the L2 cache stride than the L1, perhaps
something to do with the snoop controller on exotic ccNUMA
configurations?
--
On Mon, 28 Jun 2010 09:55:58 -0500

How about using ARCH_KMALLOC_MINALIGN instead of L1_CACHE_BYTES?

In the previous merge window, we made sure that all the architectures
define the minimum alignment and width of DMA properly (and the fully
coherent architectures don't define ARCH_KMALLOC_MINALIGN).

dma_get_cache_alignment should be equal to ARCH_KMALLOC_MINALIGN if an
architecture defines ARCH_KMALLOC_MINALIGN (probably,
dma_get_cache_alignment() can be implemented in the common place with
ARCH_KMALLOC_MINALIGN. It would be better to rename
ARCH_KMALLOC_MINALIGN to something like ARCH_DMA_MINALIGN).

It might be better to place DMA_ALIGN(x) in the common place. Seems that
some drivers wrongly use L1_CACHE_ALIGN() to get the dma alignment.
Well, using cache alignment magic in drivers isn't a good idea though...

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] 53c700: remove dma_is_consistent usage in 53c700

ARCH_KMALLOC_MINALIGN returns the minimum alignment and width of DMA on
architectures that define ARCH_KMALLOC_MINALIGN (if it's not defined,
architectures are fully coherent). So we can use ARCH_KMALLOC_MINALIGN
instead of L1_CACHE_BYTES and safely remove the alignment checking.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/scsi/53c700.c |    1 -
 drivers/scsi/53c700.h |   17 ++++++++++++-----
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 80dc3ac..f5fd923 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -311,7 +311,6 @@ NCR_700_detect(struct scsi_host_template *tpnt,
 	hostdata->status = memory + STATUS_OFFSET;
 	/* all of these offsets are L1_CACHE_BYTES separated. It is fatal
 	 * if this isn't sufficient separation to avoid dma flushing issues */
-	BUG_ON(!dma_is_consistent(hostdata->dev, pScript) && L1_CACHE_BYTES < dma_get_cache_alignment());
 	hostdata->slots = (struct NCR_700_command_slot *)(memory + ...
Actually, I'd rather not do this. The reason is that L1_CACHE_ALIGN is
quite a big performance optimisation on x86 for the driver. Without it,
it's functionally correct, but the DMA use of the mailboxes really
thrashes the cache, which damages performance (x86 has
ARCH_KMALLOC_MINALIGN set to 8 ... the default).

The only correctness problem, which the BUG is checking for, is a
mismatch in dma alignment ... as I said, I'm happy just to rely on that
being correct on every incoherent platform the driver operates on.

James
--
On Tue, 29 Jun 2010 08:37:35 -0500

Ah, I see. If slab.h doesn't define ARCH_KMALLOC_MINALIGN for
architectures that don't define it, the driver could do something like:

#ifdef ARCH_KMALLOC_MINALIGN
#define DMA_ALIGN(x) ALIGN(x, ARCH_KMALLOC_MINALIGN)
#else
#define DMA_ALIGN(x) ALIGN(x, L1_CACHE_BYTES)
#endif

Seems that it's better to rename ARCH_KMALLOC_MINALIGN to something
like ARCH_DMA_MINALIGN and make ARCH_KMALLOC_MINALIGN the slab

Ok, it's fine by me too. Let's simply remove the BUG_ON.

I think that you want to document that dma_get_cache_alignment() cannot
be greater than the L1 cache stride. However, seems that
dma_get_cache_alignment() is greater than L1_CACHE_BYTES on some
architectures (they have some reasons, I assume). So I'll just remove
the BUG_ON.
--
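
For readers who want to try the DMA_ALIGN() fallback idea outside the
kernel, here is a rough userspace sketch. It is not the driver's code:
ALIGN_UP() is a local stand-in for the kernel's ALIGN(), the value 64
for L1_CACHE_BYTES is an assumption, and ARCH_KMALLOC_MINALIGN is
deliberately left undefined to model a fully coherent architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's ALIGN(); a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Assumed value for illustration only; the real one is per-arch. */
#define L1_CACHE_BYTES 64

static_assert((L1_CACHE_BYTES & (L1_CACHE_BYTES - 1)) == 0,
	      "alignment must be a power of two");

/*
 * ARCH_KMALLOC_MINALIGN is intentionally not defined here, so the
 * L1_CACHE_BYTES branch is selected. Defining it (for example to 128)
 * before this point would select the DMA alignment instead.
 */
#ifdef ARCH_KMALLOC_MINALIGN
#define DMA_ALIGN(x) ALIGN_UP(x, ARCH_KMALLOC_MINALIGN)
#else
#define DMA_ALIGN(x) ALIGN_UP(x, L1_CACHE_BYTES)
#endif
```

Under these assumptions an 8-byte mailbox is padded to one 64-byte cache
line, which keeps the cache-line separation on coherent architectures
while letting non-coherent ones use their DMA alignment.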
