Re: wrong usage of MAX_DMA_ADDRESS in bootmem.h

Previous thread: kernel.org missing .1 incremental patches by Andrew Lyon on Tuesday, September 30, 2008 - 11:51 am. (3 messages)

Next thread: [RFC patch 0/3] signals: add rt_tgsigqueueinfo syscall by Thomas Gleixner on Tuesday, September 30, 2008 - 12:48 pm. (6 messages)
From: Nicolas Pitre
Date: Tuesday, September 30, 2008 - 12:35 pm

I have implemented highmem for ARM.  To catch wrong usage of __pa() and 
__va() with out-of-range values, I added a range check when 
CONFIG_DEBUG_HIGHMEM is set.

One issue is that bootmem.h uses __pa(MAX_DMA_ADDRESS). However 
MAX_DMA_ADDRESS on ARM is defined as 0xffffffff because there is usually 
no restriction on the maximum DMA-able address.
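
To make the failure concrete, here is a minimal user-space model of a range-checked __pa(). It is only a sketch, not the actual ARM highmem code: the names checked_pa() and va_in_linear_map(), the 3G/1G PAGE_OFFSET, the zero PHYS_OFFSET and the 512 MB lowmem size are all assumptions made for illustration. Only the MAX_DMA_ADDRESS value comes from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of a 32-bit machine; none of these values come from the thread. */
#define PAGE_OFFSET      UINT32_C(0xc0000000)  /* start of the kernel linear map */
#define PHYS_OFFSET      UINT32_C(0x00000000)  /* physical address of the first RAM byte */
#define LOWMEM_SIZE      UINT32_C(0x20000000)  /* 512 MB of directly mapped RAM */

/* The ARM definition described in the message above. */
#define MAX_DMA_ADDRESS  UINT32_C(0xffffffff)

/* True when vaddr lies inside the linear (lowmem) mapping. */
static bool va_in_linear_map(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LOWMEM_SIZE;
}

/*
 * Hypothetical checked variant of __pa(): stores the physical address in
 * *paddr and returns true, or returns false for an out-of-range input.
 */
static bool checked_pa(uint32_t vaddr, uint32_t *paddr)
{
    if (!va_in_linear_map(vaddr))
        return false;
    *paddr = vaddr - PAGE_OFFSET + PHYS_OFFSET;
    return true;
}
```

Under these assumptions checked_pa(MAX_DMA_ADDRESS, &pa) returns false, because 0xffffffff is far above the end of the linear map. That is the kind of input the bootmem.h macros feed to __pa().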

RMK suggested that those places should be using ISA_DMA_THRESHOLD.

So what about this patch?

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 95837bf..7a97ffe 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -96,21 +96,21 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 				      unsigned long goal);
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 #define alloc_bootmem(x) \
-	__alloc_bootmem(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem(x, SMP_CACHE_BYTES, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_nopanic(x) \
-	__alloc_bootmem_nopanic(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_nopanic(x, SMP_CACHE_BYTES, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_low(x) \
 	__alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
 #define alloc_bootmem_pages(x) \
-	__alloc_bootmem(x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem(x, PAGE_SIZE, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_pages_nopanic(x) \
-	__alloc_bootmem_nopanic(x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_nopanic(x, PAGE_SIZE, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_low_pages(x) \
 	__alloc_bootmem_low(x, PAGE_SIZE, 0)
 #define alloc_bootmem_node(pgdat, x) \
-	__alloc_bootmem_node(pgdat, x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_node(pgdat, x, SMP_CACHE_BYTES, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_pages_node(pgdat, x) \
-	__alloc_bootmem_node(pgdat, x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_node(pgdat, x, PAGE_SIZE, ISA_DMA_THRESHOLD)
 #define alloc_bootmem_low_pages_node(pgdat, x) \
 	__alloc_bootmem_low_node(pgdat, x, PAGE_SIZE, 0)
 #endif /* ...
From: Christoph Lameter
Date: Tuesday, September 30, 2008 - 12:56 pm

ok so do

#define MAX_DMA_ADDRESS ISA_DMA_THRESHOLD


MAX_DMA_ADDRESS is the highest address used for ZONE_DMA / GFP_DMA

Does ISA_DMA_THRESHOLD have any meaning on ARM? If you use old ISA stuff then
you need CONFIG_ZONE_DMA and therefore also MAX_DMA_ADDRESS.

If not then there is no need to define CONFIG_ZONE_DMA and MAX_DMA_ADDRESS
loses its usual meaning.
--

From: Russell King - ARM Linux
Date: Tuesday, September 30, 2008 - 1:12 pm

Not correct.  MAX_DMA_ADDRESS is a virtual address.  ISA_DMA_THRESHOLD
is the last byte of _physical_ memory which ISA DMA can transfer:

include/asm-x86/scatterlist.h:#define ISA_DMA_THRESHOLD (0x00ffffff)


Incorrect.  MAX_DMA_ADDRESS is the highest possible virtual DMA address:

include/asm-x86/dma.h:#define MAX_DMA_ADDRESS      (PAGE_OFFSET + 0x1000000)
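
The relation between the two x86 definitions quoted above can be checked with a small user-space model. This is a sketch under assumptions: model_pa() is a hypothetical stand-in for __pa() on a plain linear map, and the PAGE_OFFSET value is an assumed 3G/1G split. It cancels out of the result.

```c
#include <stdint.h>

#define PAGE_OFFSET        UINT32_C(0xc0000000)  /* assumed 3G/1G split */

/* The two x86 definitions quoted above. */
#define ISA_DMA_THRESHOLD  UINT32_C(0x00ffffff)                 /* physical, last reachable byte */
#define MAX_DMA_ADDRESS    (PAGE_OFFSET + UINT32_C(0x1000000))  /* virtual, first unreachable byte */

/* Hypothetical stand-in for __pa() on a simple linear mapping. */
static uint32_t model_pa(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}
```

With these values model_pa(MAX_DMA_ADDRESS) is 0x1000000, which is ISA_DMA_THRESHOLD + 1. So the two constants differ in address space (virtual versus physical) and also by one byte (exclusive versus inclusive limit).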


As we have already covered in the past, CONFIG_ZONE_DMA has to always
be enabled on ARM because ARM always puts all memory in the first zone.
To do otherwise introduces lots of special cases, and I steadfastly
refuse to make the memory initialisation any more complicated than it
already is.

And besides, this has nothing to do with that issue.
--

From: Nicolas Pitre
Date: Tuesday, September 30, 2008 - 2:09 pm

I just tried this:

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 70dba16..8f609cc 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -148,7 +148,6 @@ config ARCH_MAY_HAVE_PC_FDC
 
 config ZONE_DMA
 	bool
-	default y
 
 config GENERIC_ISA_DMA
 	bool

with no other changes whatsoever.  And the resulting kernel still 
works fine, with this difference:

|On node 0 totalpages: 131072
|free_area_init_node: node 0, pgdat c03c5e00, node_mem_map c03e7000
|  Normal zone: 130048 pages, LIFO batch:31

instead of:

|On node 0 totalpages: 131072
|free_area_init_node: node 0, pgdat c03c7e58, node_mem_map c03e9000
|  DMA zone: 130048 pages, LIFO batch:31

And the resulting kernel is also smaller:

|   text    data     bss     dec     hex filename
|3826182  102384  111700 4040266  3da64a vmlinux
|3823593  101616  111700 4036909  3d992d vmlinux.nodmazone

So maybe CONFIG_ZONE_DMA could be selected only by those machines 

Indeed.  But still...


Nicolas
--

From: Christoph Lameter
Date: Wednesday, October 1, 2008 - 5:07 am

Someone screwed around with the basics here. MAX_DMA_ADDRESS is no longer
related to MAX_DMA_PFN for the x86_32 case. What is the point of relating
MAX_DMA_ADDRESS to PAGE_OFFSET? Looks like we are creating more confusion
about the strange DMA zone.

The best would be to rename these variables to make the semantics clearer

ZONE_DMA related variables:

MAX_DMA_PFN -> MAX_ZONE_DMA_PFN
MAX_DMA_ADDRESS -> MAX_ZONE_DMA_ADDRESS

MAX_DMA32_PFN -> MAX_ZONE_DMA32_PFN
MAX_DMA32_ADDRESS -> MAX_ZONE_DMA32_ADDRESS

Then the general DMAability


MAX_DMA_ADDRESS is the highest possible address for the DMA zone. Not the
highest possible address that any DMA controller can use. And now we have
special casing that makes the semantics different between 32 bit and 64 bit

I believe we have been over this. If you just have one zone then the core code
would expect you to disable CONFIG_ZONE_DMA and have all memory treated equal
in ZONE_NORMAL.

The naming seems to be the problem here. Maybe renaming ZONE_DMA to
ZONE_RESTRICTED_DMA or something would help. We are currently creating two
different paradigms of using these constants.



--

From: Russell King - ARM Linux
Date: Wednesday, October 1, 2008 - 7:06 am

Because it is a virtual address.  It has to be.  You're using __pa() on it,

That's no clearer.  Are they physical addresses?  Or are they virtual

Semantically disagree.

If you only have a controller which can address 1MB of memory (yes, they do
exist) then MAX_DMA_ADDRESS must be PAGE_OFFSET + 1MB, otherwise you have
precisely NO way to obtain memory from the kernel for this DMA controller
- and that means you want the DMA zone to be sized to 1MB.  So _indirectly_
it's true that MAX_DMA_ADDRESS is the highest possible address for the DMA
zone.
--

From: Christoph Lameter
Date: Wednesday, October 1, 2008 - 7:50 am

Ok. I agree you need to add a __va to it in order to convert it back later.
That was not really the point. MAX_DMA_PFN can be used as a base to calculate
MAX_DMA_ADDRESS. Both are related and currently some arches go one way and
others go the other way. It's particularly strange that x86_32 and x86_64 go
different ways. Can we unify that to one way only and put the definition of
MAX_DMA_ADDRESS in core code?
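
As a rough illustration of deriving one constant from the other, the following user-space sketch computes MAX_DMA_ADDRESS from MAX_DMA_PFN. The concrete values (4 KB pages, a 16 MB limit, a 3G/1G PAGE_OFFSET) and the names model_va() and MAX_DMA_PHYS are assumptions for the example, not definitions taken from any architecture.

```c
#include <stdint.h>

#define PAGE_SHIFT   12                    /* assumed 4 KB pages */
#define PAGE_OFFSET  UINT32_C(0xc0000000)  /* assumed 3G/1G split */
#define MAX_DMA_PFN  UINT32_C(0x1000)      /* assumed: 16 MB limit / 4 KB pages */

/* Hypothetical stand-in for __va() on a simple linear mapping. */
static uint32_t model_va(uint32_t paddr)
{
    return paddr + PAGE_OFFSET;
}

/* Physical limit derived from the PFN; MAX_DMA_PHYS is a made-up name. */
#define MAX_DMA_PHYS     (MAX_DMA_PFN << PAGE_SHIFT)

/* Virtual limit derived from the physical one, so the PFN is the single source. */
#define MAX_DMA_ADDRESS  model_va(MAX_DMA_PHYS)
```

Here MAX_DMA_PHYS evaluates to 0x1000000 and MAX_DMA_ADDRESS to 0xc1000000. Defining both in terms of the PFN is one way to keep the constants from drifting apart between architectures.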

Also as a result of the long type used for a kernel "virtual" address we have

It is clearer because the association with ZONE_DMA is in the name now. One no
longer has the impression that MAX_DMA_ADDRESS is upper bound of any DMA
transfer in the system.

Maybe we should make these physical addresses to avoid the casts? That would
avoid casts in the bootmem allocator etc etc.

--

From: Russell King - ARM Linux
Date: Wednesday, October 1, 2008 - 8:02 am

Finally, we agree on something.  Yes, they should be phys addresses.
But not for the sake of getting rid of casts, but because that's what
the bootmem allocator _actually_ wants to have in the first place.

And, to do this, the following are going to have to be changed:

drivers/block/floppy.c: } else if ((unsigned long)current_req->buffer < MAX_DMA_ADDRESS) {
drivers/block/floppy.c:          * Do NOT use minimum() here---MAX_DMA_ADDRESS is 64 bits wide
drivers/block/floppy.c:             (MAX_DMA_ADDRESS -
drivers/net/3c505.c:    if ((unsigned long)(target + rlen) >= MAX_DMA_ADDRESS) {
drivers/net/3c505.c:    if ((unsigned long)(skb->data + nlen) >= MAX_DMA_ADDRESS || nlen != skb->len) {
drivers/net/cs89x0.c:                   if ((unsigned long) lp->dma_buff >= MAX_DMA_ADDRESS ||
drivers/net/wan/cosa.c: if (b+len >= MAX_DMA_ADDRESS)
drivers/parport/parport_pc.c:   if (end < MAX_DMA_ADDRESS) {
drivers/scsi/BusLogic.c:        if (HostAdapter->HostAdapterBusType == BusLogic_ISA_Bus && (void *) high_memory > (void *)
drivers/scsi/BusLogic.c:        if (HostAdapter->BIOS_Address > 0 && strcmp(HostAdapter->ModelName, "BT-445S") == 0 && strc
sound/oss/dmabuf.c:                 || end_addr >= (char *) (MAX_DMA_ADDRESS)) {
sound/oss/sscape.c:                 || end_addr >= (char *) (MAX_DMA_ADDRESS)) {

which probably want to do the check in the phys address space anyway.
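
A driver-side check done in the physical address space might look roughly like the sketch below. It is a user-space model only: MAX_DMA_PHYS, model_virt_to_phys() and buffer_is_dma_reachable() are hypothetical names, and the 16 MB limit and PAGE_OFFSET are assumed values rather than anything proposed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET   UINT32_C(0xc0000000)  /* assumed 3G/1G split */
#define MAX_DMA_PHYS  UINT32_C(0x01000000)  /* hypothetical: first physical byte ISA DMA cannot reach */

/* Hypothetical stand-in for virt_to_phys() on a simple linear mapping. */
static uint32_t model_virt_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* True when the buffer [vaddr, vaddr + len) lies entirely below MAX_DMA_PHYS. */
static bool buffer_is_dma_reachable(uint32_t vaddr, uint32_t len)
{
    uint32_t start = model_virt_to_phys(vaddr);

    if (start >= MAX_DMA_PHYS)
        return false;
    /* Compare against the remaining room so that start + len cannot overflow. */
    return len <= MAX_DMA_PHYS - start;
}
```

The comparison is made entirely on physical addresses, so the driver no longer needs to cast its buffer pointer to unsigned long and compare it with a virtual MAX_DMA_ADDRESS.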
--

From: Christoph Lameter
Date: Thursday, October 2, 2008 - 9:49 am

Right. Let's do it. MAX_DMA32 needs to be changed too.

While you are at it: Could you make the association with the zones clearer?

MAX_ZONE_DMA32_ADDRESS, MAX_ZONE_DMA_ADDRESS, MAX_ZONE_DMA_PFN,
MAX_ZONE_DMA32_PFN ....
--

From: Russell King - ARM Linux
Date: Thursday, October 2, 2008 - 12:06 pm

If you want me to do it, expect it in about 6 to 12 months time.
--
