Re: why choose 896MB to the start point of ZONE_HIGHMEM

From: hayfeng Lee
Date: Tuesday, April 6, 2010 - 7:37 am

Hello, everyone.
I have a question:
Why does Linux choose 896 MB as the start of ZONE_HIGHMEM and
the end of ZONE_NORMAL? Is it just from experience?
What are the advantages?
--

From: Joel Fernandes
Date: Tuesday, April 6, 2010 - 8:02 am

Hi Hayfeng,


This is not an advantage but a limitation of the 32-bit processor and
architecture. Only physical memory in the first 896 MB is directly mapped
into the kernel virtual address space. This is called
ZONE_NORMAL. To access any physical memory in ZONE_HIGHMEM, the kernel
has to set up page table entries to indirectly map the physical memory
to a virtual address (I think around 128 MB or so worth of page
table entries are reused for this purpose). On the other hand, on 64-bit
architectures, the entire physical memory is directly mapped and
accessible to the kernel. ZONE_HIGHMEM doesn't exist on 64-bit.

Take the above with a grain of salt; someone with better knowledge
of this intricate topic can give a more detailed explanation :)

Hope this helps, thanks,
-Joel
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 11:17 am

The ELF ABI specifies that user space has 3 GB available to it.  That
leaves 1 GB for the kernel.  The kernel, by default, uses 128 MB for I/O
mapping, vmalloc, and kmap support, which leaves 896 MB for LOWMEM.
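
The arithmetic above can be sketched in a few lines. This is only an
illustration, not kernel code: the name `nominal_lowmem_mb` is made up for
this example, and the real kernel also sets aside a few smaller regions near
the top of the address space, so the exact LOWMEM limit it reports can differ
slightly from these nominal numbers.

```python
# Nominal 32-bit address-space arithmetic (sizes in MB).
# All names here are illustrative, not kernel identifiers.

ADDRESS_SPACE_MB = 4096      # 2**32 bytes of virtual address space
VMALLOC_RESERVE_MB = 128     # default reserve for I/O mapping, vmalloc, kmap


def nominal_lowmem_mb(user_space_mb, reserve_mb=VMALLOC_RESERVE_MB):
    """Return the nominal LOWMEM ceiling for a given user-space size."""
    kernel_space_mb = ADDRESS_SPACE_MB - user_space_mb
    if reserve_mb >= kernel_space_mb:
        raise ValueError("reserve does not fit in the kernel address space")
    return kernel_space_mb - reserve_mb


if __name__ == "__main__":
    # 3G/1G is the default split; 2G/2G and 1G/3G are other splits a
    # 32-bit kernel can be built with.
    for user_mb in (3072, 2048, 1024):
        print(f"user {user_mb} MB -> LOWMEM up to {nominal_lowmem_mb(user_mb)} MB")
```

With 3 GB of user space this gives the familiar 896 MB; shrinking user space
raises the ceiling accordingly.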

All of these boundaries are configurable; with PAE enabled the user
space boundary has to be on a 1 GB boundary.

	-hpa
--

From: Frank Hu
Date: Tuesday, April 6, 2010 - 12:20 pm

the VM split is also configurable when building the kernel (for 32-bit
processors).
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 12:44 pm

I did say "all these boundaries are configurable".  Rather explicitly.

	-hpa
--

From: Joel Fernandes
Date: Tuesday, April 6, 2010 - 1:01 pm

Hi Peter,


I thought the 896 MB was a hardware limitation on 32 bit architectures
and something that cannot be configured? Or am I missing something
here? Also, the VM splits refer to "virtual memory", while ZONE_* and
the 896 MB we were discussing refer to "physical memory". How then is
discussing VM splits pertinent here?

Thanks,
-Joel
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 1:04 pm

It's not a hardware limitation.  Rather, it has to do with how the 4 GB
of virtual address space is carved up.  LOWMEM specifically refers to
the amount of memory which is permanently mapped into the virtual
address space, whereas HIGHMEM is mapped in and out on demand -- a
fairly expensive operation.
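
To make the distinction concrete, here is a toy model, assuming the default
x86-32 layout where kernel space starts at 0xC0000000. Nothing below is a
kernel API: `lowmem_virt` and `ToyKmapWindow` are hypothetical names, and the
window base and slot count are arbitrary. It only shows that a LOWMEM address
is found by adding a constant, while a HIGHMEM page first has to be given a
slot in a small window and has to give it back afterwards.

```python
PAGE_SIZE = 4096
PAGE_OFFSET = 0xC0000000          # assumed start of kernel space (3G/1G split)
LOWMEM_LIMIT = 896 * 1024 * 1024  # nominal end of the permanent mapping


def lowmem_virt(phys):
    """Permanent mapping: the virtual address is physical plus a constant."""
    if not 0 <= phys < LOWMEM_LIMIT:
        raise ValueError("not LOWMEM: needs a temporary mapping")
    return PAGE_OFFSET + phys


class ToyKmapWindow:
    """A tiny pool of virtual slots that HIGHMEM pages are mapped into."""

    def __init__(self, base, slots):
        self.base = base
        self.free = list(range(slots))   # unused slot indices
        self.mapped = {}                 # physical page address -> slot index

    def map(self, phys_page):
        """Give a physical page a slot and return its virtual address."""
        if phys_page in self.mapped:
            return self.base + self.mapped[phys_page] * PAGE_SIZE
        if not self.free:
            raise RuntimeError("window full: unmap something first")
        slot = self.free.pop(0)
        self.mapped[phys_page] = slot
        return self.base + slot * PAGE_SIZE

    def unmap(self, phys_page):
        """Release the slot so another page can use it."""
        self.free.append(self.mapped.pop(phys_page))


if __name__ == "__main__":
    print(hex(lowmem_virt(0x00100000)))          # page at 1 MB physical
    window = ToyKmapWindow(base=PAGE_OFFSET + LOWMEM_LIMIT, slots=4)
    high_page = 1024 * 1024 * 1024               # page at 1 GB physical
    print(hex(window.map(high_page)))
    window.unmap(high_page)
```

The extra bookkeeping in `map` and `unmap` stands in for the page table
updates that make HIGHMEM access more expensive than LOWMEM access.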

	-hpa
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 4:32 pm

If there is less than 896 MB of physical memory, the vmalloc region is
automatically extended (in your case, it will be 768 MB in size.)  There
will be no HIGHMEM in such a case, and if you are compiling your own
kernel you will gain considerable speed by disabling HIGHMEM support
completely.
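
A rough sketch of that rule, using the nominal numbers from this thread
(1 GB of kernel address space, 128 MB default reserve). The function
`kernel_layout_mb` is hypothetical, and the 256 MB machine in the usage lines
is an assumption, chosen because it is the RAM size consistent with the
768 MB figure mentioned above.

```python
KERNEL_SPACE_MB = 1024       # kernel share of the address space (3G/1G split)
VMALLOC_RESERVE_MB = 128     # default minimum for the dynamically mapped area


def kernel_layout_mb(phys_mb):
    """Return (lowmem, highmem, dynamic_area) in MB for a given RAM size."""
    lowmem_max = KERNEL_SPACE_MB - VMALLOC_RESERVE_MB
    lowmem = min(phys_mb, lowmem_max)          # permanently mapped part
    highmem = phys_mb - lowmem                 # whatever cannot stay mapped
    dynamic_area = KERNEL_SPACE_MB - lowmem    # grows when RAM is small
    return lowmem, highmem, dynamic_area


if __name__ == "__main__":
    for ram_mb in (256, 896, 4096):            # example RAM sizes (assumed)
        low, high, dyn = kernel_layout_mb(ram_mb)
        print(f"RAM {ram_mb} MB: LOWMEM {low}, HIGHMEM {high}, vmalloc area {dyn}")
```

Only machines with more RAM than the LOWMEM ceiling end up with any HIGHMEM
at all, which is why HIGHMEM support can be switched off on small systems.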

This, of course, was the norm back when Linux was first created, and a
typical amount of memory was 8 MB or so.  That we'd have gigabytes of
memory seemed very distant at the time.

	-hpa
--

From: Xianghua Xiao
Date: Tuesday, April 6, 2010 - 6:47 pm

If the last 128 MB of the kernel's 1 GB space is used for highmem,
and meanwhile it's also used for I/O and vmalloc, how does this work?

Xianghua

--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 7:09 pm

Not quite.

The vmalloc region is for *anything which is dynamically mapped*, which
includes I/O, vmalloc, and HIGHMEM (kmap).

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--

From: Xianghua Xiao
Date: Wednesday, April 7, 2010 - 5:10 am

On Wed, Apr 7, 2010 at 12:48 AM, Venkatram Tummala

Thanks Venkatram, do these sound right:

1. All HIGHMEM (physical addresses beyond 896 MB) is kmapped back into
the last 128 MB of kernel "virtual" address space (using page tables stored
in the last 128 MB of physical address space). That also implies it's a very
limited virtual space for a large-memory system, and you need to kunmap when
you're done with it (so you can kmap other physical memory in).
I'm not familiar with large-memory systems and am not sure how kmap copes
with that using this limited 128 MB window, assuming the kernel uses a 1:3 split.

2. The last 128 MB of physical address space can be used for page tables (kmap),
vmalloc, I/O, etc.

Regards,
Xianghua
--

From: H. Peter Anvin
Date: Wednesday, April 7, 2010 - 10:14 am

Wrong.  I have to say this thread has been just astonishing in the
amount of misinformation.

On MIPS32, userspace is 0-2 GB, kseg0 is 2.0-2.5 GB and kseg1 is 2.5-3.0
GB.  kseg2/3 (3.0-4.0 GB), which invokes the TLB, is used for the
vmalloc/iomap/kmap area.

LOWMEM has to fit inside kseg0, so LOWMEM is limited to 512 MB in the
current Linux implementation.
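
The MIPS32 ranges listed above can be written down as a small lookup. This is
a sketch only: `mips32_segment` and `segment_size_mb` are made-up helpers, and
the table simply restates the boundaries from this message, treating kseg2
and kseg3 as one TLB-mapped region.

```python
# (name, start, end) with end exclusive; boundaries as given in this message.
MIPS32_SEGMENTS = [
    ("userspace", 0x00000000, 0x80000000),    # 0 - 2 GB
    ("kseg0",     0x80000000, 0xA0000000),    # 2.0 - 2.5 GB
    ("kseg1",     0xA0000000, 0xC0000000),    # 2.5 - 3.0 GB
    ("kseg2/3",   0xC0000000, 0x100000000),   # 3.0 - 4.0 GB, goes through the TLB
]


def mips32_segment(vaddr):
    """Return the name of the segment a 32-bit virtual address falls in."""
    for name, start, end in MIPS32_SEGMENTS:
        if start <= vaddr < end:
            return name
    raise ValueError("address is outside the 32-bit range")


def segment_size_mb(name):
    """Return the size of a named segment in MB."""
    for seg_name, start, end in MIPS32_SEGMENTS:
        if seg_name == name:
            return (end - start) >> 20
    raise KeyError(name)


if __name__ == "__main__":
    print(mips32_segment(0x80001000))
    print(segment_size_mb("kseg0"))   # the ceiling for LOWMEM described above
```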

	-hpa


--

From: Nobin Mathew
Date: Wednesday, April 7, 2010 - 7:23 pm

http://www.johnloomis.org/microchip/pic32/memory/memory.html

So what is the memory division here in mips, again 1:3?

kseg2 is already 1 GB address space?


-Nobin
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 10:28 pm

Correct.

	-hpa

--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 11:04 pm

No, we still need page tables for the identity-mapped segment.

	-hpa


--

From: tek-life
Date: Wednesday, April 7, 2010 - 7:11 am

Dear Venkatram,

Thanks for your kindness and detailed explanation.

So in your opinion, 896 MB was chosen just as a balance?
Then I want to know whether the decision for 896 was based on a lot
of experiments.
I think this is an important point.

Best wishes.

--

From: Frank Hu
Date: Tuesday, April 6, 2010 - 1:15 pm

I thought that you could only configure how to split the VM, like 1G/3G or
2G/2G, but that the DMA zone size and the 128 MB space for I/O are not
configurable. The NORMAL zone size would then be derived from the VM
split and the hard-coded DMA zone and 128 MB space sizes.

I am not a guru in this space... so I might be wrong.
--

From: H. Peter Anvin
Date: Tuesday, April 6, 2010 - 1:18 pm

And you are.  The vmalloc zone (not DMA zone -- that's something
entirely different) is configurable via the vmalloc= kernel command line
option.
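
For example, booting with `vmalloc=256M` asks for a 256 MB area instead of
the default 128 MB. The sketch below only illustrates the resulting nominal
arithmetic for a 3G/1G split; `parse_size_bytes` and `lowmem_limit_mb` are
hypothetical helpers, not how the kernel parses its options, and the limit
the kernel actually reports may differ somewhat from these round numbers.

```python
SUFFIX_TO_BYTES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size_bytes(text):
    """Parse a size such as '256M' or '1G' into bytes (K/M/G suffixes only)."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIX_TO_BYTES:
        return int(text[:-1]) * SUFFIX_TO_BYTES[text[-1]]
    return int(text)


def lowmem_limit_mb(cmdline, kernel_space_mb=1024, default_reserve_mb=128):
    """Nominal LOWMEM ceiling in MB for a given kernel command line."""
    reserve_bytes = default_reserve_mb << 20
    for word in cmdline.split():
        if word.startswith("vmalloc="):
            reserve_bytes = parse_size_bytes(word[len("vmalloc="):])
    return kernel_space_mb - (reserve_bytes >> 20)


if __name__ == "__main__":
    # Example command lines; the root= value is just a placeholder.
    for cmdline in ("root=/dev/sda1 quiet", "root=/dev/sda1 vmalloc=256M"):
        print(f"{cmdline!r}: LOWMEM up to {lowmem_limit_mb(cmdline)} MB")
```

A larger vmalloc area therefore comes directly out of LOWMEM, and vice versa.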

	-hpa
--

From: Krzysztof Halasa
Date: Wednesday, April 7, 2010 - 5:16 am

Only the DMA zone size is fixed - a hardware property of PC/AT-style DMA
controllers and the ISA bus.
-- 
Krzysztof Halasa
--
