Amid some heavy flaming, it's clear that there is a lot of confusion about how
cachability and ioremap cooperate on x86 at the hardware level, and how this
interacts with Linux (both before 2.6.24 and in current trees).
This email tries to describe the various aspects and constraints involved,
in the hope of clearing up the confusion and making clear how Linux works,
both in the past and going forward.
(without degenerating into flames again; let's keep THIS thread technical please)
Cachable.. what does it mean?
-----------------------------
For the CPU, if a piece of memory is cachable (how it decides that I'll cover
later), it means that
1) The CPU is allowed to read the data from the memory into its cache at any
point in time, even if the program will never actually read the memory.
Hardware prefetching, speculative execution etc etc all can cause the cpu
to get content into its caches if it's cachable. The CPU is also allowed
to hold on to this content as long as it wants, until something in the
system forces the CPU to remove the cache line from its cache.
2) The CPU is allowed to write the contents from its cache back to memory at
any point in time, even if the program will never actually write to the
cacheline; the latter is the result of speculation etc; what will be written
in that case is the clean cacheline that was in the cache.
(AMD cpus seem to do this relatively aggressively; Intel cpus may or may
not do this)
3) The CPU is allowed to write a full cacheline without having read it; it
will just get the cacheline exclusive in this case.
4) The CPU is allowed to hold on to written cache lines without writing them
back for as long as it wants, until something in the cache coherency
protocol forces a commit or discard.
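
To make the four rules above concrete, here is a toy userspace model in C.
It is only a sketch under heavy assumptions: a single cache line, a single
CPU, and a "device" that is just a byte array outside any coherency protocol.
All names (toy_cache, toy_prefetch, demo_stale_read, ...) are made up for
illustration; none of this is kernel code or a description of how real
hardware is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64

/* Hypothetical toy model: a CPU cache that holds exactly one line. */
struct toy_cache {
	uint8_t data[LINE_SIZE];
	bool valid;	/* the cache currently holds a copy of the line */
	bool dirty;	/* the copy was modified since it was fetched */
};

/* Rule 1: the CPU may pull the line in at any time (prefetch, speculation). */
void toy_prefetch(struct toy_cache *c, const uint8_t *backing)
{
	if (!c->valid) {
		memcpy(c->data, backing, LINE_SIZE);
		c->valid = true;
		c->dirty = false;
	}
}

/* A program load: served from the cache whenever the line is present. */
uint8_t toy_read(struct toy_cache *c, const uint8_t *backing, size_t off)
{
	assert(off < LINE_SIZE);
	toy_prefetch(c, backing);
	return c->data[off];
}

/* Rule 3: a full-line store needs no prior read of the backing memory. */
void toy_write_line(struct toy_cache *c, const uint8_t *src)
{
	memcpy(c->data, src, LINE_SIZE);
	c->valid = true;
	c->dirty = true;	/* rule 4: may now sit here indefinitely */
}

/* Rules 2 and 4: the line may be written back at any time, clean or dirty. */
void toy_writeback(struct toy_cache *c, uint8_t *backing)
{
	if (c->valid) {
		memcpy(backing, c->data, LINE_SIZE);
		c->dirty = false;
	}
}

/* Something in the system forces the CPU to drop the line. */
void toy_invalidate(struct toy_cache *c)
{
	c->valid = false;
	c->dirty = false;
}

/* The CPU prefetches, the non-coherent device changes, the CPU reads. */
uint8_t demo_stale_read(void)
{
	uint8_t device[LINE_SIZE] = {0};
	struct toy_cache c = {0};

	device[0] = 1;
	toy_prefetch(&c, device);	/* the program never asked for this */
	device[0] = 2;			/* device updates behind the CPU's back */
	return toy_read(&c, device, 0);	/* served from cache: stale */
}

/* A clean line is written back on top of a device-side update. */
uint8_t demo_clean_writeback(void)
{
	uint8_t device[LINE_SIZE] = {0};
	struct toy_cache c = {0};

	device[0] = 1;
	toy_prefetch(&c, device);
	device[0] = 2;
	toy_writeback(&c, device);	/* the program never wrote anything */
	return device[0];
}
```

In this model demo_stale_read() returns 1 even though the device holds 2, and
demo_clean_writeback() leaves 1 in the device, wiping out the device's own
update. Only after toy_invalidate() does a read see the new value.
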
Practically speaking this means that a memory location that the cpu sees as
cachable needs to be on some device that takes part in the cache coherency
protocol, or be a very special case (such as ROM) that:
- The ...