Background on ioremap, caching, cache coherency on x86

From: Arjan van de Ven
Date: Tuesday, April 29, 2008 - 11:32 am

Amidst some heavy flaming, it's clear that there is a lot of confusion about
how cachability and ioremap cooperate on x86 at the hardware level, and how
this interacts with Linux (both before 2.6.24 and in current trees).
This email tries to describe the various aspects and constraints involved,
in the hope of clearing up the confusion and making clear how Linux works,
both in the past and going forward.
(without degenerating into flames again; let's keep THIS thread technical please)

Cachable.. what does it mean?
-----------------------------
For the CPU, if a piece of memory is cachable (how it decides that I'll cover
later), it means that
1) The CPU is allowed to read the data from the memory into its cache at any
    point in time, even if the program will never actually read the memory.
    Hardware prefetching, speculative execution etc. can all cause the cpu
    to get content into its caches if it's cachable. The CPU is also allowed
    to hold on to this content as long as it wants, until something in the
    system forces the CPU to remove the cache line from its cache.
2) The CPU is allowed to write the contents from its cache back to memory at
    any point in time, even if the program will never actually write to the
    cache line; the latter is the result of speculation etc. What will be
    written in that case is the clean cache line that was in the cache.
    (AMD cpus seem to do this relatively aggressively; Intel cpus may or may
    not do this)
3) The CPU is allowed to write a full cache line without having read it; it
    will just get the cache line exclusive in this case.
4) The CPU is allowed to hold on to written cache lines without writing them
    back for as long as it wants, until something in the cache coherency
    protocol forces a commit or discard.
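
To make the consequences of these four points concrete, here is a small
user-space toy model in C. It is only a sketch under heavy assumptions: one
cache line, one CPU, and a device that reads and writes memory directly
without taking part in any coherency protocol. All names (toy_system,
cpu_fill, cpu_writeback, device_read, ...) are made up for illustration and
are not kernel or hardware interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64

/* Toy model (hypothetical): one line of memory plus the CPU's copy of it. */
struct toy_system {
    uint8_t memory[LINE_SIZE];  /* what a non-coherent device sees */
    uint8_t cache[LINE_SIZE];   /* the CPU's private copy of the line */
    int valid;                  /* the cache currently holds the line */
    int dirty;                  /* the cached copy was modified by the CPU */
};

/* Point 1: the CPU may pull the line into its cache at any time. */
void cpu_fill(struct toy_system *s)
{
    if (!s->valid) {
        memcpy(s->cache, s->memory, LINE_SIZE);
        s->valid = 1;
        s->dirty = 0;
    }
}

/* Reads are served from the cache once the line is present. */
uint8_t cpu_read(struct toy_system *s, size_t off)
{
    cpu_fill(s);
    return s->cache[off];
}

/* Point 4: a write lands in the cache only; memory is not updated yet. */
void cpu_write(struct toy_system *s, size_t off, uint8_t val)
{
    cpu_fill(s);
    s->cache[off] = val;
    s->dirty = 1;
}

/* Point 2: the CPU may write the line back at any time, even if it is clean. */
void cpu_writeback(struct toy_system *s)
{
    if (s->valid) {
        memcpy(s->memory, s->cache, LINE_SIZE);
        s->dirty = 0;
    }
}

/* Something forces the line out of the cache; dirty data is committed first. */
void cpu_evict(struct toy_system *s)
{
    if (s->valid && s->dirty)
        memcpy(s->memory, s->cache, LINE_SIZE);
    s->valid = 0;
    s->dirty = 0;
}

/* The non-coherent device bypasses the cache entirely. */
uint8_t device_read(const struct toy_system *s, size_t off)
{
    return s->memory[off];
}

void device_write(struct toy_system *s, size_t off, uint8_t val)
{
    s->memory[off] = val;
}
```

In this model, a value stored with cpu_write() is invisible to device_read()
until cpu_evict() or cpu_writeback() runs. Likewise, a device_write() that
happens while the CPU holds the line is not seen by cpu_read(), and a later
cpu_writeback() of the clean line silently overwrites what the device wrote.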

Practically speaking this means that a memory location that the cpu sees as
cachable needs to be on some device that takes part in the cache coherency
protocol, or be a very special case (such as ROM) that:
  - The ...