On Tue, 20 May 2008, Benjamin Herrenschmidt wrote:
Depends on what you define as "necessary". It seems clear that I/O accessors
_do not_ need to be strictly ordered with respect to normal memory accesses,
by what's defined in memory-barriers.txt. So if by "necessary" you mean what
the Linux standard for I/O accessors requires (and what other archs provide),
then yes, they have the necessary ordering guarantees.
But if you want them to be strictly ordered w.r.t. normal memory, that's
not the case.
For example, in something like:

    u32 *dmabuf = kmalloc(...);
    ...
    dmabuf[0] = 1;
    out_be32(&regs->dmactl, DMA_SEND_BUFFER);
    dmabuf[0] = 2;
    out_be32(&regs->dmactl, DMA_SEND_BUFFER);

gcc might decide to optimize this code to:

    out_be32(&regs->dmactl, DMA_SEND_BUFFER);
    out_be32(&regs->dmactl, DMA_SEND_BUFFER);
    dmabuf[0] = 2;

gcc will often not do this optimization, because there might be aliasing
between "&regs->dmactl" and "dmabuf", but it _can_ do it. gcc can't merge
the two identical out_be32's into one, or re-order them if they were to
different registers, but it can move the normal memory accesses around them.
Here's a quick hack I stuck in a driver to test. Compile with -save-temps and
check the resulting asm. gcc will do the optimization I described above.

    static void __iomem *baz = (void*)0x1234;
    static struct bar {
            u32 bar[256];
    } bar;

    void foo(void) {
            bar.bar[0] = 44;
            out_be32(baz+100, 200);
            bar.bar[0] = 45;
            out_be32(baz+101, 201);
    }
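
For anyone who wants to look at the effect without building a driver, here is a rough userspace approximation of the same test. It is only a sketch: mock_out32(), mock_out32_ordered(), compiler_barrier(), fake_reg and fake_buf are names made up for illustration, and the volatile store merely stands in for an accessor that has no "memory" clobber. It is not the real out_be32(). The barrier is a compiler-only barrier and says nothing about ordering done by the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax), same idea as the kernel's barrier().
 * It tells the compiler that memory may be read or written at this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for a device register and a DMA-style buffer. */
static volatile uint32_t fake_reg;
static uint32_t fake_buf[256];

/* Volatile store only: ordered against other volatile accesses,
 * but not against plain stores such as fake_buf[0] = ... */
static inline void mock_out32(volatile uint32_t *addr, uint32_t val)
{
    *addr = val;
}

/* Same store, fenced by compiler barriers so that plain stores
 * should neither be moved across it nor dropped as dead. */
static inline void mock_out32_ordered(volatile uint32_t *addr, uint32_t val)
{
    compiler_barrier();
    *addr = val;
    compiler_barrier();
}

/* The compiler is allowed to drop the store of 44 here. */
uint32_t foo_unordered(void)
{
    fake_buf[0] = 44;
    mock_out32(&fake_reg, 200);
    fake_buf[0] = 45;
    mock_out32(&fake_reg, 201);
    return fake_buf[0];
}

/* Here each buffer store should be emitted before the register
 * write that follows it. */
uint32_t foo_ordered(void)
{
    fake_buf[0] = 44;
    mock_out32_ordered(&fake_reg, 200);
    fake_buf[0] = 45;
    mock_out32_ordered(&fake_reg, 201);
    return fake_buf[0];
}

/* Single-threaded results are identical for both variants;
 * the difference only shows up in the generated assembly. */
void self_check(void)
{
    assert(foo_unordered() == 45);
    assert(foo_ordered() == 45);
    assert(fake_reg == 201);
}
```

Compiling this with optimization and -S (or -save-temps) and comparing the stores to fake_buf in the two functions should show the difference, though what gcc actually emits depends on the version and flags.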
