Re: [PATCH] mark read_crX() asm code as volatile

Previous thread: [patch] menuconfig - lift the fs menu by Jan Engelhardt on Tuesday, October 2, 2007 - 9:51 am. (3 messages)

Next thread: [PATCH] Consolidate IPC namespace cleanup functions by Pavel Emelyanov on Tuesday, October 2, 2007 - 10:17 am. (2 messages)
To: Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 10:08 am

Some gcc versions (I checked at least 4.1.1 from RHEL5 & 4.1.2 from gentoo)
can generate incorrect code with read_crX()/write_crX() functions mix up,
due to cached results of read_crX().

The small app for x8664 below compiled with -O2 demonstrates this
(i686 does the same thing):

------------------- cut -----------------------
static inline unsigned long read_cr3(void)
{
unsigned long cr3;
asm("movq %%cr3,%0" : "=r" (cr3));
return cr3;
}

static inline void write_cr3(unsigned long val)
{
asm volatile("movq %0,%%cr3" :: "r" (val) : "memory");
}

void main()
{
unsigned long c;
c = read_cr3();
write_cr3(c | 0x80);
c = read_cr3();
write_cr3(c | 0x100);
}
------------------- cut -----------------------

# objdump -dr tst
....
0000000000400430 <main>:
400430: 0f 20 d8 mov %cr3,%rax
400433: 48 89 c2 mov %rax,%rdx
400436: 80 ca 80 or $0x80,%dl
400439: 0f 22 da mov %rdx,%cr3
40043c: 80 cc 01 or $0x1,%ah
40043f: 0f 22 d8 mov %rax,%cr3
400442: c3 retq
....

As one can notice, cr3 value is incorrectly read only once,
and finally updated with only one bit set.

So better be on the safe side and mark all asm statements
in read_crX() functions as volatile which helps.
i686 already has most of these functions marked as volatile already.

I faced this bug myself in i686 arch code when did code
rearrangement in 2.6.18.

Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Acked-By: Pavel Emelianov <xemul@openvz.org>

---

asm-i386/system.h | 2 +-
asm-x86_64/system.h | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)

--- ./include/asm-i386/system.h.ve4321 2007-10-02 17:09:53.000000000 +0400
+++ ./include/asm-i386/system.h 2007-10-02 17:39:40....

To: Kirill Korotaev <dev@...>
Cc: Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 11:28 am

On Tue, 02 Oct 2007 18:08:32 +0400

I'm not so sure volatile is the right answer, as compared to giving the
asm more strict contraints....

asm volatile tends to mean something else than "the result has
changed"....

-

To: Arjan van de Ven <arjan@...>
Cc: Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>, <cebbert@...>, <hpa@...>, <nickpiggin@...>
Date: Wednesday, October 3, 2007 - 4:25 am

Arjan,

I can experiment with any constraints if you suggest which one.

From our experiments with gcc, it compares asm strings (sic!!!) to find matches
to be merged! Sigh...
Below are 2 programs which differ in one space in read_cr3_b() asm statement.
The first one compiles incorrectly, while 2nd one - correctly.

My personal feeling is that comparing asm strings is simply a "misfeature".

-------------------------- cut ----------------------------
static inline unsigned long read_cr3_a(void)
{
unsigned long cr3;
asm("movq %%cr3,%0" : "=r" (cr3));
return cr3;
}

static inline unsigned long read_cr3_b(void)
{
unsigned long cr3;
asm("movq %%cr3,%0" : "=r" (cr3));
return cr3;
}

static inline void write_cr3(unsigned long val)
{
asm volatile("movq %0,%%cr3" :: "r" (val) : "memory");
}

void main()
{
unsigned long c;
c = read_cr3_a();
write_cr3(c | 0x80);
c = read_cr3_b();
write_cr3(c | 0x100);
}
-------------------------- cut ----------------------------

-------------------------- cut ----------------------------
static inline unsigned long read_cr3_a(void)
{
unsigned long cr3;
asm("movq %%cr3,%0" : "=r" (cr3));
return cr3;
}

static inline unsigned long read_cr3_b(void)
{
unsigned long cr3;
asm("movq %%cr3,%0" : "=r" (cr3));
return cr3;
}

static inline void write_cr3(unsigned long val)
{
asm volatile("movq %0,%%cr3" :: "r" (val) : "memory");
}

void main()
{
unsigned long c;
c = read_cr3_a();
write_cr3(c | 0x80);
c = read_cr3_b();
write_cr3(c | 0x100);
}
-------------------------- cut ----------------------------

Kirill

-

To: Arjan van de Ven <arjan@...>
Cc: Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 3:21 pm

One of the aspect of volatility is "the result will change in ways you
(gcc) just don't understand."

-hpa
-

To: Arjan van de Ven <arjan@...>
Cc: Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 2:27 pm

It means "don't eliminate this code if it's reachable" which should be
just enough for this case. But it could still be reordered in some cases
that could break, I think.

This should work because the result gets used before reading again:

read_cr3(a);
write_cr3(a | 1);
read_cr3(a);

But this might be reordered so that b gets read before the write:

read_cr3(a);
write_cr3(a | 1);
read_cr3(b);

?
-

To: Chuck Ebbert <cebbert@...>
Cc: Arjan van de Ven <arjan@...>, Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 8:14 am

I don't see how, as write_cr3 clobbers memory.
-

To: Nick Piggin <nickpiggin@...>
Cc: Chuck Ebbert <cebbert@...>, Arjan van de Ven <arjan@...>, Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Wednesday, October 3, 2007 - 2:18 am

Because read_cr3() doesn't depend on memory, and b could be stored in a
register.

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Chuck Ebbert <cebbert@...>, Arjan van de Ven <arjan@...>, Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Andi Kleen <ak@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 11:49 am

How does the compiler know it doesn't depend on memory?

How do you say it depends on memory? You really need something
as heavy as volatile?
-

To: Nick Piggin <nickpiggin@...>
Cc: H. Peter Anvin <hpa@...>, Chuck Ebbert <cebbert@...>, Arjan van de Ven <arjan@...>, Kirill Korotaev <dev@...>, Andrew Morton <akpm@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Wednesday, October 3, 2007 - 4:45 am

When it has no m (or equivalent like g) constrained argument

You could do a memory clobber, but it would be heavier than the volatile
because the memory clobber clobbers all cached variables. volatile essentially
just says "don't remove; has side effects". Normally gcc does that automatically
for something without outputs, but this one has.

Besides a CRx access does not actually clobber memory.

-Andi
-

To: Kirill Korotaev <dev@...>
Cc: Andrew Morton <akpm@...>, Linux Kernel Mailing List <linux-kernel@...>, <devel@...>
Date: Tuesday, October 2, 2007 - 10:17 am

added thanks

-Andi

-

Previous thread: [patch] menuconfig - lift the fs menu by Jan Engelhardt on Tuesday, October 2, 2007 - 9:51 am. (3 messages)

Next thread: [PATCH] Consolidate IPC namespace cleanup functions by Pavel Emelyanov on Tuesday, October 2, 2007 - 10:17 am. (2 messages)