By popular demand, I've redone the patchset to include volatile casts in atomic_set as well. I've also converted the macros to inline functions, to help catch type mismatches at compile time. This will do weird things on ia64 without Andreas Schwab's fix: http://lkml.org/lkml/2007/8/10/410 Notably absent is a patch for powerpc. I expect Segher Boessenkool's assembly implementation should suffice there: http://lkml.org/lkml/2007/8/10/470 Thanks to all who commented on previous incarnations. -- Chris Snook -
From: Chris Snook <csnook@redhat.com>
Document proper use of volatile for atomic_t operations.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/Documentation/atomic_ops.txt 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/Documentation/atomic_ops.txt 2007-08-13 03:36:43.000000000 -0400
@@ -12,13 +12,20 @@
C integer type will fail. Something like the following should
suffice:
- typedef struct { volatile int counter; } atomic_t;
+ typedef struct { int counter; } atomic_t;
+
+ Historically, counter has been declared as a volatile int. This
+is now discouraged in favor of explicitly casting it as volatile where
+volatile behavior is required. Most architectures will only require such
+a cast in atomic_read() and atomic_set(), as well as their 64-bit versions
+if applicable, since the more complex atomic operations directly or
+indirectly use assembly that results in volatile behavior.
The first operations to implement for atomic_t's are the
initializers and plain reads.
#define ATOMIC_INIT(i) { (i) }
- #define atomic_set(v, i) ((v)->counter = (i))
+ #define atomic_set(v, i) (*(volatile int *)&(v)->counter = (i))
The first macro is used in definitions, such as:
@@ -38,7 +45,7 @@
Next, we have:
- #define atomic_read(v) ((v)->counter)
+ #define atomic_read(v) (*(volatile int *)&(v)->counter)
which simply reads the current value of the counter.
-
Looks good, as did a once-over on the arch-specific files. Good stuff!!! -
volatile means that there is some vague notion of "read it now". But that really does not exist. Instead we control visibility via barriers (smp_wmb, smp_rmb). Would it not be best to not have volatile at all in atomic operations and let the barriers do the work? -
From my reply in the other thread... But barriers force a flush of *everything* in scope, which we generally don't want. On the other hand, we pretty much always want to flush atomic_* operations. One way or another, we should be restricting the volatile behavior to the thing that needs it. On most architectures, this patch set just moves that from the declaration, where it is considered harmful, to the use, where it is considered an occasional necessary evil. If you really, *really* distrust the compiler that much, you shouldn't be using barrier, since that uses volatile under the hood too. You should just go ahead and implement the atomic operations in assembler, like Segher Boessenkool did for powerpc in response to my previous patchset. -- Chris -
Maybe we need two read functions? One volatile, one not? The atomic_read()s that I have in slub really do not care about when the variables are read. And if volatile creates overhead then I rather not have it. -
If we're going to do this, and I don't think we need to, I'd prefer that atomic_read() be volatile, and something like atomic_read_opt() be non-volatile, A single volatile access is no more expensive than a non-volatile access. It's when you have dependencies that you start to see overhead. If you're doing a bunch of atomic operations on the same atomic_t in quick succession, then you will see some overhead. Of course, if you're doing that, I think you have a design problem. On modern, register-rich CPUs with cache latencies of a couple clock cycles, volatile generally isn't as much of a performance hit as it used to be. I think that going out of your way to avoid it would be premature optimization on modern hardware. -- Chris -
The overhead due to volatile access is -way- small. Not like barrier(), which can flush out a fair fraction of the machine registers. Thanx, Paul -
> But barriers force a flush of *everything* in scope, Puh-lease. I DO NOT DISTRUST THE COMPILER, I just don't assume it will do whatever I would like it to do without telling it. It's a machine you know, and it is very well documented. (And most barriers don't (need to) use volatile). Implementing the atomic ops in asm loses exactly *no* semantics, and it doesn't add restrictions either; it does allow you however to access an atomic_t with normal loads/stores independently, if you so choose. The only valid issue brought up so far is Russell King's comment that GCC cannot schedule the machine instructions in and around an asm() perfectly, it doesn't look inside the asm() after all; but the situation isn't quite as sever as he suggests (GCC _does_ know you are performing loads/stores, etc.); more about that later, when I've finished researching the current state of that stuff. Segher -
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on alpha.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-alpha/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-alpha/atomic.h 2007-08-13 05:00:36.000000000 -0400
@@ -14,21 +14,35 @@
/*
- * Counter is volatile to make sure gcc doesn't try to be clever
- * and move things around on us. We need to use _exactly_ the address
- * the user gave us, not some alias that contains the same information.
+ * Make sure gcc doesn't try to be clever and move things around
+ * on us. We need to use _exactly_ the address the user gave us,
+ * not some alias that contains the same information.
*/
-typedef struct { volatile int counter; } atomic_t;
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { int counter; } atomic_t;
+typedef struct { long counter; } atomic64_t;
#define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
#define ATOMIC64_INIT(i) ( (atomic64_t) { (i) } )
-#define atomic_read(v) ((v)->counter + 0)
-#define atomic64_read(v) ((v)->counter + 0)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter + 0;
+}
+
+static __inline__ long atomic64_read(atomic64_t *v)
+{
+ return *(volatile long *)&v->counter + 0;
+}
-#define atomic_set(v,i) ((v)->counter = (i))
-#define atomic64_set(v,i) ((v)->counter = (i))
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
+
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+ *(volatile long *)&v->counter = i;
+}
/*
* To get proper branch prediction for the main line, we must branch
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on arm.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-arm/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-arm/atomic.h 2007-08-13 04:44:50.000000000 -0400
@@ -14,13 +14,16 @@
#include <linux/compiler.h>
#include <asm/system.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
#ifdef __KERNEL__
-#define atomic_read(v) ((v)->counter)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
#if __LINUX_ARM_ARCH__ >= 6
-
... and in the rest of include/asm-arm/atomic.h: #else /* ARM_ARCH_6 */ #include <asm/system.h> #ifdef CONFIG_SMP #error SMP not supported on pre-ARMv6 CPUs #endif #define atomic_set(v,i) (((v)->counter) = (i)) Seems you missed that. Grep is a wonderful tool. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
D'oh! Try this.
-- Chris
--- linux-2.6.23-rc3-orig/include/asm-arm/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-arm/atomic.h 2007-08-13 08:39:52.000000000 -0400
@@ -14,13 +14,16 @@
#include <linux/compiler.h>
#include <asm/system.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
#ifdef __KERNEL__
-#define atomic_read(v) ((v)->counter)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
#if __LINUX_ARM_ARCH__ >= 6
@@ -122,7 +125,10 @@ static inline void atomic_clear_mask(uns
#error SMP not supported on pre-ARMv6 CPUs
#endif
-#define atomic_set(v,i) (((v)->counter) = (i))
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static inline int atomic_add_return(int i, atomic_t *v)
{
-
Thanks. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on avr32.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-avr32/atomic.h 2007-08-13 03:14:13.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-avr32/atomic.h 2007-08-13 04:48:25.000000000 -0400
@@ -16,11 +16,18 @@
#include <asm/system.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = i)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/*
* atomic_sub_return - subtract the atomic variable
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on blackfin.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-blackfin/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-blackfin/atomic.h 2007-08-13 05:21:07.000000000 -0400
@@ -18,8 +18,15 @@ typedef struct {
} atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = i)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static __inline__ void atomic_add(int i, atomic_t * v)
{
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on cris.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-cris/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-cris/atomic.h 2007-08-13 05:23:37.000000000 -0400
@@ -11,12 +11,19 @@
* resource counting etc..
*/
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i) (((v)->counter) = (i))
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/* These should be written in asm but we do it in C for now. */
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on frv.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-frv/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-frv/atomic.h 2007-08-13 05:27:08.000000000 -0400
@@ -40,8 +40,16 @@ typedef struct {
} atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = (i))
+
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
#ifndef CONFIG_FRV_OUTOFLINE_ATOMIC_OPS
static inline int atomic_add_return(int i, atomic_t *v)
-
Acked-By: David Howells <dhowells@redhat.com> -
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on h8300.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-h8300/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-h8300/atomic.h 2007-08-13 05:29:05.000000000 -0400
@@ -9,8 +9,15 @@
typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = i)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
#include <asm/system.h>
#include <linux/kernel.h>
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on i386.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-i386/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-i386/atomic.h 2007-08-13 05:31:45.000000000 -0400
@@ -25,7 +25,10 @@ typedef struct { int counter; } atomic_t
*
* Atomically reads the value of @v.
*/
-#define atomic_read(v) ((v)->counter)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
/**
* atomic_set - set atomic variable
@@ -34,7 +37,10 @@ typedef struct { int counter; } atomic_t
*
* Atomically sets the value of @v to @i.
*/
-#define atomic_set(v,i) (((v)->counter) = (i))
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/**
* atomic_add - add integer to atomic variable
-
From: Chris Snook <csnook@redhat.com> Use volatile consistently in atomic.h on ia64. This will do weird things without Andreas Schwab's fix: http://lkml.org/lkml/2007/8/10/410 Signed-off-by: Chris Snook <csnook@redhat.com> --- linux-2.6.23-rc3-orig/include/asm-ia64/atomic.h 2007-07-08 19:32:17.000000000 -0400 +++ linux-2.6.23-rc3/include/asm-ia64/atomic.h 2007-08-13 05:38:27.000000000 -0400 @@ -19,19 +19,34 @@ /* * On IA-64, counter must always be volatile to ensure that that the - * memory accesses are ordered. + * memory accesses are ordered. This must be enforced each time that + * counter is read or written. */ -typedef struct { volatile __s32 counter; } atomic_t; -typedef struct { volatile __s64 counter; } atomic64_t; +typedef struct { __s32 counter; } atomic_t; +typedef struct { __s64 counter; } atomic64_t; #define ATOMIC_INIT(i) ((atomic_t) { (i) }) #define ATOMIC64_INIT(i) ((atomic64_t) { (i) }) -#define atomic_read(v) ((v)->counter) -#define atomic64_read(v) ((v)->counter) +static inline __s32 atomic_read(atomic_t *v) +{ + return *(volatile __s32 *)&v->counter; +} + +static inline void atomic_set(atomic_t *v, __s32 i) +{ + *(volatile __s32 *)&v->counter = i; +} -#define atomic_set(v,i) (((v)->counter) = (i)) -#define atomic64_set(v,i) (((v)->counter) = (i)) +static inline __s64 atomic64_read(atomic64_t *v) +{ + return *(volatile __s64 *)&v->counter; +} + +static inline void atomic64_set(atomic64_t *v, __s64 i) +{ + *(volatile __s64 *)&v->counter = i; +} static __inline__ int ia64_atomic_add (int i, atomic_t *v) -
The build is very noisy with the inline versions of atomic_{read,set}
and their 64-bit siblings. Here are the prime culprits (some of them
repeat >100 times).
include/linux/skbuff.h:521: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
include/net/sock.h:1244: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
include/net/tcp.h:958: warning: passing arg 1 of `atomic_read' discards qualifiers from pointer target type
mm/slub.c:3115: warning: passing arg 1 of `atomic_read' from incompatible pointer type
mm/slub.c:3250: warning: passing arg 1 of `atomic_read' from incompatible pointer type
mm/slub.c:3286: warning: passing arg 1 of `atomic_read' from incompatible pointer type
The inline versions also result in some structural changes in
the object file that make it difficult to compare with the
original. Text size is 96 bytes smaller ... but even after
I use sed(1) to exclude the most obvious instructions that
differ, I still find big blocks of code with changes. Perhaps
even more surprising there are entire functions that are
optimized out in either the 'before' or 'after' binary.
E.g. lookup_pi_state() was optimized away (or completely
inlined?) before this patch, but the function appears as
standalone in the 'after' version. The reverse is true for
fixup_pi_state_owner().
The binary does boot ... but I haven't run any tests to see whether
there are any problems.
-Tony
-
Part of the motivation for using inline functions was to expose places where we've been lazy, so this isn't unexpected. We need to work on clearing up those IIRC, when you applied a version which used macros instead, there was no change. It would seem that inlining changed the optimization behavior of the compiler. If you turn down the optimization level, do the macro and inline versions look The only part of the patch that I was really worried about breaking anything was the removal of the volatile declaration, in case there was some other access that needed a cast. Since the macro version didn't change anything, that's covered. Converting from a macro to an inline shouldn't really change anything in this case, except perhaps for how the compiler optimizes it. If something *does* break, I'd suspect compiler bugs. -- Chris -
I re-tried the macros ... the three warnings from mm/slub.c all result in broken code ... and quite rightly too, they all come from code that does: atomic_read(&n->nr_slabs) But the nr_slabs field is an atomic_long_t, so we shouldn't be using atomic_read(). I didn't spot these last time around because I was using slab, not slub for the previous build. I think that I'll run into other build issues if I turn down the optimization level (there are lots of places where the kernel relies -Tony -
Hmmmm... Strange that this did not cause failures before on any other
platforms?
Fix atomic_read's in slub
Signed-off-by: Christoph Lameter <clameter@sgi.com>
diff --git a/mm/slub.c b/mm/slub.c
index 69d02e3..0c106d7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3112,7 +3112,7 @@ static int list_locations(struct kmem_cache *s, char *buf,
unsigned long flags;
struct page *page;
- if (!atomic_read(&n->nr_slabs))
+ if (!atomic_long_read(&n->nr_slabs))
continue;
spin_lock_irqsave(&n->list_lock, flags);
@@ -3247,7 +3247,7 @@ static unsigned long slab_objects(struct kmem_cache *s,
}
if (flags & SO_FULL) {
- int full_slabs = atomic_read(&n->nr_slabs)
+ int full_slabs = atomic_long_read(&n->nr_slabs)
- per_cpu[node]
- n->nr_partial;
@@ -3283,7 +3283,7 @@ static int any_slab_objects(struct kmem_cache *s)
for_each_node(node) {
struct kmem_cache_node *n = get_node(s, node);
- if (n->nr_partial || atomic_read(&n->nr_slabs))
+ if (n->nr_partial || atomic_long_read(&n->nr_slabs))
return 1;
}
return 0;
-
Prior to the patch in question, atomic_read was a macro. I didn't use slub in my cursory testing. Tony had ia64 under a microscope because of the tricky memory access ordering semantics of that architecture. -- Chris -
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on m32r.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-m32r/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-m32r/atomic.h 2007-08-13 05:42:09.000000000 -0400
@@ -22,7 +22,7 @@
* on us. We need to use _exactly_ the address the user gave us,
* not some alias that contains the same information.
*/
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
@@ -32,7 +32,10 @@ typedef struct { volatile int counter; }
*
* Atomically reads the value of @v.
*/
-#define atomic_read(v) ((v)->counter)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
/**
* atomic_set - set atomic variable
@@ -41,7 +44,10 @@ typedef struct { volatile int counter; }
*
* Atomically sets the value of @v to @i.
*/
-#define atomic_set(v,i) (((v)->counter) = (i))
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/**
* atomic_add_return - add integer to atomic variable and return it
-
From: Chris Snook <csnook@redhat.com> Thanks, Acked-by: Hirokazu Takata <takata@linux-m32r.org> -
Hi, Chris,
From: Hirokazu Takata <takata@linux-m32r.org>
Hmmm.. It seems my reply was overhasty.
Applying the above patch, I have many warning messages like this:
<-- snip -->
...
CC kernel/sched.o
In file included from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/netlink.h:139,
from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/genetlink.h:4,
from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/net/genetlink.h:4,
from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/taskstats_kern.h:12,
from /project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/delayacct.h:21,
from /project/m32r-linux/kernel/work/linux-2.6_dev.git/kernel/sched.c:61:
/project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/skbuff.h: In function 'skb_shared':
/project/m32r-linux/kernel/work/linux-2.6_dev.git/include/linux/skbuff.h:521: warning: passing argument 1 of 'atomic_read' discards qualifiers from pointer target type
...
<-- snip -->
In this case, it is because stb_shared() is defined with a parameter with
"const" qualifier, in include/linux/skbuff.h.
static inline int skb_shared(const struct sk_buff *skb)
{
return atomic_read(&skb->users) != 1;
}
I think the parameter of atomic_read() should have "const"
qualifier to avoid these warnings, and IMHO this modification might be
worth applying on other archs.
Here is an additional patch to revise the previous one for m32r.
I also tried to rewrite it with inline asm code, but the kernel text size
bacame roughly 2kB larger. So, I prefer C version.
Thanks,
-- Takata
[PATCH] m32r: Add "const" qualifier to the parameter of atomic_read()
Update atomic_read() to avoid the following warning of gcc-4.1.x:
warning: passing argument 1 of 'atomic_read' discards qualifiers
from pointer target type
Signed-off-by: Hirokazu Takata ...I'll incorporate this change if we get enough consensus to justify a re-re-re-submit. Since the patch is intended to be a functional no-op on m32r, You're not the only arch maintainer who prefers doing it in C. If you trust your compiler (a big "if", apparently), inline asm only improves code generation if you have a whole bunch of general purpose registers for the optimizer to play with. -- Chris -
Could you send me the source code diff between the two versions you tested? 2kB difference is way too much, the asm version should No. The only real difference between the *(volatile *)& version and the volatile asm() version is that the volatile asm() version has defined semantics. There will be some code generation differences too, but they should be in the noise, unless GCC generates really bad code for either case. We know it sometimes does that for the *(volatile *)& thing; if the asm() version does something bad, I'd like to know about that too. Segher -
Segher, can you please drop out of this discussion? Your points are all wrong. The inline asm version has the EXACT SAME PROBLEM as the "volatile" version has: it means that the compiler is unable to combine trivial instructions. There's no way that is acceptable. So why the *hell* you'd expect the asm version to be smaller, I can't imagine. It's obvkously not going to be true. So here's a clue to everybody in this thread: THE CURRENT x86 BEHAVIOUR IS THE CORRECT ONE where we don't have any "volatile" and no inline asm and no barriers, and we let the compiler do the right thing. If the code needs barriers, the code should damn well add them. It's worked for the past 8 months, on the most common platform there is. Stop this *idiotic* thread already. Linus -
This simply isn't true. The compiler *can* combine asm stuff:
typedef struct { int counter; } atomic_t;
static inline __attribute__((pure)) int atomic_read(const atomic_t *v)
{
int x;
asm("ld %0,@%1" : "=r"(x) : "r"(&v->counter), "m"(v->counter));
return x;
}
int f(atomic_t *x)
{
return atomic_read(x) + atomic_read(x);
}
int g(atomic_t *x)
{
return 0 * atomic_read(x);
}
generates
f:
ld r0,@r0
slli r0,#1
jmp lr
g:
ldi r0,#0
I expect it to be smaller than the current code, which uses the
"big hammer" volatile version. We're talking about m32r here,
not x86. Even when using "volatile asm" it shouldn't generate
much bigger code.
Anyhow, I answered my own question: on m32r, you cannot use "m"
with ld/st insns, since autoincrement modes don't work there (the
assembler complains, at least). So you have to force the address
Sure. I'm not suggesting otherwise.
Segher
-
That's not precisely combining asm stuff. The compiler is ditching a whole function because you've told it it can cache the result. David -
Please Segher, just shut up. The combining, which I've mentioned *multiple*times* is if (atomic_read(&x) <= 1) and dammit, if that doesn't result in a *single* instruction, the code generation is pure and utter crap. It should result in cmpl $1,offset(reg) and nothing else. And there is no way in hell you are doing that with "atomic_read()" being inline asm. So can you now just go away? Linus -
Sorry, Linus, I don't agree. The whole point of 'volatile' is to say to the compiler, "DO NOT OPTIMIZE THIS. What you think is harmless may break things you do not and should not understand." The combination of the read and the compare into a single operation is a compiler optimization. While it would not be unreasonable to still allow this optimization (since I cannot think of any situations in which it could be harmful), it is just as reasonable not to. DS -
Looks like peephole optimization at work.
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on m68knommu.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-m68knommu/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-m68knommu/atomic.h 2007-08-13 05:47:46.000000000 -0400
@@ -15,8 +15,15 @@
typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = i)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static __inline__ void atomic_add(int i, atomic_t *v)
{
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on m68k.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-m68k/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-m68k/atomic.h 2007-08-13 05:45:43.000000000 -0400
@@ -16,8 +16,15 @@
typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v, i) (((v)->counter) = i)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static inline void atomic_add(int i, atomic_t *v)
{
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on mips.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-mips/atomic.h 2007-08-13 03:14:13.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-mips/atomic.h 2007-08-13 05:52:14.000000000 -0400
@@ -20,7 +20,7 @@
#include <asm/war.h>
#include <asm/system.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) { (i) }
@@ -30,7 +30,10 @@ typedef struct { volatile int counter; }
*
* Atomically reads the value of @v.
*/
-#define atomic_read(v) ((v)->counter)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
/*
* atomic_set - set atomic variable
@@ -39,7 +42,10 @@ typedef struct { volatile int counter; }
*
* Atomically sets the value of @v to @i.
*/
-#define atomic_set(v,i) ((v)->counter = (i))
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/*
* atomic_add - add integer to atomic variable
@@ -404,7 +410,7 @@ static __inline__ int atomic_add_unless(
#ifdef CONFIG_64BIT
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
#define ATOMIC64_INIT(i) { (i) }
@@ -413,14 +419,20 @@ typedef struct { volatile long counter;
* @v: pointer of type atomic64_t
*
*/
-#define atomic64_read(v) ((v)->counter)
+static __inline__ long atomic64_read(atomic64_t *v)
+{
+ return *(volatile long *)&v->counter;
+}
/*
* atomic64_set - set atomic variable
* @v: pointer of type atomic64_t
* @i: required value
*/
-#define atomic64_set(v,i) ((v)->counter = (i))
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+ *(volatile long *)&v->counter = i;
+}
/*
* atomic64_add - add integer to atomic variable
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on parisc.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-parisc/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-parisc/atomic.h 2007-08-13 05:59:35.000000000 -0400
@@ -128,7 +128,7 @@ __cmpxchg(volatile void *ptr, unsigned l
* Cache-line alignment would conflict with, for example, linux/module.h
*/
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
/* It's possible to reduce all atomic operations to either
* __atomic_add_return, atomic_set and atomic_read (the latter
@@ -159,7 +159,7 @@ static __inline__ void atomic_set(atomic
static __inline__ int atomic_read(const atomic_t *v)
{
- return v->counter;
+ return *(volatile int *)&v->counter;
}
/* exported interface */
@@ -227,7 +227,7 @@ static __inline__ int atomic_add_unless(
#ifdef CONFIG_64BIT
-typedef struct { volatile s64 counter; } atomic64_t;
+typedef struct { s64 counter; } atomic64_t;
#define ATOMIC64_INIT(i) ((atomic64_t) { (i) })
@@ -258,7 +258,7 @@ atomic64_set(atomic64_t *v, s64 i)
static __inline__ s64
atomic64_read(const atomic64_t *v)
{
- return v->counter;
+ return *(volatile s64 *)&v->counter;
}
#define atomic64_add(i,v) ((void)(__atomic64_add_return( ((s64)i),(v))))
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on s390.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-s390/atomic.h 2007-08-13 03:14:13.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-s390/atomic.h 2007-08-13 06:04:58.000000000 -0400
@@ -67,8 +67,15 @@ typedef struct {
#endif /* __GNUC__ */
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i) (((v)->counter) = (i))
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static __inline__ int atomic_add_return(int i, atomic_t * v)
{
@@ -182,8 +189,15 @@ typedef struct {
#endif /* __GNUC__ */
-#define atomic64_read(v) ((v)->counter)
-#define atomic64_set(v,i) (((v)->counter) = (i))
+static __inline__ long long atomic64_read(atomic64_t *v)
+{
+ return *(volatile long long *)&v->counter;
+}
+
+static __inline__ void atomic64_set(atomic64_t *v, long long i)
+{
+ *(volatile long long *)&v->counter = i;
+}
static __inline__ long long atomic64_add_return(long long i, atomic64_t * v)
{
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on sh64.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-sh64/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-sh64/atomic.h 2007-08-13 06:08:37.000000000 -0400
@@ -19,12 +19,19 @@
*
*/
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i) ((v)->counter = (i))
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
#include <asm/system.h>
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on sh.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-sh/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-sh/atomic.h 2007-08-13 06:07:16.000000000 -0400
@@ -7,12 +7,19 @@
*
*/
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i) ( (atomic_t) { (i) } )
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i) ((v)->counter = (i))
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
#include <linux/compiler.h>
#include <asm/system.h>
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on sparc64.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-sparc64/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-sparc64/atomic.h 2007-08-13 06:17:01.000000000 -0400
@@ -11,17 +11,31 @@
#include <linux/types.h>
#include <asm/system.h>
-typedef struct { volatile int counter; } atomic_t;
-typedef struct { volatile __s64 counter; } atomic64_t;
+typedef struct { int counter; } atomic_t;
+typedef struct { __s64 counter; } atomic64_t;
#define ATOMIC_INIT(i) { (i) }
#define ATOMIC64_INIT(i) { (i) }
-#define atomic_read(v) ((v)->counter)
-#define atomic64_read(v) ((v)->counter)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static __inline__ __s64 atomic64_read(atomic64_t *v)
+{
+ return *(volatile __s64 *)&v->counter;
+}
-#define atomic_set(v, i) (((v)->counter) = i)
-#define atomic64_set(v, i) (((v)->counter) = i)
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
+
+static __inline__ void atomic64_set(atomic64_t *v, __s64 i)
+{
+ *(volatile __s64 *)&v->counter = i;
+}
extern void atomic_add(int, atomic_t *);
extern void atomic64_add(int, atomic64_t *);
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on alpha.
Leave sparc-internal atomic24_t type alone.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-sparc/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-sparc/atomic.h 2007-08-13 06:12:49.000000000 -0400
@@ -13,7 +13,7 @@
#include <linux/types.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#ifdef __KERNEL__
@@ -61,7 +61,10 @@ extern int atomic_cmpxchg(atomic_t *, in
extern int atomic_add_unless(atomic_t *, int, int);
extern void atomic_set(atomic_t *, int);
-#define atomic_read(v) ((v)->counter)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
#define atomic_add(i, v) ((void)__atomic_add_return( (int)(i), (v)))
#define atomic_sub(i, v) ((void)__atomic_add_return(-(int)(i), (v)))
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on v850.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-v850/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-v850/atomic.h 2007-08-13 06:19:32.000000000 -0400
@@ -27,8 +27,15 @@ typedef struct { int counter; } atomic_t
#ifdef __KERNEL__
-#define atomic_read(v) ((v)->counter)
-#define atomic_set(v,i) (((v)->counter) = (i))
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
static inline int atomic_add_return (int i, volatile atomic_t *v)
{
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on x86_64.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-x86_64/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-x86_64/atomic.h 2007-08-13 06:22:43.000000000 -0400
@@ -32,7 +32,10 @@ typedef struct { int counter; } atomic_t
*
* Atomically reads the value of @v.
*/
-#define atomic_read(v) ((v)->counter)
+static __inline__ int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
/**
* atomic_set - set atomic variable
@@ -41,7 +44,10 @@ typedef struct { int counter; } atomic_t
*
* Atomically sets the value of @v to @i.
*/
-#define atomic_set(v,i) (((v)->counter) = (i))
+static __inline__ void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/**
* atomic_add - add integer to atomic variable
@@ -206,7 +212,7 @@ static __inline__ int atomic_sub_return(
/* An 64bit atomic type */
-typedef struct { volatile long counter; } atomic64_t;
+typedef struct { long counter; } atomic64_t;
#define ATOMIC64_INIT(i) { (i) }
@@ -217,7 +223,10 @@ typedef struct { volatile long counter;
* Atomically reads the value of @v.
* Doesn't imply a read memory barrier.
*/
-#define atomic64_read(v) ((v)->counter)
+static __inline__ long atomic64_read(atomic64_t *v)
+{
+ return *(volatile long *)&v->counter;
+}
/**
* atomic64_set - set atomic64 variable
@@ -226,7 +235,10 @@ typedef struct { volatile long counter;
*
* Atomically sets the value of @v to @i.
*/
-#define atomic64_set(v,i) (((v)->counter) = (i))
+static __inline__ void atomic64_set(atomic64_t *v, long i)
+{
+ *(volatile long *)&v->counter = i;
+}
/**
* atomic64_add - add integer to atomic64 variable
-
From: Chris Snook <csnook@redhat.com>
Use volatile consistently in atomic.h on xtensa.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- linux-2.6.23-rc3-orig/include/asm-xtensa/atomic.h 2007-07-08 19:32:17.000000000 -0400
+++ linux-2.6.23-rc3/include/asm-xtensa/atomic.h 2007-08-13 06:31:58.000000000 -0400
@@ -15,7 +15,7 @@
#include <linux/stringify.h>
-typedef struct { volatile int counter; } atomic_t;
+typedef struct { int counter; } atomic_t;
#ifdef __KERNEL__
#include <asm/processor.h>
@@ -47,7 +47,10 @@ typedef struct { volatile int counter; }
*
* Atomically reads the value of @v.
*/
-#define atomic_read(v) ((v)->counter)
+static inline int atomic_read(atomic_t *v)
+{
+ return *(volatile int *)&v->counter;
+}
/**
* atomic_set - set atomic variable
@@ -56,7 +59,10 @@ typedef struct { volatile int counter; }
*
* Atomically sets the value of @v to @i.
*/
-#define atomic_set(v,i) ((v)->counter = (i))
+static inline void atomic_set(atomic_t *v, int i)
+{
+ *(volatile int *)&v->counter = i;
+}
/**
* atomic_add - add integer to atomic variable
-
