Among other things, this fixes the rwsem signedness issue we
were discussing earlier today.
Add entries for the new 2.6.36 system calls.
Make console=tty* on the kernel command line work properly
for serial consoles.
Please pull, thanks a lot!
The following changes since commit da5cabf80e2433131bf0ed8993abc0f7ea618c73:
Linux 2.6.36-rc1 (2010-08-15 17:41:37 -0700)
are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6.git master
David S. Miller (5):
sparc: Really fix "console=" for serial consoles.
Merge branch 'master' of git://git.kernel.org/.../torvalds/linux-2.6
sparc: Hook up new fanotify and prlimit64 syscalls.
sparc64: Fix rwsem constant bug leading to hangs.
sparc64: Fix atomic64_t routine return values.
arch/sparc/include/asm/atomic_64.h | 6 +++---
arch/sparc/include/asm/fb.h | 4 ++++
arch/sparc/include/asm/rwsem-const.h | 2 +-
arch/sparc/include/asm/unistd.h | 5 ++++-
arch/sparc/kernel/sys32.S | 9 +++++++++
arch/sparc/kernel/systbls_32.S | 3 ++-
arch/sparc/kernel/systbls_64.S | 6 ++++--
drivers/serial/suncore.c | 15 +++++++++------
8 files changed, 36 insertions(+), 14 deletions(-)
--
Your commit message misstates the C rules for hex constants. It says
"hex constants are unsigned unless explicitly casted or negated."
and that's not true.
The rule is that hex constants are signed _except_ if they don't fit
in a signed int.
So with a 32-bit 'int', 0x123 is signed, but 0x80000000 is unsigned.
So the reason (-0x00010000) is signed is _not_ because of the
negation, but simply because 0x00010000 fits in a signed int. So for
example, the constant (-0xf0000000) is still unsigned, despite the
negation.
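To see the rule in action, here is a minimal sketch that asks the compiler which type it picked for each of the constants above. It assumes a C11 compiler (for _Generic) and a 32-bit 'int'; TYPE_NAME and print_hex_constant_types() are names made up for this illustration, not anything from the kernel.

```c
#include <stdio.h>

/* TYPE_NAME is a hypothetical helper: it maps an expression to the
 * name of its type using C11 _Generic. */
#define TYPE_NAME(x) _Generic((x),                \
    int:                "int",                    \
    unsigned int:       "unsigned int",           \
    long:               "long",                   \
    unsigned long:      "unsigned long",          \
    long long:          "long long",              \
    unsigned long long: "unsigned long long")

/* Print the type chosen for each constant from the discussion.
 * The results described in the text assume a 32-bit 'int'. */
static void print_hex_constant_types(void)
{
    printf("0x123       is %s\n", TYPE_NAME(0x123));
    printf("0x80000000  is %s\n", TYPE_NAME(0x80000000));
    printf("-0x00010000 is %s\n", TYPE_NAME(-0x00010000));
    printf("-0xf0000000 is %s\n", TYPE_NAME(-0xf0000000));
}
```

With a 32-bit 'int' the four lines should report int, unsigned int, int and unsigned int, in that order: the negation changes the value but never the type.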
So to make something signed, you need to either cast it, make sure it
fits in a signed int, use the 'l' postfix (which also makes it long,
of course), or use a decimal representation. So
#define X 4294901760
is a _signed_ constant with the same value as 0xffff0000 (but it's "signed
long", because the rules for decimal numbers and hex numbers are
different: a decimal number is always signed, and because it doesn't
fit in 'int' it will extend to 'long'. A hex number is first tried as
unsigned, and only extended to long if it doesn't fit in that).
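A small sketch of the decimal-versus-hex difference, assuming a C99 or later compiler with a 32-bit 'int'. IS_SIGNED_EXPR, X_DEC, X_HEX and the helper functions are hypothetical names invented here; the macro only gives a meaningful answer for expressions whose type is at least as wide as 'int'.

```c
#include <stddef.h>

/* Hypothetical helper: true if the type of x is signed.
 * x * 0 - 1 is -1 in a signed type and a huge positive value in an
 * unsigned one. Only meaningful for types of rank 'int' or above. */
#define IS_SIGNED_EXPR(x) (((x) * 0 - 1) < 0)

#define X_DEC 4294901760    /* decimal: must stay signed, so it widens */
#define X_HEX 0xffff0000    /* hex: fits in 'unsigned int', stays 32-bit */

static int    dec_is_signed(void) { return IS_SIGNED_EXPR(X_DEC); }
static int    hex_is_signed(void) { return IS_SIGNED_EXPR(X_HEX); }
static size_t dec_size(void)      { return sizeof(X_DEC); }
static size_t hex_size(void)      { return sizeof(X_HEX); }

/* Same numeric value, different types. */
static int    same_value(void)    { return X_DEC == X_HEX; }
```

Under those assumptions the decimal spelling is a signed 64-bit constant and the hex spelling is an unsigned 32-bit one, even though the two compare equal.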
To make things _really_ confused, sometimes the types actually depend
on whether you're compiling with the c90 standards. A decimal constant
is _always_ signed in traditional C - it goes from 'int' to 'long',
and stays 'long' even if it doesn't fit (ie with a 32-bit long,
2147483648 is of type 'long' even though it doesn't fit in 'long' and
is negative). But in c90, it goes from 'int' to 'long' to 'unsigned
long'.
Or maybe it was the other way around. I forget.
Confused yet?
The basic rule becomes: never _ever_ overflow 'int' in a constant,
without specifying the exact type you want. That way you avoid all the
subtle cases.
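The kind of bug this rule avoids can be sketched as follows, assuming a 32-bit 'int'. The DEMO_* macros and widen_*() functions are hypothetical and the values are only illustrative; this is not the actual sparc64 header.

```c
#include <stdint.h>

/* Two hypothetical spellings, both meant to be "minus 0x10000". */
#define DEMO_BIAS_OVERFLOWING 0xffff0000      /* does not fit in int: unsigned int */
#define DEMO_BIAS_EXPLICIT    (-0x00010000L)  /* 'L' suffix: signed long */

/* Widening to 64 bits zero-extends the unsigned constant ... */
static int64_t widen_overflowing(void)
{
    return DEMO_BIAS_OVERFLOWING;
}

/* ... and sign-extends the signed one. */
static int64_t widen_explicit(void)
{
    return DEMO_BIAS_EXPLICIT;
}
```

The first function yields a large positive number (4294901760) and the second yields -65536, which is the difference that matters once the constant is added to a 64-bit counter.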
Linus
--
Oh,
I noticed another thing:
In commit 86fa04b8742ac681d470786f55e2403ada0075b2 you fix the return
type, but you still have the wrong _argument_ type:
extern void atomic64_add(int, atomic64_t *);
extern void atomic64_sub(int, atomic64_t *);
extern long atomic64_add_ret(int, atomic64_t *);
extern long atomic64_sub_ret(int, atomic64_t *);
note how if somebody does
atomic64_add(0x100000000ull, &x)
sparc64 will get it wrong, because it will only take the low 32 bits
of the first argument, and add zero to the 64-bit counter.
Which is definitely not what the code intended, I think.
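A userspace sketch of the truncation, using a plain (non-atomic) stand-in for the counter. All demo_* names are hypothetical. Converting an out-of-range 64-bit value to 'int' is implementation-defined in C; the result described here assumes the usual gcc/clang behaviour of keeping the low 32 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for atomic64_t; not actually atomic. */
typedef struct { int64_t counter; } demo_atomic64_t;

/* Prototype with the wrong argument type: 'int' drops the high bits. */
static void demo_add_int_arg(int i, demo_atomic64_t *v)
{
    v->counter += i;
}

/* Prototype with a 64-bit argument type. */
static void demo_add_long_arg(int64_t i, demo_atomic64_t *v)
{
    v->counter += i;
}

/* Add delta to a zeroed counter through each prototype. */
static int64_t add_via_int_prototype(uint64_t delta)
{
    demo_atomic64_t v = { 0 };
    demo_add_int_arg(delta, &v);   /* implicit narrowing to int */
    return v.counter;
}

static int64_t add_via_long_prototype(uint64_t delta)
{
    demo_atomic64_t v = { 0 };
    demo_add_long_arg((int64_t)delta, &v);
    return v.counter;
}
```

Calling add_via_int_prototype(0x100000000ull) leaves the counter at zero under that assumption, while add_via_long_prototype() adds the full value.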
I merged your pull request, but you've got some fixing up to do,
methinks. I also really think you need to make your rwsem's use 64-bit
values on sparc64, because otherwise you can overflow the mmap_sem by
having more than 65536 threads doing page-faults (on 32-bit, having
more than 2**16 threads in one process is unlikely to work for other
reasons, like just pure stack usage, so we don't really care about the
32-bit case)
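The overflow can be sketched with plain integers, leaving out all locking and the waiter logic. The demo_* names are hypothetical; the masks mirror the 16-bit active field of the 32-bit layout and the 32-bit active field of a 64-bit layout.

```c
#include <stdint.h>

#define DEMO_ACTIVE_BIAS 1

/* Active-holder field of a 32-bit count with a 16-bit mask,
 * after 'readers' read-locks have each added the active bias. */
static int32_t demo_active_16bit(uint32_t readers)
{
    int32_t count = (int32_t)(readers * DEMO_ACTIVE_BIAS);
    return count & 0x0000ffff;
}

/* Same idea with a 64-bit count and a 32-bit mask. */
static int64_t demo_active_32bit(uint32_t readers)
{
    int64_t count = (int64_t)readers * DEMO_ACTIVE_BIAS;
    return count & INT64_C(0xffffffff);
}
```

With 65536 readers the 16-bit field wraps to zero and the carry spills into the bits meant for the waiting bias, so the semaphore no longer looks read-held; the wider field still reports 65536.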
Linus
--
From: Linus Torvalds <torvalds@linux-foundation.org>
I have a patch to do this already, just need to test it.
You should bug the powerpc folks too :-)
--
32K threads :-) you guys are nuts !
Here's an untested patch for the folks on linuxppc-dev to look at, I'll
review my own stuff & test tomorrow.
Cheers,
Ben.
powerpc: Make rwsem use "long" types on 64-bit platforms
This should avoid overflow of the mmap_sem when playing with insane
number of threads.
Not-signed-off-by-yet.
diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 24cd928..ca64a98 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -21,15 +21,20 @@
/*
* the semaphore definition
*/
-struct rw_semaphore {
- /* XXX this should be able to be an atomic_t -- paulus */
- signed int count;
-#define RWSEM_UNLOCKED_VALUE 0x00000000
-#define RWSEM_ACTIVE_BIAS 0x00000001
-#define RWSEM_ACTIVE_MASK 0x0000ffff
-#define RWSEM_WAITING_BIAS (-0x00010000)
+#ifdef CONFIG_PPC64
+# define RWSEM_ACTIVE_MASK 0xffffffffL
+#else
+# define RWSEM_ACTIVE_MASK 0x0000ffffL
+#endif
+
+#define RWSEM_UNLOCKED_VALUE 0x00000000L
+#define RWSEM_ACTIVE_BIAS 0x00000001L
+#define RWSEM_WAITING_BIAS (-RWSEM_ACTIVE_MASK-1)
#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+
+struct rw_semaphore {
+ atomic_long_t count;
spinlock_t wait_lock;
struct list_head wait_list;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -43,9 +48,13 @@ struct rw_semaphore {
# define __RWSEM_DEP_MAP_INIT(lockname)
#endif
-#define __RWSEM_INITIALIZER(name) \
- { RWSEM_UNLOCKED_VALUE, __SPIN_LOCK_UNLOCKED((name).wait_lock), \
- LIST_HEAD_INIT((name).wait_list) __RWSEM_DEP_MAP_INIT(name) }
+#define __RWSEM_INITIALIZER(name) \
+{ \
+ ATOMIC_LONG_INIT(RWSEM_UNLOCKED_VALUE), \
+ __SPIN_LOCK_UNLOCKED((name).wait_lock), \
+ LIST_HEAD_INIT((name).wait_list) \
+ __RWSEM_DEP_MAP_INIT(name) \
+}
#define DECLARE_RWSEM(name) \
struct rw_semaphore name = __RWSEM_INITIALIZER(name)
@@ -70,16 +79,16 @@ extern void ...
All right, gcc's being a pain, and atomics are a struct so we can't
assign that easily. I tried various tricks but so far they didn't work.
I'll have another look tomorrow, but I may end up having to keep all the
crap typecasts.
Cheers,
Ben.
--
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The casts are pretty much unavoidable.
Here's what I'm going to end up using on sparc64:
--------------------
sparc64: Make rwsems 64-bit.
Basically tip-off the powerpc code, use a 64-bit type and atomic64_t
interfaces for the implementation.
This gets us off of the by-hand asm code I wrote, which frankly I
think probably ruins I-cache hit rates.
The idea was to keep the call chains less deep, but anything taking
the rw-semaphores probably is also calling other stuff and therefore
already has allocated a stack-frame. So no real stack frame savings
ever.
Ben H. has posted patches to make powerpc use 64-bit too and with some
abstractions we can probably use a shared header file somewhere.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
arch/sparc/include/asm/rwsem-const.h | 12 ---
arch/sparc/include/asm/rwsem.h | 120 +++++++++++++++++++++----
arch/sparc/lib/Makefile | 2 +-
arch/sparc/lib/rwsem_64.S | 163 ----------------------------------
4 files changed, 104 insertions(+), 193 deletions(-)
delete mode 100644 arch/sparc/include/asm/rwsem-const.h
delete mode 100644 arch/sparc/lib/rwsem_64.S
diff --git a/arch/sparc/include/asm/rwsem-const.h b/arch/sparc/include/asm/rwsem-const.h
deleted file mode 100644
index e4c61a1..0000000
--- a/arch/sparc/include/asm/rwsem-const.h
+++ /dev/null
@@ -1,12 +0,0 @@
-/* rwsem-const.h: RW semaphore counter constants. */
-#ifndef _SPARC64_RWSEM_CONST_H
-#define _SPARC64_RWSEM_CONST_H
-
-#define RWSEM_UNLOCKED_VALUE 0x00000000
-#define RWSEM_ACTIVE_BIAS 0x00000001
-#define RWSEM_ACTIVE_MASK 0x0000ffff
-#define RWSEM_WAITING_BIAS (-0x00010000)
-#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
-
-#endif /* _SPARC64_RWSEM_CONST_H */
diff --git a/arch/sparc/include/asm/rwsem.h b/arch/sparc/include/asm/rwsem.h
index 6e56210..a2b4302
...
Similar here, but using atomic_long_t instead so it works for 32-bit too
for me. I suppose we could make that part common indeed.
What about asm-generic/rwsem-atomic.h or rwsem-cmpxchg.h ?
Below is my current patch, seems to boot fine here so far.
Cheers,
Ben
Subject: [PATCH] powerpc: Make rwsem use "long" type
This makes the 64-bit kernel use 64-bit signed integers for the counter
(effectively supporting 32-bit of active count in the semaphore), thus
avoiding things like overflow of the mmap_sem if you use a really crazy
number of threads
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/include/asm/rwsem.h | 64 ++++++++++++++++++++++----------------
1 files changed, 37 insertions(+), 27 deletions(-)
diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 24cd928..8447d89 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -21,15 +21,20 @@
/*
* the semaphore definition
*/
-struct rw_semaphore {
- /* XXX this should be able to be an atomic_t -- paulus */
- signed int count;
-#define RWSEM_UNLOCKED_VALUE 0x00000000
-#define RWSEM_ACTIVE_BIAS 0x00000001
-#define RWSEM_ACTIVE_MASK 0x0000ffff
-#define RWSEM_WAITING_BIAS (-0x00010000)
+#ifdef CONFIG_PPC64
+# define RWSEM_ACTIVE_MASK 0xffffffffL
+#else
+# define RWSEM_ACTIVE_MASK 0x0000ffffL
+#endif
+
+#define RWSEM_UNLOCKED_VALUE 0x00000000L
+#define RWSEM_ACTIVE_BIAS 0x00000001L
+#define RWSEM_WAITING_BIAS (-RWSEM_ACTIVE_MASK-1)
#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+
+struct rw_semaphore {
+ long count;
spinlock_t wait_lock;
struct list_head wait_list;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -43,9 +48,13 @@ struct rw_semaphore {
# define __RWSEM_DEP_MAP_INIT(lockname)
#endif
-#define __RWSEM_INITIALIZER(name) \
- { RWSEM_UNLOCKED_VALUE, __SPIN_LOCK_UNLOCKED((name).wait_lock), \
- ...
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using rwsem-cmpxchg.h sounds best I guess.
--
Ok, I'll send a new patch tomorrow that does that.
Cheers,
Ben.
--
This makes the 64-bit kernel use 64-bit signed integers for the counter
(effectively supporting 32-bit of active count in the semaphore), thus
avoiding things like overflow of the mmap_sem if you use a really crazy
number of threads
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/include/asm/rwsem.h | 64 ++++++++++++++++++++++----------------
1 files changed, 37 insertions(+), 27 deletions(-)
diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 24cd928..8447d89 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -21,15 +21,20 @@
/*
* the semaphore definition
*/
-struct rw_semaphore {
- /* XXX this should be able to be an atomic_t -- paulus */
- signed int count;
-#define RWSEM_UNLOCKED_VALUE 0x00000000
-#define RWSEM_ACTIVE_BIAS 0x00000001
-#define RWSEM_ACTIVE_MASK 0x0000ffff
-#define RWSEM_WAITING_BIAS (-0x00010000)
+#ifdef CONFIG_PPC64
+# define RWSEM_ACTIVE_MASK 0xffffffffL
+#else
+# define RWSEM_ACTIVE_MASK 0x0000ffffL
+#endif
+
+#define RWSEM_UNLOCKED_VALUE 0x00000000L
+#define RWSEM_ACTIVE_BIAS 0x00000001L
+#define RWSEM_WAITING_BIAS (-RWSEM_ACTIVE_MASK-1)
#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+
+struct rw_semaphore {
+ long count;
spinlock_t wait_lock;
struct list_head wait_list;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -43,9 +48,13 @@ struct rw_semaphore {
# define __RWSEM_DEP_MAP_INIT(lockname)
#endif
-#define __RWSEM_INITIALIZER(name) \
- { RWSEM_UNLOCKED_VALUE, __SPIN_LOCK_UNLOCKED((name).wait_lock), \
- LIST_HEAD_INIT((name).wait_list) __RWSEM_DEP_MAP_INIT(name) }
+#define __RWSEM_INITIALIZER(name) \
+{ \
+ RWSEM_UNLOCKED_VALUE, \
+ __SPIN_LOCK_UNLOCKED((name).wait_lock), \
+ LIST_HEAD_INIT((name).wait_list) \
+ __RWSEM_DEP_MAP_INIT(name) \
+}
#define DECLARE_RWSEM(name) \
struct rw_semaphore ...
On Fri, Aug 20, 2010 at 12:14 AM, Benjamin Herrenschmidt
Does this change make the code do anything different?
--
Timur Tabi
Linux kernel developer at Freescale
--
Other architectures that support cmpxchg and atomic_long can
use that directly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/include/asm/rwsem.h | 184 +----------------------------------
include/asm-generic/rwsem-cmpxchg.h | 183 ++++++++++++++++++++++++++++++++++
2 files changed, 184 insertions(+), 183 deletions(-)
create mode 100644 include/asm-generic/rwsem-cmpxchg.h
diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 8447d89..1237ad6 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -1,183 +1 @@
-#ifndef _ASM_POWERPC_RWSEM_H
-#define _ASM_POWERPC_RWSEM_H
-
-#ifndef _LINUX_RWSEM_H
-#error "Please don't include <asm/rwsem.h> directly, use <linux/rwsem.h> instead."
-#endif
-
-#ifdef __KERNEL__
-
-/*
- * R/W semaphores for PPC using the stuff in lib/rwsem.c.
- * Adapted largely from include/asm-i386/rwsem.h
- * by Paul Mackerras <paulus@samba.org>.
- */
-
-#include <linux/list.h>
-#include <linux/spinlock.h>
-#include <asm/atomic.h>
-#include <asm/system.h>
-
-/*
- * the semaphore definition
- */
-#ifdef CONFIG_PPC64
-# define RWSEM_ACTIVE_MASK 0xffffffffL
-#else
-# define RWSEM_ACTIVE_MASK 0x0000ffffL
-#endif
-
-#define RWSEM_UNLOCKED_VALUE 0x00000000L
-#define RWSEM_ACTIVE_BIAS 0x00000001L
-#define RWSEM_WAITING_BIAS (-RWSEM_ACTIVE_MASK-1)
-#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
-
-struct rw_semaphore {
- long count;
- spinlock_t wait_lock;
- struct list_head wait_list;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
- struct lockdep_map dep_map;
-#endif
-};
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __RWSEM_DEP_MAP_INIT(lockname) , .dep_map = { .name = #lockname }
-#else
-# define __RWSEM_DEP_MAP_INIT(lockname)
-#endif
-
-#define ...
#ifdef __KERNEL__ is only relevant for exported headers.
For kernel-only headers like this it does not make sense,
but it does no harm.
Sam
--
Well, it was there in the first place, I think we've carried it around
forever, but as you say, it doesn't really harm.
Ben.
--
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David S. Miller <davem@davemloft.net>
I'll move sparc64 over to this once it hits Linus's tree.
--
The implementation looks good for asm-generic, but there is now an
asymmetry between the spinlock and the atomic_long_t based version.
Maybe we can make them both do the same thing, either of
1. create include/linux/rwsem-cmpxchg.h and add an
   #elif defined(CONFIG_RWSEM_GENERIC_ATOMIC) to include/linux/rwsem.h
2. move include/linux/rwsem-spinlock.h to include/asm-generic/ and
   include that from all architectures that want the spinlock based
   version.
Further comments:
* Alpha has an optimization for the uniprocessor case, where the atomic
  instructions get turned into nonatomic additions. The spinlock based
  version uses no locks on UP but disables interrupts for reasons I
  don't understand (nothing running at interrupt time should try to
  access an rwsem). Should the generic version do the same as Alpha?
* Is there any architecture that would still benefit from having a
  separate rwsem implementation? AFAICT all the remaining ones are just
  variations of the same concept of using cmpxchg (or xadd in case of
  x86), which is what atomics typically end up doing anyway.
Arnd
--
I've seen drivers in the past do trylocks at interrupt time ... tho I
It depends how sensitive rwsems are. The "generic" variant based on
atomics and cmpxchg on powerpc is sub-optimal in the sense that it has
stronger memory barriers than would be necessary (atomic_inc_return for
example has both acquire and release). But that vs. one more pile of
inline asm, we decided it wasn't hot enough a spot for us to care back
then.
Cheers,
Ben.
--
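For readers who want a picture of the cmpxchg-based approach being discussed, here is a rough userspace sketch using C11 <stdatomic.h> rather than the kernel's atomic_long_t API. It covers only the trylock fast paths (no wait list, no slow path, no wakeups), all demo_* names are hypothetical, and it is not what the proposed asm-generic header contains. It uses acquire ordering on lock and release on unlock, which is the kind of weaker ordering alluded to above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_UNLOCKED          0L
#define DEMO_ACTIVE_BIAS       1L
#define DEMO_ACTIVE_MASK       0xffffL
#define DEMO_WAITING_BIAS      (-DEMO_ACTIVE_MASK - 1)
#define DEMO_ACTIVE_WRITE_BIAS (DEMO_WAITING_BIAS + DEMO_ACTIVE_BIAS)

struct demo_rwsem {
    atomic_long count;
};

/* Take a read lock only while the count is non-negative
 * (no writer holds the lock and nobody is waiting). */
static bool demo_down_read_trylock(struct demo_rwsem *sem)
{
    long old = atomic_load_explicit(&sem->count, memory_order_relaxed);

    while (old >= 0) {
        /* On failure 'old' is refreshed and the loop re-checks it. */
        if (atomic_compare_exchange_weak_explicit(&sem->count, &old,
                old + DEMO_ACTIVE_BIAS,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

/* Take the write lock only if the semaphore is completely idle. */
static bool demo_down_write_trylock(struct demo_rwsem *sem)
{
    long expected = DEMO_UNLOCKED;

    return atomic_compare_exchange_strong_explicit(&sem->count, &expected,
            DEMO_ACTIVE_WRITE_BIAS,
            memory_order_acquire, memory_order_relaxed);
}

/* Drop a read lock; a real implementation would also wake waiters. */
static void demo_up_read(struct demo_rwsem *sem)
{
    atomic_fetch_sub_explicit(&sem->count, DEMO_ACTIVE_BIAS,
                              memory_order_release);
}
```

A reader blocks a writer's trylock, and once the reader drops the lock the writer's trylock succeeds and in turn blocks new readers.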
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Recently there was a thread where this was declared absolutely illegal.
Maybe it was allowed, or sort-of worked before, and that's why it's
accounted for with IRQ disables in some implementations. I don't know.
--
Ok, I'm happy to say it's a big no-no then.
Arnd, do you want to take over the moving to asm-generic and take care
of the spinlock case as well? I can send Linus the first patch that
changes powerpc to use atomic_long now along with a few other things I
have pending, then you can pick up from there.
Or do you want me to continue pushing my patch as-is and we can look at
cleaning up the spinlock case separately?
Cheers,
Ben.
--
I'm currently doing too many things at once, please push in your
existing patch for now, we can continue from there.
For the asm-generic patch:
Acked-by: Arnd Bergmann <arnd@arndb.de>
--
Actually, it's not that complicated:
1) base and suffixes choose the possible types.
2) order of types is always the same: int -> unsigned -> long ->
   unsigned long -> long long -> unsigned long long
3) we always choose the first type the value would fit into
4) L in suffix == "at least long"
5) LL in suffix == "at least long long"
6) U in suffix == "unsigned"
7) without U in suffix, base 10 == "signed"
That's it. C90 differs from C99 only in one thing - long long (and LL)
isn't there.
The subtle mess Linus has mentioned is C90 gccism: gcc has allowed
unsigned long for decimal constants, as the last resort. I.e. if you
had a plain decimal constant that wouldn't fit into long but would fit
into unsigned long, gcc generated a warning and treated it as unsigned
long. C90 would reject the damn thing. _Bad_ extension, since in C99
the same constant would be a legitimate signed long long.
But yes, "use the suffix when unsure" is a damn good idea, _especially_
since the sizeof(long) actually varies between the targets we care
about.
--
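The seven rules can be checked against a compiler with a short sketch. It assumes C11 (for _Generic), C99 constant rules and a 32-bit 'int'; TYPE_NAME and print_constant_types() are hypothetical names used only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: name of the type the compiler chose. */
#define TYPE_NAME(x) _Generic((x),                \
    int:                "int",                    \
    unsigned int:       "unsigned int",           \
    long:               "long",                   \
    unsigned long:      "unsigned long",          \
    long long:          "long long",              \
    unsigned long long: "unsigned long long")

/* Walk through a few constants, one per rule of interest. */
static void print_constant_types(void)
{
    /* Rule 3: first type the value fits into. */
    printf("2147483647  is %s\n", TYPE_NAME(2147483647));
    /* Rule 7: base 10 without U skips the unsigned types. */
    printf("2147483648  is %s\n", TYPE_NAME(2147483648));
    /* Hex may land on an unsigned type. */
    printf("0x80000000  is %s\n", TYPE_NAME(0x80000000));
    /* Rule 4: L means "at least long". */
    printf("0x80000000L is %s\n", TYPE_NAME(0x80000000L));
    /* Rule 6: U means unsigned. */
    printf("1U          is %s\n", TYPE_NAME(1U));
    /* Rule 5: LL means "at least long long". */
    printf("1LL         is %s\n", TYPE_NAME(1LL));
}
```

With a 64-bit 'long', 2147483648 and 0x80000000L should both come out as 'long'; with a 32-bit 'long' they would be 'long long' and 'unsigned long' respectively, which is exactly why the suffix advice matters across targets.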
