On Fri, 19 Oct 2007, Nick Piggin wrote:Its expensive on IA64. Is it any less expensive on x86? ia64-lock-bitops.patch defines: static __inline__ void clear_bit_unlock (int nr, volatile void *addr) { __u32 mask, old, new; volatile __u32 *m; CMPXCHG_BUGCHECK_DECL m = (volatile __u32 *) addr + (nr >> 5); mask = ~(1 << (nr & 31)); do { CMPXCHG_BUGCHECK(m); old = *m; new = old & mask; } while (cmpxchg_rel(m, old, new) != old); } /** * __clear_bit_unlock - Non-atomically clear a bit with release * * This is like clear_bit_unlock, but the implementation may use a non-atomic * store (this one uses an atomic, however). */ #define __clear_bit_unlock clear_bit_unlock A non atomic store is a misaligned store on IA64. That is not relevant here. The data is properly aligned. I guess it was intended to refer to the cmpxchg. How about this patch? [Works fine on IA64 simulator...] IA64: Slim down __clear_bit_unlock __clear_bit_unlock does not need to perform atomic operations on the variable. Avoid a cmpxchg and simply do a store with release semantics. Add a barrier to be safe that the compiler does not do funky things. Signed-off-by: Christoph Lameter <clameter@sgi.com> --- include/asm-ia64/bitops.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) Index: linux-2.6.23-mm1/include/asm-ia64/bitops.h =================================================================== --- linux-2.6.23-mm1.orig/include/asm-ia64/bitops.h 2007-10-18 19:37:22.000000000 -0700 +++ linux-2.6.23-mm1/include/asm-ia64/bitops.h 2007-10-18 19:50:22.000000000 -0700 @@ -124,10 +124,21 @@ clear_bit_unlock (int nr, volatile void /** * __clear_bit_unlock - Non-atomically clear a bit with release * - * This is like clear_bit_unlock, but the implementation may use a non-atomic - * store (this one uses an atomic, however). + * This is like clear_bit_unlock, but the implementation uses a store + * with release semantics. See also __raw_spin_unlock(). */ -#define __clear_bit_unlock clear_bit_unlock +static __inline__ void +__clear_bit_unlock (int nr, volatile void *addr) +{ + __u32 mask, new; + volatile __u32 *m; + + m = (volatile __u32 *) addr + (nr >> 5); + mask = ~(1 << (nr & 31)); + new = *m & mask; + barrier(); + asm volatile ("st4.rel.nta [%0] = %1\n\t" :: "r"(m), "r"(new)); +} /** * __clear_bit - Clears a bit in memory (non-atomic version) -
| debian developer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| H. Peter Anvin | Re: [PATCH] x86: Construct 32 bit boot time page tables in native format. |
| Christoph Lameter | Re: [RFC 00/15] x86_64: Optimize percpu accesses |
git: | |
| Christoph Hellwig | Re: [PATCH 06/32] IGET: Mark iget() and read_inode() as being obsolete [try #2] |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| David Miller | [GIT]: Networking |
