Re: [patch] generic rwsems

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: David Howells
Date: Friday, April 13, 2007 - 6:31 am

Nick Piggin <npiggin@suse.de> wrote:


No, you misunderstand me.  My preferred solution is to leave it up to the arch
and not to make it generic, though I'm not averse to providing some prepackaged
solutions for an arch to pick from if it wants to.


Agreed.  I can see why the spinlock implementation is bad on SMP.  By all means
change those cases, and reduce the spinlock implementation to an interrupt
disablement only version that may only be used on UP only.


It has happened, I believe.  People have tried having >32766 threads on a
32-bit box.  Mad it may be, but...


My original stuff was very easy to inline until Ingo got hold of it.


Look at how the counter works in the spinlock case.  With the addition of an
extra flag in the counter to indicate there's stuff waiting on the queue, you
can manipulate the counter if it appears safe to do so, otherwise you have to
fall back to the slow path and take a spin lock.

Break the counter down like this:

	0x00000000	- not locked; queue empty
	0x40000000	- locked by writer; queue empty
	0xc0000000	- locket by writer; queue occupied
	0x0nnnnnnn	- n readers; queue empty
	0x8nnnnnnn	- n readers; queue occupied

Now here's a rough guide as to how the main operations would work:

 (*) down_read of unlocked

	cmpxchg(0 -> 1) -> 0 [okay - you've got the lock]

 (*) down_read of readlocked.

	cmpxchg(0 -> 1) -> n [failed to get the lock]
	do
	  n = cmpxchg(old_n -> old_n+1)
        until n == old_n

 (*) down_read of writelocked or contented readlocked.

	cmpxchg(0 -> 1) -> 0x80000000|n  [lock contended]
	goto slowpath
	  spinlock
	  try to get lock again
	  if still contended
	    mark counter contended
	    add to queue
            spinunlock
	    sleep
          spinunlock

 (*) down_write of unlocked

	cmpxchg(0 -> 0x40000000) -> 0 [okay - you've got the lock]

 (*) down_write of locked, contended or otherwise

	cmpxchg(0 -> 0x40000000) -> nz [failed]
	goto slowpath
	  spinlock
	  try to get lock again
	  if still unavailable
	    mark counter contended
	    add to queue
            spinunlock
	    sleep
	  else
            spinunlock

 (*) up_read

        x = cmpxchg(1 -> 0)
        if x == 0
	  done
        else
           do
	     x = cmpxchg(old_x -> old_x-1)
	   until x == old_x
           if old_x == 0x80000000
	     wake_up_writer

 (*) up_write

        x = cmpxchg(0x80000000 -> 0)
        if x == 0
	  done
        else
	  wake_up_readers

You can actually do better with LL/SC here because for the initial attempt with
CMPXCHG in each case you have to guess what the numbers will be.  Furthermore,
you might actually be able to do an "atomic increment or set contention flag"
operation.

Note that the contention flag may only be set or cleared in the slow path
whilst you are holding the spinlock.

Doing down_read and up_read with CMPXCHG is a pain.  XADD or LL/SC would be
better, and LOCK INC/ADD/DEC/SUB won't work.  You can't use XADD in down_*() as
you may not change the bottom part of the counter if you're going to end up
queuing.

Actually, looking at it, it might be better to have the counter start off at
0x80000000 for "unlocked, no contention" and clear the no-contention flag when
you queue something.  That way you can check for the counter becoming 0 in the
up_*() functions as a trigger to go and invoke the slowpath.  Then you could
use LOCK DEC/SUB on i386 rather than XADD as you only need to check the Z flag.

Note there is a slight window whereby a reader can jump a writer that's
transiting between the fastpath part of down_write() and the slowpath part if
there are outstanding readers on the rwsem but nothing yet queued.

OTOH, there's a similar window in the current XADD-based rwsems as the spinlock
doesn't implement FIFO semantics, so the first to modify the count and fail to
the slowpath may not be the first to get themselves on the queue.


That's usually a function of the CPU designer.

David
-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[patch] generic rwsems, Nick Piggin, (Fri Apr 13, 3:04 am)
fastcalls, was Re: [patch] generic rwsems, Christoph Hellwig, (Fri Apr 13, 3:19 am)
Re: [patch] generic rwsems, Nick Piggin, (Fri Apr 13, 3:25 am)
Re: [patch] generic rwsems, Andi Kleen, (Fri Apr 13, 3:53 am)
Re: fastcalls, was Re: [patch] generic rwsems, Nick Piggin, (Fri Apr 13, 3:59 am)
Re: [patch] generic rwsems, Nick Piggin, (Fri Apr 13, 4:20 am)
Re: [patch] generic rwsems , David Howells, (Fri Apr 13, 4:44 am)
Re: [patch] generic rwsems , David Howells, (Fri Apr 13, 5:09 am)
Re: [patch] generic rwsems, Christoph Hellwig, (Fri Apr 13, 5:13 am)
Re: [patch] generic rwsems, Nick Piggin, (Fri Apr 13, 5:43 am)
Re: [patch] generic rwsems, William Lee Irwin III, (Fri Apr 13, 6:22 am)
Re: [patch] generic rwsems , David Howells, (Fri Apr 13, 6:31 am)
Re: [patch] generic rwsems, Nick Piggin, (Fri Apr 13, 7:03 am)
Re: [patch] generic rwsems, Eric Dumazet, (Fri Apr 13, 7:29 am)
Re: [patch] generic rwsems , David Howells, (Fri Apr 13, 7:49 am)