>> It's simple to reproduce. Take away the volatile declaration for the
>> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>> references and watch the thing lock up in SMP with multiple processors
>> in
>> the debugger each stuck with their own local copy of debug_lock.
>
> I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
> moment. Speaking of debug_lock()...:
>
> You use spin_trylock_irqsave((spinlock_t *)&rlock->lock, rlock->flags)
> in there. Minor nit: The pointer type cast is unnecessary.
>
> Major problem: rlock->flags is wrong in this call. Use an on-stack
> flags variable for the initial spin_trylock_irqsave. Ditto in the
> following call of spin_trylock_irqsave.
>
> Next major problem with debug_lock() and debug_unlock(): The reference
> counting doesn't work. You need an atomic_t counter. Have a look at
> the struct kref accessors for example, or even make use of the kref API.
> Or if it isn't feasible to fix with atomic_t, add a second spinlock to
> rlock_t to ensure integrity of .count (and of the .processor if
> necessary).
>
> Furthermore, I have doubts about the loop which is entered by CPU B
> while CPU A holds the rlock. You are fully aware that atomic_read(a) &&
> !atomic_read(b) in its entirety is not atomic, I hope.
>
> All this aside, I don't see *anything* in debug_lock and _unlock which
> would necessitate volatile. Well, volatile might have papered over some
> of these bugs.
>
> PS:
> Try to cut down on #if/#endif clutter. It should be possible to reduce
> them at least in .c files; .h are a different matter. For example,
> #if MDB_DEBUG_DEBUGGER
> DBGPrint("something");
> #endif
> can be trivially reduced to
> dbg_mdb_printk("something");
> where dbg_mdb_printk() is defined as an inline function which does
> nothing when MDB_DEBUG_DEBUGGER is false.
>
> PS2:
> Why are there this many debug printks anyway?
> --
> Stefan Richter
> -=====-==--- =--- =-=-=
>
http://arcgraph.de/sr/
>