[ANNOUNCE] mdb-2.6.27-rc2-ia32-08-07-08.patch

Previous thread: SKB BUG: Invalid truesize (412) len=28, ipsec related ? by Lazy on Thursday, August 7, 2008 - 7:48 am. (2 messages)

Next thread: 2.6.27-rc0 Power Bugs with HP/Compaq Laptops by Matt Parnell on Thursday, August 7, 2008 - 7:57 am. (5 messages)
From: jmerkey
Date: Thursday, August 7, 2008 - 7:29 am

The mdb-rc2 patch was posted this morning with the changes for a modular
kernel debugger using kprobes.

ftp://ftp.wolfmountaingroup.org/pub/mdb/mdb-2.6.27-rc2-ia32-08-07-08.patch

Jeffrey Vernon Merkey


--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 8:49 am

Is this something along the lines of a counting semaphore?  As far as I
understand its accessor functions, an rlock
  - can be taken by one CPU multiple times,
  - will block the other CPUs as long as the first CPU hasn't unlocked
    the rlock as many times as it locked it.

The accessors rspin_lock() and rspin_try_lock() peek into spinlock_t and
may therefore not be fully portable.  Also, they and rspin_unlock()

In general, a lot can happen between the access to
rlock->lock.raw_lock.slock and the access to rlock->processor.

Even rlock->count++ in itself can go wrong.


Are these datasets shared between CPUs and do you access them without
lock protection?  (It looks like that from a quick glance.)  If yes,
instead of the volatile qualifier, can't you use memory barriers in
order to define the access semantics more clearly (and more effective)
than volatile can?

If there are interdependencies in these datasets, with which mechanisms

Not necessary, because static (and extern) variables are already
implicitly initialized to zero.
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 8:59 am

Correction:  They _are_ not portable.  Look at the first hit in
http://lxr.linux.no/linux+v2.6.26/+code=raw_spinlock_t .
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: jmerkey
Date: Thursday, August 7, 2008 - 8:56 am

OK.  One more to fix.  I have to be able to take nested page faults inside
the debugger in the event I am debugging a page fault handler or paging
event and someone's working set routine crashes.

rspin locks are for these types of cases -- so if I fault on the same
processor I took the lock on it just bumps a counter -- yes, it is atomic
and SMP safe to do it this way.  You are correct that its non-portable
but the file this is in is named mdb-IA32 and is not meant as portable to
anything other than intel anyway.

As for coding style and misuse of a raw lock structure, I agree with you
completely, and I will look for a better way to do this without exposing
this structure.



--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 9:46 am

Only if all contexts which take rlocks are not preemptible.  Which I 
don't know whether they are; I'm just a driver guy.  You use 
spin_lock_irqsave() rather than plain spin_lock() though, which 
indicates that you want to be able to take the locks from preemptible 
contexts too.  In that case, your accessors are subtly buggy.
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: jmerkey
Date: Thursday, August 7, 2008 - 9:33 am

check mdb-main.c -- I disable preemption before rspin_lock is attempted. 
Since the only processor which sets the proc number does do inside the
spin lock, and the other processors only read it, unless memory is
corrupted or the machine is severely broken, its SMP safe to this.

I use the debug_lock rspin_lock to prevent the other processors from
entering the debugger command console when one of them has the console...

Jeff

--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 10:19 am

Then it is recommendable that you document the call context requirements 
at the functions.  And you can and IMO should drop the _irq_save and 
_irq_restore from the spinlock accessors in the rlock accessors.  And 
drop the volatile qualifier of the rlock accessor argument while you are 
at it.

I see that you are calling save_flags/ restore_flags in 
mdb-main.c::mdb().  These are marked as deprecated.  Would 
local_irq_save/ local_irq_restore be correct at these places?
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: Stefan Richter
Date: Thursday, August 7, 2008 - 10:33 am

An alternative would of course be to establish the non-preemptible 
context from inside the lock accessors, either redundantly to what the 
call site is doing, or instead of doing it at the call site if that can 
be arranged.
-- 
Stefan Richter
-=====-==--- =--- --===
http://arcgraph.de/sr/
--

From: jmerkey
Date: Thursday, August 7, 2008 - 10:52 am

It needs to get turned off right away, well before we get to the locks.

Keff


--

From: jmerkey
Date: Thursday, August 7, 2008 - 10:52 am

You have sharp eyes.  Yes, you are correct.  added to the list of
corrections.



--

From: jmerkey
Date: Thursday, August 7, 2008 - 9:04 am

Another M$ legacy relict. On Microsoft C compilers (older ones) failure to
initialize global data produces undefined results since they do not always
zero out memory and global data -- they still do not in all cases.

I will remove this a well.  Still more checkpatch.pl stuff for me to do.  
am down to about 3000+ noisy messages now ...




--

From: Jeremy Fitzhardinge
Date: Friday, August 8, 2008 - 10:07 pm

Ticket locks will almost always have a non-zero slock.  It doesn't 
indicate anything about the locked/unlocked state.  But this looks like 
it's effectively doing a trylock:

	if (!spin_trylock(rlock) && rlock->processor == proc) {
		rlock->count++;
		...
	} else {
		rlock->processor = proc;
		...
	}


    J


--

From: Stefan Richter
Date: Saturday, August 9, 2008 - 1:04 am

Right.  This implemention also looks free of race conditions, provided that

   - rspin_lock, rspin_try_lock, and rspin_unlock are only called in
     contexts with disabled preemption and disabled local interrupts,

   - rspin_unlock() rewrites rlock->processor to "no CPU" before
     it drops the lock.  (The implementation in
     mdb-2.6.27-rc2-ia32-08-07-08.patch does so.)

BTW, the rspin_try_lock() in that patch wrong:  It always returns 0 
instead of having three branches of execution which return 0/1/-1.
-- 
Stefan Richter
-=====-==--- =--- -=--=
http://arcgraph.de/sr/
--

From: jmerkey
Date: Saturday, August 9, 2008 - 7:44 am

.... On linux it does -- on another OS it does something quite different. 
I changed that case last night when I added spin)is_locked calls.   Is
there
to do a "debugger bust spinlocks" if the system ever hangs in the
debugger.  probably should code it different.




--

Previous thread: SKB BUG: Invalid truesize (412) len=28, ipsec related ? by Lazy on Thursday, August 7, 2008 - 7:48 am. (2 messages)

Next thread: 2.6.27-rc0 Power Bugs with HP/Compaq Laptops by Matt Parnell on Thursday, August 7, 2008 - 7:57 am. (5 messages)