Re: [RFC] mmiotrace full patch, preview 1

Previous thread: none

Next thread: Re: IDE cdrom problem with PLEXTOR DVDR PX-608AL by Borislav Petkov on Sunday, February 24, 2008 - 10:34 am. (10 messages)
From: Pekka Paalanen
Date: Sunday, February 24, 2008 - 10:03 am

Hi all,

I have finally got kmmio.c into shape so that it is built in when mmiotrace
is built. I'd like to hear your comments on kmmio.c.

kmmio.c handles the list of mmio probes with callbacks, list of traced
pages, and attaching into the page fault handler and die notifier. It
arms, traps and disarms the given pages, this is the core of mmiotrace.

mmio-mod.c is a user interface, hooking into ioremap functions and
registering the mmio probes. It also decodes the required information
from trapped mmio accesses via the pre and post callbacks in each probe.
Currently, hooking into ioremap functions works by redefining the symbols
of the target (binary) kernel module, so that it calls the traced
versions of the functions.

The most notable changes done since the last discussion are:
- kmmio.c is a built-in, not part of the module
- direct call from fault.c to kmmio.c, removing all dynamic hooks
- prepare for unregistering probes at any time
- make kmmio re-initializable and accessible to more than one user
- rewrite kmmio locking to remove all spinlocks from page fault path

Can I abuse call_rcu() like I do in kmmio.c:unregister_kmmio_probe()
or is there a better way?

The function called via call_rcu() itself calls call_rcu() again,
will this work or break? There I need a second grace period for RCU
after the first grace period for page faults.

Mmiotrace itself (mmio-mod.c) is still a module, I am going to attack
that next. At some point I will start looking into how to make mmiotrace
a tracer component of ftrace (thanks for the hint, Ingo). Ftrace should
make the user space part of mmiotracing as simple as
'cat /debug/trace/mmio > dump.txt'.

Ftrace lives in sched-devel
http://people.redhat.com/mingo/sched-devel.git/README
and I have nothing for that yet.

This patch is against torvalds/linux-2.6.git master.

Home page of the out-of-tree version:
http://nouveau.freedesktop.org/wiki/MmioTrace


Thanks, Pekka.

 arch/x86/Kconfig.debug      |   33 +++
 ...
From: Sam Ravnborg
Date: Sunday, February 24, 2008 - 10:59 am

Hi Pekka.

n is default so no need to be explicit.
If you prefer being explcit then use:




	Sam
--

From: Andrew Morton
Date: Monday, February 25, 2008 - 3:49 pm

Please feed the diff through scritps/checkpatch.pl and consider addressing


It's a semaphore.  Please do convert it to a mutex.

Andy, I'd say that addition of new semaphores is worth a warning - they're
rarely legitimate.

--

From: Christoph Hellwig
Date: Monday, February 25, 2008 - 4:34 pm

I'm not sure that any semaphore should be a warning, but the initializer
for semaphore used as binary mutex (DECLARE_MUTEX and init_MUTEX) are
worth it.

--

From: Pavel Roskin
Date: Monday, February 25, 2008 - 7:42 pm

It looks like a mutex, it acts like a mutex, but it isn't a mutex,  
it's a trap for the unwary.  Weird.  I was annoyed by it before; now I  
see a fellow developer actually getting into that trap.

I'd say, rename DECLARE_MUTEX to DECLARE_SEMAPHORE and let external  
code be fixed one way or another (i.e. stick with the "mutex" name or  
stick with the semaphore functionality if it's really needed).

-- 
Regards,
Pavel Roskin
--

From: Andy Whitcroft
Date: Tuesday, February 26, 2008 - 1:57 am

I like the fact that in evey architecture its defined as:

	#define DECLARE_MUTEX(name) __DECLARE_SEMAPHORE_GENERIC(name,1)

-apw
--

From: Christoph Hellwig
Date: Tuesday, February 26, 2008 - 10:10 am

It's a semaphore used as mutex.  Until we got struct mutex this was
perfectly fine and now we're phasing it out and will hopefully get rid
of it soon.  It just takes some time to convert all users.

--

From: Andy Whitcroft
Date: Tuesday, February 26, 2008 - 3:21 am

Ok, so that would be the following, work for everyone?

WARNING: mutexes are preferred for single holder semaphores
#1: FILE: Z95.c:1:
+       DECLARE_MUTEX(&foo);

WARNING: mutexes are preferred for single holder semaphores
#3: FILE: Z95.c:3:
+       init_MUTEX(&foo);

-apw
--

From: Ingo Molnar
Date: Tuesday, February 26, 2008 - 3:49 am

yeah.

  Acked-by: Ingo Molnar <mingo@elte.hu>

also i guess init_MUTEX_LOCKED() should emit a "this should be a 
completion" warning.

i guess non-DEFINE_SPINLOCK old-style spinlock definition:

  spinlock_t lock = SPIN_LOCK_UNLOCKED;

should emit a 'use DEFINE_SPINLOCK' warning as well?

	Ingo
--

From: Andy Whitcroft
Date: Tuesday, February 26, 2008 - 8:20 am

Thats easy enough.  Though your tone here implies its less definatly
wrong than the other use forms.  Do we want gentle language here?


Those (SPIN_LOCK_UNLOCKED & RW_LOCK_UNLOCKED) we already pick up and
indicate are deprecated.

-apw
--

From: Pekka Paalanen
Date: Tuesday, February 26, 2008 - 1:02 pm

On Mon, 25 Feb 2008 14:49:22 -0800

I checked that, but I didn't think any of them were worth fixing. And
since this is a work in progress and a in a state which I do not want

I believe the 'volatile' belongs here, it is also in the declaration of

These are all in testmmiotrace.c which calls the traced ioremap
functions directly. These direct calls will go away and it will

Since testmmiotrace does not use v for anything other than return value
of ioread8/16/32(), I wanted to prevent the compiler from optimizing
it away. Now thinking it again, ioread*() must guarantee that the read
is always executed. I'll remove v altogether.

Patch for the other issues you mentioned is brewing.


Thanks.

-- 
Pekka Paalanen
http://www.iki.fi/pq/
--

From: Jonathan Corbet
Date: Tuesday, February 26, 2008 - 10:20 am

Hey, Pekka,


Should that text read something like:

	if (condition != DIE_TRAP || !ctx->active)

Presumably you won't be active if something else is going wrong, but one

That only checks the first page; if the probed region partially overlaps
another one found later in memory, the registration will succeed.

Maybe you want to decrement kmmio_count if you decide to return -EEXIST
(or hold off on the increment until after the test)?

In general, I worry about what happens if an interrupt handler generates
traced MMIO traffic while a fault handler is active.  It looks a lot
like the "all hell breaks loose" scenario mentioned in the comments.  Is
there some way of preventing that which I missed?  Otherwise, maybe,
should the ioremap() wrappers take an additional argument, being an IRQ
to disable while trace handlers are active?

jon
--

From: Pekka Paalanen
Date: Wednesday, February 27, 2008 - 1:28 pm

On Tue, 26 Feb 2008 10:20:08 -0700

The last thing in kmmio.c is:

static int kmmio_die_notifier(struct notifier_block *nb, unsigned long val,
								void *args)
{
	struct die_args *arg = args;

	if (val == DIE_DEBUG)
		if (post_kmmio_handler(arg->err, arg->regs) == 1)
			return NOTIFY_STOP;

	return NOTIFY_DONE;
}

so post_kmmio_handler() is not the die_notifier callback, and the handler
is limited to debug calls, which I assume can only come from
single-stepping. The variable 'condition' is something that on x86_64 comes
from get_debugreg(condition, 6); in arch/x86/kernel/traps_64.c do_debug().

True. I am assuming ioremap*() cannot return regions that overlap in
their kernel virtual addresses. Clearly this does not hold for the ISA
region, but tracing the ISA region has triggered hard freezes, so we
specifically exclude that.

Even if ioremap*() would return overlapping areas, I don't think it would
break things. It just means there are two probes on the same area. If an
MMIO access hits that area, the "first" probe gets selected and executed.
The only thing lost is which probe was actually hit, but then there's no
way to know that, anyway.

kmmio_fault_page objects are shared and reference-counted, so they should

Yes, good catch. Then again, maybe I would not even need the failure case
at all. I think I'll keep it there, though, maybe it catches something

I have thought of something like this, how could it happen. My current
understanding is the following.

kmmio_handler(), where-ever it was triggered, executes with interrupts
disabled. It modifies the saved CPU state: disable interrupts and
enable single-stepping (what's the correct term?). After disarming
the page, it returns. The faulted instruction is single-stepped,
all the time the interrupts are disabled. It hits the debug trap and
we end up in post_kmmio_handler(), still interrupts disabled. The
post handler does its thing, restores the CPU state changed in
kmmio_handler(), and ...
Previous thread: none

Next thread: Re: IDE cdrom problem with PLEXTOR DVDR PX-608AL by Borislav Petkov on Sunday, February 24, 2008 - 10:34 am. (10 messages)